Search Engine Guide

Friday, May 18, 2007 Posted by Aman Jain

Search Engines - what are they, and how do they work? Simply put, a search engine is a tool used to help find information on the Internet. No one search engine has examined or indexed the entire World Wide Web. Each contains only a subset of what is available and has its own way of gathering, classifying, and displaying this information to the user. Here are a few examples:


Indexing engines

Many of the most popular and comprehensive search engines on the Web are indexing engines - also known as crawlers or spiders. They get these names from their particular way of finding information on the Internet - they have a program (often referred to as a "bot") that scans or "crawls" through Web pages, classifying and indexing them based on a set of pre-determined criteria. The weight given to these criteria, which may include links to your page from other sites, keywords, their positioning on a page and meta-tags, depends upon the individual indexing engine, and makes up its ranking algorithm. The information gathered during the crawling process is placed into a database, called an "index", which is then searched every time you enter a keyword query at the engine's site. When you perform a search at an indexing engine, then, you're not actually querying the entire Web, but only the portion that the engine has examined and included in its database.
Indexing search engines are best for hard-to-find information or very specific data, as they search through a wide and varied database of sites, returning many results. If your query is too broad, however, you risk getting an overwhelming number of results (numbering in the hundreds of thousands or more!).
Examples of indexing search engines are: Google, AltaVista, and Gigablast.
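
To make the idea of an "index" more concrete, here is a minimal sketch in Python. It is not how any particular engine works: the page URLs and text are made up, the "crawl" is just a dictionary of pages already fetched, and ranking is left out entirely. It only shows that a query is answered from the stored index rather than from the live Web.

```python
from collections import defaultdict

# Hypothetical pages a crawler might have fetched: URL -> page text.
PAGES = {
    "http://example.com/sky": "why the sky is blue",
    "http://example.com/sun": "temperature of the sun and the sky",
    "http://example.com/cats": "pictures of cats",
}


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query, sorted by URL."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


index = build_index(PAGES)
print(search(index, "sky"))      # both pages that mention "sky"
print(search(index, "sky sun"))  # only the page that mentions both words
```

A page that was never crawled can never appear in the results, however relevant it is, which is why different engines return different results for the same query.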


Directories

Directories are categorized groupings of sites, most often compiled and organized by human editors. They're organized into a series of categories and sub-categories, moving from the general to the specific. Each sub-category brings you to a list of additional sub-categories, until finally you reach a list of sites. While the number of results is usually much smaller than that returned by an indexing engine, their relevancy and quality are usually much higher.
For ease of use, most directories also have a search feature, which enables you to search through their listings - a word of caution, however: these search functions only search through the directories' categories and listings (i.e. titles, descriptions and URLs as they appear in their database) and not the sites themselves.
Directories are great to use when you don't know a lot about a subject, need help narrowing down a topic, or when you're looking for general information.
Examples of directories are: Yahoo, The Open Directory, and LookSmart.
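
The caution about directory search can be illustrated with a small Python sketch. The category names, listings and URLs below are invented for the example, and a real directory stores far more than this. The point is only that the search looks at category names, titles, descriptions and URLs, never at the text of the listed pages.

```python
# Hypothetical directory: nested categories, with site listings at the leaves.
DIRECTORY = {
    "Science": {
        "Astronomy": [
            {"title": "Sun Facts", "description": "Data about the sun",
             "url": "http://example.com/sun"},
        ],
        "Weather": [
            {"title": "Sky Watch", "description": "Daily forecasts",
             "url": "http://example.com/sky"},
        ],
    },
}


def iter_listings(node, path=()):
    """Yield (category_path, listing) pairs by walking the category tree."""
    if isinstance(node, list):
        for listing in node:
            yield path, listing
    else:
        for name, child in node.items():
            yield from iter_listings(child, path + (name,))


def search_directory(directory, keyword):
    """Match the keyword against category names and listing fields only."""
    keyword = keyword.lower()
    hits = []
    for path, listing in iter_listings(directory):
        fields = list(path) + [listing["title"], listing["description"], listing["url"]]
        if any(keyword in field.lower() for field in fields):
            hits.append((" > ".join(path), listing["url"]))
    return hits


print(search_directory(DIRECTORY, "astronomy"))
print(search_directory(DIRECTORY, "photosphere"))  # empty: page text is never searched
```

Even if the "Sun Facts" page itself discusses the photosphere at length, the second query finds nothing, because that word does not appear anywhere in the directory's own records.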

Natural language

If you're a beginner to the Internet, or prefer to "ask" your questions (for example, "Why is the sky blue?" "What is the temperature of the sun?" etc.), rather than trying to formulate a keyword query, then a Natural-Language search engine is the way to go. These allow queries to be submitted in the form of a question, and then help you to narrow down your search by clarifying what it is you're looking for. Sometimes, they'll even provide the answer to your question directly on the search results page!
Examples of natural-language search engines are: Subjex and AnswerBus.

"Pay" engines

With the increasing popularity of search engine advertising, paid inclusion and pay-for-placement services abound, and are offered by most major search engines. In a nutshell, these programs require payment in order to have your site listed with them.

Paid inclusion

Paid inclusion services require a fee in order to list a site in their database. It can take the form of a yearly fee for a directory listing, or can be a cost-per-click listing in an index - where the site owner pays every time someone clicks on their link. It could also be a combination of a flat fee and a cost-per-click payment (just to make things confusing!).
The most important thing to know about paid inclusion, however, is that placement or ranking within the search engines' results set is not guaranteed - i.e. a site may be included, but it will not receive preferential treatment. Some search engines that have paid inclusion programs still offer a free (slower) submission process, though these are sometimes reserved for non-commercial sites.
Examples of search engines with paid inclusion programs are: Yahoo and Entireweb.

Pay-for-placement

Pay-for-placement (or pay-per-click, cost-per-click) programs usually take the form of an auction-style environment in which site owners try to outbid each other to get their sites listed higher up in the results. Payment is in the form of a CPC (Cost-Per-Click) whereby the site owner pays a certain amount every time someone clicks on their link.
As pay-for-placement programs are more like advertising than search results, pay-for-placement engines no longer try to attract users to their own sites, but rather distribute their paid results to other search engines, to be displayed as "Sponsored Listings" above or alongside regular results.
Pay-for-placement results are best for when you're searching for something to purchase. The vast majority of listings are for retail sites or online services that are willing to pay for potential customers.
Examples of pay-for-placement programs include: Overture, Google Adwords, and Mamma Classifieds.
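
As a rough illustration of the auction idea, here is a short Python sketch. It takes the simplest reading of the description above: listings are ordered purely by bid, and each click costs the owner the amount bid. The sites and amounts are made up, and the actual programs named here may rank and charge differently.

```python
# Hypothetical bids: (site URL, cost-per-click bid in cents).
BIDS = [
    ("http://shop-a.example", 25),
    ("http://shop-b.example", 40),
    ("http://shop-c.example", 10),
]


def rank_by_bid(bids):
    """Order listings so the highest cost-per-click bid appears first."""
    return sorted(bids, key=lambda bid: bid[1], reverse=True)


def total_charge_cents(cpc_cents, clicks):
    """Amount owed for a listing: the per-click price times the clicks received."""
    return cpc_cents * clicks


ranked = rank_by_bid(BIDS)
print([url for url, _ in ranked])   # shop-b first, then shop-a, then shop-c
print(total_charge_cents(40, 120))  # 4800 cents, i.e. $48.00 for 120 clicks
```

Amounts are kept in whole cents so that the arithmetic stays exact.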

Metasearch engines

Every time you type in a query at a metasearch engine, it searches a series of other search sites at the same time, compiles their results, and displays them either grouped by the search engine they came from or integrated in a uniform manner, with duplicates eliminated and the results re-sorted according to relevance. It's like using multiple search engines, all at the same time.
By using a metasearch engine, you get a snapshot of the top results from a variety of search engines (including a variety of types of search engines), providing you with a good idea of what kind of information is available.
Metasearch engines are tolerant of imprecise search terms or inexact use of operators, and tend to return fewer results, but with a greater degree of relevance. They're best to use when you've got a general search, and don't know where to start - by providing you with results from a series of sites, they help you to determine where to continue focusing your efforts (if this proves necessary). They also allow you to compare what kinds of results are available on different engine types (indexes, directories, pay-for-placement, etc.), or to verify that you haven't missed a great resource provided by a site other than your favorite search engine (acting as a backup). Overall, they're a great way to save time.
Examples of metasearch engines are: Mamma, Copernic and Dogpile.
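
The "integrate, eliminate duplicates, and re-sort" step can be sketched in a few lines of Python. The engine names and result lists are invented, and the scoring rule (each engine contributes 1/rank points to a URL) is just one simple assumption for the example, not the method used by any of the metasearch engines listed above.

```python
from collections import defaultdict

# Hypothetical ranked results from three engines (best result first).
ENGINE_RESULTS = {
    "engine_a": ["http://example.com/1", "http://example.com/2", "http://example.com/3"],
    "engine_b": ["http://example.com/2", "http://example.com/4"],
    "engine_c": ["http://example.com/2", "http://example.com/1"],
}


def merge_results(engine_results):
    """Combine ranked lists into one list with duplicates removed.

    Each URL earns 1/rank points from every engine that returned it, so
    URLs that several engines rank highly rise to the top.
    """
    scores = defaultdict(float)
    for urls in engine_results.values():
        for rank, url in enumerate(urls, start=1):
            scores[url] += 1.0 / rank
    # Sort by descending score; break ties alphabetically for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))


for url in merge_results(ENGINE_RESULTS):
    print(url)
```

Here the page returned by all three engines comes first, and each URL appears only once no matter how many engines returned it.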
Additional note on metasearch sites: Because metasearch engines do not have their own database of sites, but rather pull their results from multiple outside databases, they cannot accept URL submissions. At least one metasearch site, however, has created two programs to overcome this problem, which most metasearch sites face.

As the Web continues to grow, search engines are realizing that they cannot index or categorize the entire Internet. They have also realized that search is a business, and that in order to remain in existence, search engines need to be profitable. As a result, search engines are forming an increasing number of partnerships with one another.
Some examples:

At present, MSN does not have its own search engine (though it is building one). MSN search results are currently a mix of Yahoo's indexing engine results, and Overture's paid listings.
Lycos results are provided by LookSmart, Yahoo's Inktomi, and the Open Directory, and they display Google's pay-for-placement Adwords.

AOL search results are powered entirely by Google (an indexing engine), and include listings from Google's pay-for-placement program, Google Adwords.

These results, coming from a different source than the one you are actively searching, are sometimes differentiated from each other - but sometimes they are not. It is important to always pay attention to these details and to know where your results are coming from. For a chart of some of the major search engine relationships, please visit