What is a search engine? How does it work? What does it do?
These are just a few of the many questions that people ask me when
I teach HTML Programming. I start off by telling them that a search
engine is a piece of software that has been designed to help find
information that is stored on a personal computer system, a
corporate intranet, or a public network such as the World Wide
Web.
Search engines are essentially massive full-text indexes of web
pages. The quality of the indexes, and how the engines use the
information they contain, is what makes -- or breaks -- the quality
of search results. We're all familiar with back-of-the-book
indexes. They're simply alphabetized lists of the important words
in the book, and the pages on which they appear. Search engine
indexes are similar, but vastly more complex than back-of-the-book
indexes. Although most of us will never want to become experts on
web indexing, knowing even a little bit about how they're built and
used can vastly improve your searching skills.
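To make the back-of-the-book comparison concrete, here is a minimal
sketch in Python of a word-to-pages index of the kind described above
(often called an inverted index). The function name build_index and
the sample pages are invented for illustration; a real search engine
index is far more elaborate than this.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each word to the sorted list of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        # Lowercase the text and split it into simple alphanumeric words.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page_id)
    # Alphabetize the words, just like a back-of-the-book index.
    return {word: sorted(ids) for word, ids in sorted(index.items())}


# Hypothetical sample pages, keyed by made-up page ids.
pages = {
    "page1": "Search engines index web pages",
    "page2": "A book index lists important words",
    "page3": "Web directories are edited by people",
}

index = build_index(pages)
print(index["index"])  # ['page1', 'page2']
print(index["web"])    # ['page1', 'page3']
```

Looking up a word is then a single dictionary access, which is why an
index answers a query much faster than rereading every page.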
Okay, so now that we have the technical definition of what a search
engine is, let's look at what it does. A search engine allows the
user to enter keywords or exact phrases describing particular
criteria, and then it compiles a list of references that meet
those conditions.
Sometimes the results can be overwhelming, and you may have
a million or more pages to peruse, so you need to be specific
about what you are looking for, but not so specific that
the search returns a 'no items found' response.
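The trade-off between broad and overly specific queries can be
sketched in a few lines of Python. This is only an illustration under
a simplifying assumption: the hypothetical search function below
requires every keyword to appear on a page, whereas real search
engines rank results and may relax the match. The sample pages are
made up.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(pages, query):
    """Return the ids of pages that contain every keyword in the query."""
    keywords = set(tokenize(query))
    if not keywords:
        return []
    return sorted(
        page_id
        for page_id, text in pages.items()
        if keywords <= set(tokenize(text))  # all keywords must be present
    )


# Hypothetical sample pages, keyed by made-up page ids.
pages = {
    "page1": "Cheap flights to Paris in spring",
    "page2": "Paris hotels and cheap restaurants",
    "page3": "Spring gardening tips",
}

print(search(pages, "paris"))                        # ['page1', 'page2']
print(search(pages, "cheap paris flights"))          # ['page1']
print(search(pages, "cheap paris flights december")) # []
```

Each added keyword narrows the list, and one keyword too many leaves
nothing to show, which is the 'no items found' case.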
Search engines find their information using regularly updated
indexes that allow them to retrieve information quickly and
efficiently. Generally, when people speak of a search engine, they
are referring to a Web search engine, which searches for publicly
available information on the Web.
There are other types of search engines, though: enterprise
search engines, which perform much the same function as Internet
search engines but are targeted to the needs of a particular
group of people rather than the broad public; personal search
engines, which search your personal computer; mobile search
engines; and blog search engines such as Technorati, which mines
blogs and news sources.

In general, web sites that claim to be search engines are often
actually 'front ends' to search engines. HotBot.com, for example, is
a front end to search engines owned by other companies. Other search
engines mine data that is available in large databases, such as
libraries, newsgroups, or open directories like DMOZ.org. The user
most likely will not notice much difference in the graphical user
interface (GUI), as the differences are in the programming code
behind the scenes.
Search engines function algorithmically. According to The American
Heritage® Dictionary of the English Language, Fourth Edition, an
algorithm is "a step-by-step problem-solving procedure ... for
solving a problem in a finite number of steps." Web directories, by
contrast, are maintained by human editors.