Faith and Search Engine Statistics

WILKINS@hws.edu
Wed, 13 May 1998 22:02:25 -0500 (EST)


I was doing some research today with back copies of the Wall Street Journal
when I came across the following article [only the citation and abstract are
being included for the sake of brevity]:

Wall Street Journal
April 3, 1998, Friday
SECTION: Section B; Page 1, Column 3
HEADLINE: WEB'S VASTNESS FOILS EVEN BEST SEARCH ENGINES
BYLINE: Thomas E. Weber
ABSTRACT:
New study by the journal Science concludes that even the most
thorough search engines manage to find only about a third of
the pages on the World Wide Web; other popular search sites cover
less than 10% of the Web; the problem may well get
worse as millions of pages are added to the Web each year; chart (M)

The article opens with the following statement:

"Take a massive phone book and tear out most of the pages. What you have is a
lot like the listings provided by World Wide Web search engines."

The study being discussed in this article was conducted by C. Lee Giles and
other scientists at NEC Corp.'s research lab in Princeton, NJ. Giles is
quoted as saying:

"I don't think people realize how little coverage of the Web search engines
provide ... I was quite surprised."

He was surprised? He is a scientist working at NEC and he is just now figuring
out how inadequate most (if not all) search engines truly are! Give me a
break... what a lame quote.

As someone who spends a good portion of their day helping students with
research, I can fully attest to the inadequacy of search engines... what
amazes me is that people still continue to put their faith in them as if
they hold the key to the universe. Search engines can yield useful results if
a person has the ability to construct a proper search statement (something
which is beyond most people, especially the undergrads I work with). Most
people simply don't take the time to learn the syntactic peculiarities of each
search engine or have a full understanding of Boolean logic. They simply type
in one or two words and expect the perfect information to appear magically
before them. OOOPS! I've digressed from my topic and am now off on a
rant -- sorry :-)
[Note: I have just spent the last three hours doing reference duty... the
students I helped tonight seemed especially clueless... this is the source of
my rant...]
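
For anyone wondering what I mean by "Boolean logic" in a search statement,
here is a minimal sketch in Python. The four sample documents and the helper
names (build_index, term) are made up for illustration, and this is only a toy
model of the idea, not how any particular search engine is actually built. It
shows how AND, OR, and NOT narrow or widen a result set compared with typing
in a single word:

```python
# Toy collection: hypothetical document ids and text, invented for illustration.
DOCS = {
    1: "whale migration in the pacific ocean",
    2: "whale watching tours and ocean cruises",
    3: "pacific salmon migration",
    4: "history of whaling ships",
}


def build_index(docs):
    """Map each word to the set of document ids whose text contains it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(word, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)


def term(word):
    """Return the ids of documents containing the exact word (empty set if none)."""
    return INDEX.get(word, set())


# A one-word search returns everything that mentions the word.
one_word = term("whale")

# whale AND migration: only documents containing both words (set intersection).
and_result = term("whale") & term("migration")

# whale OR salmon: documents containing either word (set union).
or_result = term("whale") | term("salmon")

# migration AND NOT whale: documents about migration that never mention whale.
not_result = term("migration") - term("whale")

print(sorted(one_word), sorted(and_result), sorted(or_result), sorted(not_result))
```

Note that document 4 never turns up for "whale" because it only contains
"whaling". Exact-word matching like this is one reason a one-word search
misses so much.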

Okay... back to the topic. According to the WSJ (which is citing a study that
appeared in the journal Science), here are the latest stats on search engines:

SITE             ESTIMATED COVERAGE
HotBot           34%
AltaVista        28%
Northern Light   20%
Excite           14%
Infoseek         10%
Lycos             3%

As I reflect upon these numbers, I have two thoughts:

1) I don't care if HotBot covers more, I still prefer AltaVista and I still
think that the results I get using AltaVista are more relevant (on the whole)
than those I get when I use HotBot

2) Notice that Yahoo doesn't even count? This is something that I try to get
across to students all the time... Yahoo isn't really a search engine! This
article explains this fact quite nicely -- it is a searchable directory and NOT
an automated engine!

Ta!
Janie

"We are drowning in information but starved for knowledge"
-- John Naisbitt

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Janie L. Wilkins
Reference/ILL Librarian
Hobart and William Smith Colleges
Geneva, NY 14456

voice: (315) 781-3552
e-mail: wilkins@hws.edu
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~