From: Sandor Spruit (aspruit@acm.org)
Date: Fri Jun 30 2000 - 01:45:09 PDT
Kragen,
On Thursday, June 29, 2000, 1:08:38 PM, you wrote:
Kragen> Nicolas Popp writes:
>> I am not sure I understand your point. All I am saying is the type of
>> distributed search that lets the destination control the answer to a query
>> based on inconsistent criteria (the one that the destination decides upon)
>> is likely to produce low quality results. I am not saying that all search
>> engines do.
>>
>> Google is the perfect example of a centralized search engine. . . .
Kragen> Google summarizes other people's assessments of page quality
Kragen> --- as expressed by whether or not those other people choose
Kragen> to link to the page --- to determine its rank. While the
Kragen> search engine itself and the PageRank computation are
Kragen> centralized, the decisions that determine whether a page is
Kragen> valuable or not are distributed over the whole knot of the Web
Kragen> bow-tie.
[major snip]
[Note: I haven't really read all the stuff about Gnutella yet; I just
have a vague impression of what it does]
What I was wondering while I read this: why don't we have a whole
bunch of distributed search engines yet? Isn't it *weird* that an
activity that lends itself so well to both distributed and parallel
processing is still handled almost exclusively by a handful of
commercial companies?
Wouldn't searching be much easier if there were, say, an Apache
indexing module that could dynamically query other known servers in
its environment? Some search engines seem to work surprisingly well
on huge numbers of pages. What if dozens of people applied the
same technology on their own servers and had them communicate?
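A minimal sketch of that idea, under stated assumptions: each server
keeps a small local index and, when asked, also queries the peers it
already knows about. The `Node` class, its methods, and the host names
are hypothetical, and the "network" is just in-process method calls;
this is not an Apache module or any existing protocol.

```python
from collections import defaultdict


class Node:
    """One hypothetical server: a local inverted index plus known peers."""

    def __init__(self, name):
        self.name = name
        self.index = defaultdict(set)   # term -> set of local page URLs
        self.peers = []                 # other Node objects this server knows

    def add_page(self, url, text):
        # Index every lowercase, whitespace-separated word of the page.
        for term in text.lower().split():
            self.index[term].add(url)

    def search_local(self, query):
        # Pages containing every query term (simple AND semantics).
        terms = query.lower().split()
        if not terms:
            return set()
        hits = set(self.index.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self.index.get(term, set())
        return hits

    def search(self, query):
        # Answer from the local index, then ask each directly known peer once.
        results = {(self.name, url) for url in self.search_local(query)}
        for peer in self.peers:
            results |= {(peer.name, url) for url in peer.search_local(query)}
        return sorted(results)


a, b = Node("a.example"), Node("b.example")
a.add_page("/gnutella.html", "notes on gnutella search")
b.add_page("/freenet.html", "freenet distributed search")
a.peers.append(b)
print(a.search("search"))
```

Here a query only goes one hop, to directly known peers, so the traffic
stays bounded; the price is that pages on servers nobody nearby knows
about are never found.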
I know, or at least suspect from what I've read, that this is basically
part of what Gnutella - or Freenet, for that matter - is all about. But
why hasn't this happened before, given that *all* the ingredients
seem to have been around for ages: free web servers and search engines
that can be installed locally? All that is missing is a decent protocol
(I understand Gnutella's not great) that efficiently sends both queries
and results around without causing too much traffic. Not that this is
an easy thing to do, but still ...
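To make the traffic problem concrete, here is a rough sketch of one
common way to keep forwarded queries from multiplying: give each query
an ID and a hop limit (TTL), and have every peer ignore IDs it has
already seen. The names `Peer`, `handle`, and `flood_search` are
hypothetical, the network is simulated with plain method calls, and
this is not meant to reproduce Gnutella's or Freenet's actual wire
protocol.

```python
import itertools

_query_ids = itertools.count(1)   # simple source of unique query IDs


class Peer:
    def __init__(self, name, documents):
        self.name = name
        self.documents = documents   # list of strings held by this peer
        self.neighbours = []         # peers this one forwards queries to
        self.seen = set()            # query IDs already handled here

    def handle(self, query_id, term, ttl, stats):
        stats["messages"] += 1       # every delivery counts as traffic
        if query_id in self.seen:
            return []                # duplicate: drop it, do not forward
        self.seen.add(query_id)
        hits = [(self.name, d) for d in self.documents if term in d.lower()]
        if ttl > 0:                  # forward only while hops remain
            for n in self.neighbours:
                hits += n.handle(query_id, term, ttl - 1, stats)
        return hits


def flood_search(origin, term, ttl=2):
    """Return (hits, number of messages delivered) for one query."""
    stats = {"messages": 0}
    hits = origin.handle(next(_query_ids), term.lower(), ttl, stats)
    return hits, stats["messages"]
```

Even with both safeguards, the message count grows quickly with the
number of neighbours per peer, which is the "too much traffic" worry:
the TTL trades completeness of results for bandwidth.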
Sandor
--
ir A.G.L. Spruit, Utrecht University, the Netherlands
Institute of information and computing sciences
"There is a bit of magic in everything,
and then some loss to even things out"
(from: Lou Reed, "Magic and Loss")