Re: altavista


mgraffam@idsi.net
Thu, 1 Apr 1999 13:26:59 -0500 (EST)


On Thu, 1 Apr 1999, Wes Bauske wrote:

> I thought they processed HTML into a word list for each
> document. Probably use a sorted binary search tree against all
> words, and also maybe eliminate uninteresting words to
> reduce size.

I agree; I figure they have to use some kind of sorted search method, and
it's probably a really optimized insertion sort hand-coded in assembler :)
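
To make the idea concrete, here is a rough sketch of the scheme as we're guessing at it: turn each HTML page into a word list, drop the uninteresting words, keep everything in one sorted list, and binary-search it. None of this is known to be what AltaVista actually does; the stop-word list and the names words_from_html, build_index and lookup are made up for illustration, and it sorts once rather than doing insertion sort.

```python
import bisect
import re

# Hypothetical stop-word list; a real engine's list is unknown to us.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}

def words_from_html(html):
    """Strip tags crudely and return the set of interesting lowercase words."""
    text = re.sub(r"<[^>]*>", " ", html)
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}

def build_index(docs):
    """docs maps doc id -> HTML. Returns a sorted list of (word, doc_id) pairs."""
    pairs = []
    for doc_id, html in docs.items():
        for word in words_from_html(html):
            pairs.append((word, doc_id))
    pairs.sort()
    return pairs

def lookup(index, word):
    """Binary-search the sorted pairs and collect every doc id for word."""
    pos = bisect.bisect_left(index, (word,))
    hits = []
    while pos < len(index) and index[pos][0] == word:
        hits.append(index[pos][1])
        pos += 1
    return hits
```

Sorting the (word, doc_id) pairs together means all the documents for one word sit next to each other, so a single binary search finds the start of the run.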

> Then all you need is fast disk and lots of memory.

Yeah.. seems to me that some sort of fast shared-disk setup would be best
here.. then you have one machine sorting in new pages and weeding out the
old ones, while the other machines handle searches for the users.
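
A minimal sketch of that split, assuming the index is a sorted list of (word, doc_id) pairs kept in a plain tab-separated file: the indexing machine merges in a sorted batch of new pages, drops pairs for stale pages, and renames the result over the live file so searchers never read a half-written index. The file layout and the names merge_index, publish and load are hypothetical, and the atomic rename is only something I'd count on for a local POSIX filesystem; how it behaves on a given shared disk is an assumption.

```python
import heapq
import os

def merge_index(old, new, stale_docs):
    """Merge two sorted (word, doc_id) lists, dropping pairs for stale docs."""
    merged = []
    for pair in heapq.merge(old, new):
        if pair[1] in stale_docs:
            continue  # weed out pages that are gone
        if merged and merged[-1] == pair:
            continue  # skip exact duplicates
        merged.append(pair)
    return merged

def publish(index, path):
    """Write the index to a temp file, then rename it over the live one."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        for word, doc_id in index:
            f.write(f"{word}\t{doc_id}\n")
    os.replace(tmp, path)  # readers see either the old file or the new one

def load(path):
    """What a search machine would do: read the current index back in."""
    with open(path) as f:
        rows = (line.rstrip("\n").split("\t") for line in f)
        return [(word, int(doc_id)) for word, doc_id in rows]
```

Since both inputs are already sorted, the merge is a single linear pass, which is cheaper than re-sorting the whole index each time new pages come in.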

Michael J. Graffam (mgraffam@idsi.net)
"86% of conspiracy theories have some basis in truth... but, oddly enough,
it's that last 14% that usually gets you killed."
    --Talas (http://cadvantage.com/~algaeman/conspiracy/public.htm)



