designer seven (designer_7@yahoo.com)
Thu, 1 Apr 1999 11:17:28 -0800 (PST)
Hmm.. regardless of how they are doing it, I don't
think they are doing it well. I don't think they
give priority to where the search text is located
in a page, or use other "accuracy" techniques, so it's
not too interesting. But if you're looking for a search
engine algorithm with source code, check out Htdig.
They have source available, and they seem to do some
"fuzzy logic" techniques too. Would be interesting to
read.
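
To show what I mean by giving priority to where the search text is
located, here is a rough sketch. The field names and weights are made
up for illustration; this is not how Htdig or any real engine scores
pages, just the general idea that a hit in the title should count for
more than a hit in the body.

```python
# Hypothetical weights: a match in the title counts more than one in the body.
FIELD_WEIGHTS = {"title": 5.0, "heading": 3.0, "body": 1.0}


def score(page, query):
    """Sum weighted term counts over the fields of one page.

    `page` is a dict mapping a field name to its text (an assumed layout).
    """
    terms = query.lower().split()
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = page.get(field, "").lower().split()
        for term in terms:
            total += weight * words.count(term)
    return total


def rank(pages, query):
    """Return names of matching pages, best score first, ties broken by name."""
    scored = [(score(page, query), name) for name, page in pages.items()]
    scored = [(s, name) for s, name in scored if s > 0]
    return [name for s, name in sorted(scored, key=lambda t: (-t[0], t[1]))]
```

With these weights, a page with one hit in its title outranks a page
with two hits buried in its body.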
On the bright side, I found Digital's paper on the
Burrows-Wheeler compression algorithm, the one used in bzip. That
was interesting. I'd be interested to see whether
significant improvements can be made to optimize
that algorithm. What can I say, my math background is
getting to me.
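
For anyone who hasn't seen it, here is a naive sketch of the
Burrows-Wheeler transform itself. It sorts every rotation of the input,
so it is far too slow for real use. It also marks the end of the input
with a sentinel character (my own simplification, named `eos` here)
rather than recording the row index of the original string, so don't
take it as the way bzip does it.

```python
def bwt(text, eos="\0"):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    if eos in text:
        raise ValueError("input must not contain the end-of-string marker")
    s = text + eos
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last, eos="\0"):
    """Invert bwt() by rebuilding the sorted rotation table column by column."""
    table = [""] * len(last)
    for _ in range(len(last)):
        # Prepend the last column to every row, then re-sort.
        table = sorted(last[i] + table[i] for i in range(len(last)))
    # The original string is the rotation that ends with the marker.
    row = next(r for r in table if r.endswith(eos))
    return row[:-1]
```

The transform does not compress anything by itself. It tends to group
identical characters together, which makes the output easier for a
later coding stage to squeeze.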
D.
--- mgraffam@idsi.net wrote:
> On Thu, 1 Apr 1999, Wes Bauske wrote:
>
> > I thought they processed HTML into a word list for each
> > document. Probably use a sorted binary search tree against all
> > words, and also maybe eliminate uninteresting words to
> > reduce size.
>
> I agree; I figure they have to use a sorted search method.. and probably
> are really optimized insertion sort hand coded in assembler :)
>
> > Then all you need is fast disk and lots of memory.
>
> Yeah.. seems to me that some sort of fast shared disk method would be best
> here.. then you have one machine sorting in new pages, and weeding out the
> old, while having other machines search for the user.
>
> Michael J. Graffam (mgraffam@idsi.net)
> "86% of conspiracy theories have some basis in truth... but, oddly enough,
> it's that last 14% that usually gets you killed."
> --Talas (http://cadvantage.com/~algaeman/conspiracy/public.htm)
>
> --
> To unsubscribe: send e-mail to axp-list-request@redhat.com with
> 'unsubscribe' as the subject. Do not send it to axp-list@redhat.com
This archive was generated by hypermail 2.0b3 on Thu Apr 01 1999 - 11:43:39 PST