Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Lucene: Java-User
Re: lucene algorithm ?
 

Index | Next | Previous | View Flat


teddyyyy123 at gmail

Apr 25, 2012, 2:20 PM


Views: 499
Permalink
Re: lucene algorithm ? [In reply to]

additionally, anybody knows roughly (of course the details are a secret,
but I guess the main ideas should be
common enough these days) how google does fast ranking in cases of
multi-term queries with AND ?
(if their postings are sorted by PageRank order, then it's understandable
that a single term query would quickly return the top-k, but if it's
multi-term, they would have to traverse the entire lists to find the
insersection set, because the lists are not sorted by docId, as in the
Lucene paper case)



On Wed, Apr 25, 2012 at 2:13 PM, Yang <teddyyyy123 [at] gmail> wrote:

> I read the paper by Doug "Space optimizations for total ranking",
>
> since it was written a long time ago, I wonder what algorithms lucene uses
> (regarding postings list traversal and score calculation, ranking)
>
>
> particularly the total ranking algorithm described there needs to traverse
> down the entire postings list for all the query terms,
> so in case of very common query terms like "yellow dog", either of the 2
> terms may have a very very long postings list in case of web search,
> are they all really traversed in current lucene/Solr ? or any heuristics
> to truncate the list are actually employed?
>
> in the case of returning top-k results, I can understand that partitioning
> the postings list into multiple machines, and then combining the top-k
> from each would work,
> but if we are required to return "the 100th result page", i.e. results
> ranked from 990--1000th, then each partition would still have to find out
> the top 1000, so
> partitioning would not help much.
>
>
> overall, is there any up-to-date detailed docs on the internal algorithms
> of lucene?
>
> Thanks a lot
> Yang
>

Subject User Time
lucene algorithm ? teddyyyy123 at gmail Apr 25, 2012, 2:13 PM
    Re: lucene algorithm ? teddyyyy123 at gmail Apr 25, 2012, 2:20 PM
    Re: lucene algorithm ? ralf.heyde at gmx Apr 26, 2012, 12:17 AM
        Re: lucene algorithm ? teddyyyy123 at gmail Apr 27, 2012, 12:45 PM
    RE: lucene algorithm ? uwe at thetaphi Apr 26, 2012, 12:49 AM
        Re: lucene algorithm ? ab at getopt Apr 26, 2012, 7:21 AM
    Re: lucene algorithm ? fancyerii at gmail Apr 27, 2012, 12:25 AM
    Re: lucene algorithm ? teddyyyy123 at gmail Apr 27, 2012, 12:57 PM

  Index | Next | Previous | View Flat
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.