
Mailing List Archive: Lucene: Java-User
Re: lucene algorithm ?
 



ab at getopt

Apr 26, 2012, 7:21 AM



On 26/04/2012 09:49, Uwe Schindler wrote:

> It is possible to truncate those lists, but this is not implemented in
> Lucene core. The main problem with Lucene's segmented index structure is
> that you cannot exit early, because the very last document in the posting
> list could be one with a very high score. Techniques like index pruning or
> index sorting can optimize all this, but they need index preprocessing and
> lose updatability of the index while in use.
>
> The Top-N result collection is optimized in Lucene (item #3) to throw away
> all collected documents whose score can never get into the top-n priority
> queue (score too small once the queue is full). But the score still has to
> be calculated.

This paper presents yet another option: keep a value that represents the
max. impact of the postings in each block, and then skip whole blocks if
the max impact is too low for any doc within the block to get into the queue:

http://cis.poly.edu/suel/papers/bmw.pdf

Implementing this in Lucene would require changing the iterators so that
they can take a threshold value, i.e. the current lowest score in the
queue, so that nextDoc() and advance() could skip whole blocks when
their max impact is lower than the current lowest score.
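
To make the idea concrete, here is a rough sketch in plain Java. It is not
Lucene code and not the algorithm from the paper: it handles only a single
posting list, and every name in it (BlockMaxSketch, Block, Hit, Result, topN)
is made up for illustration. It assumes each block stores its max impact
next to its postings, and that a posting's impact is its final score.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: none of these types exist in Lucene.
class BlockMaxSketch {

    /** One block of a posting list, with the largest impact it contains. */
    static final class Block {
        final int[] docIds;
        final float[] impacts;
        final float maxImpact;

        Block(int[] docIds, float[] impacts) {
            this.docIds = docIds;
            this.impacts = impacts;
            float max = Float.NEGATIVE_INFINITY;
            for (float v : impacts) {
                max = Math.max(max, v);
            }
            this.maxImpact = max;
        }
    }

    static final class Hit {
        final int docId;
        final float score;

        Hit(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    static final class Result {
        final List<Hit> hits;     // best hits, highest score first
        final int skippedBlocks;  // blocks never decoded or scored

        Result(List<Hit> hits, int skippedBlocks) {
            this.hits = hits;
            this.skippedBlocks = skippedBlocks;
        }
    }

    static Result topN(List<Block> blocks, int n) {
        // Min-heap: the head is the current lowest score in the queue.
        PriorityQueue<Hit> queue =
            new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));
        int skipped = 0;

        for (Block block : blocks) {
            // Until the queue is full, nothing can be skipped.
            float threshold = queue.size() < n
                ? Float.NEGATIVE_INFINITY
                : queue.peek().score;

            // No doc in this block can beat the threshold: skip it whole.
            if (block.maxImpact <= threshold) {
                skipped++;
                continue;
            }

            for (int i = 0; i < block.docIds.length; i++) {
                float score = block.impacts[i];
                if (queue.size() < n) {
                    queue.add(new Hit(block.docIds[i], score));
                } else if (score > queue.peek().score) {
                    queue.poll();
                    queue.add(new Hit(block.docIds[i], score));
                }
            }
        }

        List<Hit> hits = new ArrayList<>(queue);
        hits.sort((a, b) -> Float.compare(b.score, a.score));
        return new Result(hits, skipped);
    }
}
```

In this sketch the threshold is read directly from the queue inside one
loop. In Lucene it would have to be passed down into the iterators, which is
the change described above.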

--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com

