
Mailing List Archive: Lucene: Java-User

pruning package- question about termpositions && skipTo

 

 



zpvie at yahoo

Aug 14, 2012, 6:53 AM

Post #1 of 2
pruning package- question about termpositions && skipTo

Hi to all,

In the pruning package, the documentation for the
pruneAllPositions(TermPositions termPositions, Term t) method says:

"termPositions - positioned term positions. Implementations MUST NOT advance
this by calling TermPositions methods that advance either the position
pointer (next, skipTo) or term pointer (seek)."

Why?

Here is why I need to call skipTo:

I added a new pruning class with a public void
initPositionsTerm(TermPositions tp, Term t, ScoreDoc[] sdoc) method. I
needed it because my ScoreDoc[] is generated from external parameters
applied on top of the basic Lucene results. In initPositionsTerm, instead
of letting the method collect the docs itself as the other classes do, docs
is simply set to sdoc. For example, for a term x, sdoc = {42813, 123472,
22477, 76995, 47086, 106424, 68570, 26708, 49740, 116472}, and the sorted
docs = {22477, 26708, 42813, 47086, ...}. I just want to keep these postings
in my pruned index.

The problem is that when I call pruneAllPositions as it is, it returns only
{22477, 26708, *107377*}. After 28118, super.next() returns false in
PruningTermPositions.next(), so (termPositions.doc() == docs[docsPos].doc)
is never true for docIds > 28118. (I have no idea where 107377 comes from;
it is not even in my docs.) However, when I iterate over termPositions in
pruneAllPositions with the code below, I see all the docids that I need.
That is why I wonder why I cannot call skipTo, and why termPositions
behaves this way.

while (termPositions.next()) {
    System.out.println(termPositions.doc());
}
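
For reference, the comparison in pruneAllPositions can be pictured as a lock-step walk over two ascending lists of document IDs, which only matches when both lists use the same numbering. The following is a plain-Java stand-in, not the pruning package's code: postingDocs plays the role of the IDs seen through successive termPositions.next() / doc() calls, wanted plays the role of the ScoreDoc[] docs, and the class name, method name, and the sample posting list are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LockStepMatch {
    // Returns the IDs present in both lists. postingDocs must be ascending,
    // as postings are; wanted may be in any order and is sorted on a copy.
    static List<Integer> keep(int[] postingDocs, int[] wanted) {
        int[] docs = wanted.clone();
        Arrays.sort(docs);
        List<Integer> kept = new ArrayList<>();
        int docsPos = 0;
        for (int doc : postingDocs) {        // stands in for termPositions.next()
            // Advance only our own pointer; the postings are never skipped.
            while (docsPos < docs.length && docs[docsPos] < doc) {
                docsPos++;
            }
            if (docsPos < docs.length && docs[docsPos] == doc) {
                kept.add(doc);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        int[] wanted = {42813, 123472, 22477, 76995, 47086,
                        106424, 68570, 26708, 49740, 116472};
        // Hypothetical postings of one segment whose IDs stop below 28118.
        int[] segmentPostings = {502, 1992, 22477, 26708, 28000};
        System.out.println(keep(segmentPostings, wanted)); // prints [22477, 26708]
    }
}
```

With these assumed inputs, only the wanted IDs that fall inside the segment's own ID range can match, which resembles the truncated result described above.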

Thanks in advance,
Best Regards



--
View this message in context: http://lucene.472066.n3.nabble.com/pruning-package-question-about-termpositions-skipTo-tp4001160.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


zpvie at yahoo

Aug 22, 2012, 7:27 AM

Post #2 of 2
Re: pruning package- question about termpositions && skipTo [In reply to]

Hi to all,

I found the problem and the solution. PruningReader uses
super.getSequentialSubReaders(). After 28118, super.next() returns false
because the TermPositions comes from a sub-reader for a single segment, and
IndexReader.maxDoc() is 28118 for that segment. In pruneAllPositions,
instead of comparing termPositions.doc() to the docid, I compared
in.document(termPositions.doc()).getField("docid").stringValue() to the docid.

This happened because of my custom initPositionsTerm method (public void
initPositionsTerm(TermPositions tp, Term t, *ScoreDoc[] sdoc*)). There is
no problem with the other pruning policies.

DocID     termPositions.doc()
22477     22477
26708     26708
42813     14093
47086     18366
49740     21020
68570     11760
76995     20185
106424    21524
116472    502
123472    1992
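
One way to read the table above is to subtract each termPositions.doc() value from its DocID. The sketch below does only that arithmetic on the ten rows listed; the class and method names are hypothetical, and treating a constant difference as a per-segment starting offset is an interpretation, not something the pruning package reports.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SegmentOffsets {
    // Counts rows per difference (docId - localDoc), in order of first appearance.
    // A difference shared by several rows is what per-segment numbering produces.
    static Map<Integer, Integer> countByOffset(int[] docIds, int[] localDocs) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i < docIds.length; i++) {
            counts.merge(docIds[i] - localDocs[i], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // The two columns of the table, row by row.
        int[] docIds = {22477, 26708, 42813, 47086, 49740,
                        68570, 76995, 106424, 116472, 123472};
        int[] local  = {22477, 26708, 14093, 18366, 21020,
                        11760, 20185, 21524, 502, 1992};
        System.out.println(countByOffset(docIds, local));
        // prints {0=2, 28720=3, 56810=2, 84900=1, 115970=1, 121480=1}
    }
}
```

The rows fall into six groups with a constant difference inside each group, which is consistent with termPositions.doc() being numbered from zero within each sub-reader.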

Best Regards





--
View this message in context: http://lucene.472066.n3.nabble.com/pruning-package-question-about-termpositions-skipTo-tp4001160p4002656.html
