
Mailing List Archive: Lucene: Java-User

Hits and TopDoc

 

 



natehoward1 at gmail

Oct 20, 2009, 2:03 PM

Post #1 of 7 (764 views)
Hits and TopDoc

This is sort of related to the above question, but I'm trying to update some
(now deprecated) Java/Lucene code that I became aware of once we started
using 2.4.1 (we were previously using 2.3.2):

Hits searchResults = multiSearcher.search(query);

int start = currentPage * resultsPerPage;
int stop = (currentPage + 1) * resultsPerPage;

for (int x = start; (x < searchResults.length()) && (x < stop); x++)
{
    Document doc = searchResults.doc(x);
    // do search post-processing with the Document
}

Results per page is normally small (10ish or so).

I'm having difficulty figuring out how to get TopDocs to replicate this
paging functionality (which the application must maintain).

Thanks in advance!

Nathan


yonik at lucidimagination

Oct 20, 2009, 2:22 PM

Post #2 of 7 (712 views)
Re: Hits and TopDoc [In reply to]


You do it the same way, basically: calculate the index of the last doc you
need ("stop" - 1 in your code), ask for enough TopDocs to cover it (that is,
"stop" of them), and then iterate over the page you want, calling
searcher.doc(topDocs.scoreDocs[x].doc)
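
The index arithmetic described above can be sketched in plain Java without a Lucene dependency. This is only an illustration under stated assumptions: the class name TopDocsPaging, its methods, and the int[] of ranked doc ids (standing in for the "doc" fields of topDocs.scoreDocs) are hypothetical and not part of Lucene.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: paging arithmetic over a ranked hit list (no Lucene types). */
final class TopDocsPaging {

    /** Number of top hits to request so the page is fully covered (the poster's "stop"). */
    static int hitsNeeded(int currentPage, int resultsPerPage) {
        return (currentPage + 1) * resultsPerPage;
    }

    /**
     * Returns the doc ids that fall on the zero-based page currentPage.
     * rankedDocIds is an assumed stand-in for the doc ids in topDocs.scoreDocs,
     * best hit first, holding at most hitsNeeded(...) entries.
     */
    static List<Integer> pageOf(int[] rankedDocIds, int currentPage, int resultsPerPage) {
        int start = currentPage * resultsPerPage;
        int stop = hitsNeeded(currentPage, resultsPerPage);
        List<Integer> page = new ArrayList<Integer>();
        // Same guard as the original Hits loop: stay inside the page and inside the results.
        for (int x = start; x < rankedDocIds.length && x < stop; x++) {
            // With Lucene this is where searcher.doc(topDocs.scoreDocs[x].doc) would be called.
            page.add(rankedDocIds[x]);
        }
        return page;
    }
}
```

With 25 ranked hits and 10 results per page, page 2 (zero-based) would hold the last five hits and page 3 would come back empty, because the loop stops at the end of the array.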

-Yonik
http://www.lucidimagination.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


sarowe at syr

Oct 20, 2009, 2:27 PM

Post #3 of 7 (706 views)
RE: Hits and TopDoc [In reply to]

Hi Nathan,


From <http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/search/Hits.html>:
=====
Deprecated. Hits will be removed in Lucene 3.0.

Instead e. g. TopDocCollector and TopDocs can be used:

TopDocCollector collector = new TopDocCollector(hitsPerPage);
searcher.search(query, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;
for (int i = 0; i < hits.length; i++) {
  int docId = hits[i].doc;
  Document d = searcher.doc(docId);
  // do something with current hit
  ...
=====

Construct the TopDocCollector with your "stop" variable instead of "hitsPerPage", initialize the loop control variable with the value of your "start" variable instead of 0, and you should be good to go.
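
To illustrate why the collector has to be sized with "stop" rather than "hitsPerPage", here is a rough, self-contained sketch of what a top-N collector does: it keeps only the N best-scoring hits seen so far. TopNCollectorSketch and its Hit class are hypothetical stand-ins, not Lucene's TopDocCollector, and tie-breaking between equal scores is not modelled.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical stand-in for a top-N collector; not Lucene's implementation. */
final class TopNCollectorSketch {

    /** Minimal hit record, loosely modelled on the doc/score pair of a ScoreDoc. */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    private final int maxHits;
    // Min-heap on score: the weakest retained hit sits at the head.
    private final PriorityQueue<Hit> weakestFirst;

    TopNCollectorSketch(int maxHits) {
        this.maxHits = maxHits;
        this.weakestFirst = new PriorityQueue<Hit>(Math.max(1, maxHits), new Comparator<Hit>() {
            public int compare(Hit a, Hit b) { return Float.compare(a.score, b.score); }
        });
    }

    /** Offers one matching document; it is kept only if it ranks among the best maxHits. */
    void collect(int doc, float score) {
        if (maxHits <= 0) return;
        if (weakestFirst.size() < maxHits) {
            weakestFirst.add(new Hit(doc, score));
        } else if (score > weakestFirst.peek().score) {
            weakestFirst.poll();                    // evict the current weakest hit
            weakestFirst.add(new Hit(doc, score));
        }
    }

    /** Retained hits, best score first. */
    Hit[] topHits() {
        Hit[] hits = weakestFirst.toArray(new Hit[0]);
        Arrays.sort(hits, new Comparator<Hit>() {
            public int compare(Hit a, Hit b) { return Float.compare(b.score, a.score); }
        });
        return hits;
    }
}
```

A collector built with only "hitsPerPage" slots would never retain the hits ranked on later pages, which is why the suggestion is to construct it with "stop" and then start the loop at "start".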

Steve




yonik at lucidimagination

Oct 20, 2009, 2:43 PM

Post #4 of 7 (709 views)
Re: Hits and TopDoc [In reply to]

Hmm, yes, I should have thought of quoting the javadoc :-)
The Hits javadoc has been updated though... we shouldn't be pushing
people toward collectors unless they really need them:

TopDocs topDocs = searcher.search(query, numHits);
ScoreDoc[] hits = topDocs.scoreDocs;
for (int i = 0; i < hits.length; i++) {
  int docId = hits[i].doc;
  Document d = searcher.doc(docId);
  // do something with current hit


-Yonik
http://www.lucidimagination.com





sarowe at syr

Oct 20, 2009, 3:00 PM

Post #5 of 7 (706 views)
RE: Hits and TopDoc [In reply to]

Hi Yonik,

Hmm, in what version of Hits do you see this updated javadoc? In the 2.9.0 version, the only change in the Hits javadoc from the 2.4.1 version in this section is that it refers to TopScoreDocCollector instead of TopDocCollector:

http://lucene.apache.org/java/2_9_0/api/core/org/apache/lucene/search/Hits.html

And, of course, Hits has now been removed from trunk as part of the deprecation cleansing ritual.

Steve



lucene at mikemccandless

Oct 20, 2009, 3:20 PM

Post #6 of 7 (705 views)
Re: Hits and TopDoc [In reply to]

That update to the Hits javadoc didn't make 2.9.0, but will be in
2.9.1 (it's committed to the 2.9.x branch now).

Mike



sarowe at syr

Oct 20, 2009, 3:29 PM

Post #7 of 7 (700 views)
RE: Hits and TopDoc [In reply to]

Aha, my bad - I looked on ViewVC at the 2.9.0 *tag*, not the 2.9 *branch*, and LUCENE-1955 emails went in one speaker and out the other.

Steve

