Mailing List Archive: kinosearch: discuss

Scorer::doc


sprout at cpan

Mar 1, 2008, 1:09 PM

Post #1 of 3
Scorer::doc

When is a Scorer’s doc method called? The WildCardQuery recipe
implements this method, but, as far as I have been able to ascertain,
the ‘next’ and ‘tally’ methods are called one after the other during a
search, while ‘doc’ is not called at all. (???)

In the example in the cookbook, the first call to ‘next’ returns the
first doc number, while ‘doc’ returns the doc number that the *next*
call to ‘next’ will return. This doesn’t make sense to me. Is there a
mistake in the cookbook?


_______________________________________________
KinoSearch mailing list
KinoSearch [at] rectangular
http://www.rectangular.com/mailman/listinfo/kinosearch


marvin at rectangular

Mar 1, 2008, 2:26 PM

Post #2 of 3
Re: Scorer::doc

On Mar 1, 2008, at 1:09 PM, Father Chrysostomos wrote:

> When is a Scorer’s doc method called?

Scorer::doc really means "get_doc_num". It is supposed to return the
same document number that next() just returned.

    while ( my $doc_num = $scorer->next ) {
        die "broken scorer" unless $scorer->get_doc_num == $doc_num;
    }

The method name came from Lucene and I've been thinking of changing it
for the sake of stylistic consistency with all the other KS getters.
This bug report cements it -- because if you'd known that it was
get_doc_num, it would have been plain that the WildCardQuery recipe
was incorrect.

[What will soon be] Scorer::get_doc_num gets called all the time
internally. For instance, when one subscorer within an ORScorer has a
very high document number, it needs to stay on that document number
until all the other subscorers catch up, over multiple calls to
ORScorer_Next.
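
To make the catch-up behaviour concrete, here is a minimal sketch in plain Perl. ListScorer and SimpleORScorer are hypothetical stand-ins written for this illustration, not the KinoSearch classes, and the real ORScorer is implemented differently. The point is only that the parent has to ask each child for its current doc number without advancing it.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a subscorer: walks a sorted list of doc
# numbers (all >= 1). Not the KinoSearch class.
package ListScorer;

sub new {
    my ( $class, @doc_nums ) = @_;
    return bless { doc_nums => [@doc_nums], tick => -1 }, $class;
}

# Advance, then return the new doc number, or 0 when exhausted.
sub next {
    my $self = shift;
    my $tick = ++$self->{tick};
    return 0 if $tick >= scalar @{ $self->{doc_nums} };
    return $self->{doc_nums}[$tick];
}

# Return the doc number that the last call to next() returned.
sub get_doc_num {
    my $self = shift;
    my $tick = $self->{tick};
    return 0 if $tick < 0 or $tick >= scalar @{ $self->{doc_nums} };
    return $self->{doc_nums}[$tick];
}

# Hypothetical OR scorer: matches any doc matched by at least one child.
package SimpleORScorer;

sub new {
    my ( $class, @children ) = @_;
    return bless { children => [@children], doc_num => 0 }, $class;
}

sub next {
    my $self = shift;
    my $last = $self->{doc_num};
    my $min  = 0;
    for my $child ( @{ $self->{children} } ) {
        # A child parked on a higher doc number is left where it is;
        # only children at or behind the last match are advanced.
        my $doc_num = $child->get_doc_num;
        $doc_num = $child->next if $doc_num <= $last;
        next unless $doc_num;    # this child is exhausted
        $min = $doc_num if !$min or $doc_num < $min;
    }
    return $self->{doc_num} = $min;
}

sub get_doc_num { return $_[0]{doc_num} }

package main;

my $or_scorer = SimpleORScorer->new(
    ListScorer->new( 2, 9 ),
    ListScorer->new( 1, 3, 4 ),
);
my @matches;
while ( my $doc_num = $or_scorer->next ) {
    push @matches, $doc_num;
}
print "@matches\n";    # prints "1 2 3 4 9"
```

In this sketch the first child sits on doc 9 across two calls to SimpleORScorer::next while the second child works through 3 and 4, which is only possible because get_doc_num is side-effect free.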

It used to be the case that Scorer_Next returned a boolean. It was
only after I implemented Nathan's suggestion of having document
numbers start at 1 that it was possible to have Scorer_Next() return
the document number and have that double as a boolean.
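
A small sketch of why the 1-based numbering matters, using hypothetical helpers (make_iterator and collect are invented for this illustration and are not KinoSearch functions): if doc numbers could be 0, a loop that treats the return value as a boolean would stop at the first document.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an iterator that yields each doc number
# in turn and then 0, mimicking a next() that doubles as a boolean.
sub make_iterator {
    my @doc_nums = @_;
    my $tick     = -1;
    return sub {
        $tick++;
        return $tick < scalar(@doc_nums) ? $doc_nums[$tick] : 0;
    };
}

# Drain an iterator using the "return value is also the loop condition"
# idiom and report which doc numbers were seen.
sub collect {
    my $next = shift;
    my @seen;
    while ( my $doc_num = $next->() ) {
        push @seen, $doc_num;
    }
    return @seen;
}

my @one_based  = collect( make_iterator( 1, 2, 3 ) );    # sees all three
my @zero_based = collect( make_iterator( 0, 1, 2 ) );    # stops at doc 0
print scalar(@one_based), " vs ", scalar(@zero_based), "\n";    # "3 vs 0"
```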

> In the example in the cookbook, the first call to ‘next’ returns the
> first doc number, while ‘doc’ returns the doc number that the *next*
> call to ‘next’ will return. This doesn’t make sense to me. Is there
> a mistake in the cookbook?

Yeah, I botched it, sorry. Fixed by r3093, patch pasted below.

(I'd really like a way of running code samples in documentation. And
a pony.)

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


Modified: trunk/perl/lib/KinoSearch/Docs/Cookbook/WildCardQuery.pod
===================================================================
--- trunk/perl/lib/KinoSearch/Docs/Cookbook/WildCardQuery.pod  2008-03-01 21:04:02 UTC (rev 3092)
+++ trunk/perl/lib/KinoSearch/Docs/Cookbook/WildCardQuery.pod  2008-03-01 22:34:50 UTC (rev 3093)
@@ -185,7 +185,7 @@
         my @doc_nums = sort { $a <=> $b } keys %all_doc_nums;
         $doc_nums{$id} = \@doc_nums;
 
-        $tick{$id} = 0;
+        $tick{$id} = -1;
         $tally{$id} = KinoSearch::Search::Tally->new;
         $tally{$id}->set_score(1.0); # fixed score of 1.0
 
@@ -208,8 +208,9 @@
         my $self = shift;
         my $id = refaddr($self);
         my $doc_nums = $doc_nums{$id};
-        return 0 if $tick{$id} >= scalar @$doc_nums;
-        return $doc_nums[ $tick{$id}++ ];
+        my $tick = ++$tick{$id};
+        return 0 if $tick >= scalar @$doc_nums;
+        return $doc_nums[$tick];
     }
 
 next() advances the Scorer to the next valid matching doc. In this example,
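
The effect of the tick change can be seen in isolation with a sketch like the following. make_scorer is a hypothetical helper written for this illustration, and it assumes that the recipe's doc method simply reads the array element at the current tick, which matches the behaviour described in the original question but is not shown in the patch.

```perl
use strict;
use warnings;

# Build a (next, doc) pair of closures over a sorted list of doc numbers.
# $patched selects between the rev 3092 and rev 3093 tick handling.
# Hypothetical helper, not part of KinoSearch.
sub make_scorer {
    my ( $patched, @doc_nums ) = @_;
    my $tick = $patched ? -1 : 0;

    my $next;
    if ($patched) {
        # rev 3093: advance first, then report the current position.
        $next = sub {
            $tick++;
            return 0 if $tick >= scalar @doc_nums;
            return $doc_nums[$tick];
        };
    }
    else {
        # rev 3092: report the position, then step past it.
        $next = sub {
            return 0 if $tick >= scalar @doc_nums;
            return $doc_nums[ $tick++ ];
        };
    }

    # Assumed doc(): reads whatever the tick currently points at.
    my $doc = sub {
        return 0 if $tick < 0 or $tick >= scalar @doc_nums;
        return $doc_nums[$tick];
    };

    return ( $next, $doc );
}

my ( $old_next, $old_doc ) = make_scorer( 0, 3, 7, 12 );
my ( $new_next, $new_doc ) = make_scorer( 1, 3, 7, 12 );

my $old_returned = $old_next->();
my $old_current  = $old_doc->();    # 7: already pointing at the next doc
my $new_returned = $new_next->();
my $new_current  = $new_doc->();    # 3: agrees with what next() returned

print "old: next=$old_returned doc=$old_current\n";
print "new: next=$new_returned doc=$new_current\n";
```

Under that assumption the old tick handling leaves doc one position ahead of next, while the patched version keeps the two in agreement.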




sprout at cpan

Mar 1, 2008, 4:16 PM

Post #3 of 3
Re: Scorer::doc

On Mar 1, 2008, at 2:26 PM, Marvin Humphrey wrote:

> I'd really like a way of running code samples in documentation.

In this particular case, since you’ve kept your indentation
consistent, something like this should work:


    use Test::More tests => ...;

    # Collect every indented (verbatim) line of the POD and compile it.
    open my $pod_fh, '<', "lib/KinoSearch/Docs/Cookbook/WildCardQuery.pod"
        or die $!;
    my $code;
    while (<$pod_fh>) {
        $code .= $_ if /^ /;
    }
    eval $code;
    die $@ if $@;

    # tests .....
    etc.
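
For anyone who wants to try the idea without the KinoSearch source tree, here is a self-contained sketch of the same technique. The POD text and the add and double subs are invented for the illustration; only the extraction loop mirrors the suggestion above.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical POD standing in for a cookbook file. Verbatim (code)
# paragraphs are the indented ones.
my $pod = <<'END_POD';
=head1 NAME

Sample - made-up documentation with code samples

=head1 SYNOPSIS

    sub add { return $_[0] + $_[1] }

Some prose between the samples.

    sub double { return 2 * $_[0] }

=cut
END_POD

# Read the POD through an in-memory filehandle and keep indented lines.
my $code = '';
open my $pod_fh, '<', \$pod or die $!;
while (<$pod_fh>) {
    $code .= $_ if /^ /;
}
close $pod_fh;

# Compile the extracted samples; the trailing 1 makes success detectable.
eval "$code; 1" or die "sample code failed to compile: $@";

is( add( 2, 3 ), 5, 'add() from the POD works' );
is( double(4),   8, 'double() from the POD works' );
```

This only works when every indented line is Perl and the samples can be compiled as one unit, which is the condition noted above about consistent indentation.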
