
Mailing List Archive: Lucene: Java-User

Coord issue

 

 



Pascal.Chollet at local

Aug 8, 2012, 5:47 AM

Post #1 of 2
Coord issue

Hi

We are using Solr 4 with a custom query tree. For boolean queries, the score should not be the sum of all sub-scores but their mean, i.e. the sum of the sub-scores divided by the number of sub-scorers.

To achieve this, I wanted to use the coord factor. So I'm using a custom similarity with the following method:
@Override
public float coord(int overlap, int maxOverlap) {
    return overlap == 0 ? 0 : 1.0f / overlap;
}
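
As a rough illustration of the intent (a sketch, not Lucene code; the class MeanCoordDemo and the method combine are made up for this example), applying this coord exactly once to a sum of sub-scores yields their mean, assuming every clause matches so that overlap equals the number of sub-scorers:

```java
// Illustrative sketch only: MeanCoordDemo and combine() are hypothetical names,
// not part of Lucene or Solr.
final class MeanCoordDemo {

    // Same formula as the coord override in the post.
    static float coord(int overlap, int maxOverlap) {
        return overlap == 0 ? 0 : 1.0f / overlap;
    }

    // Sums the sub-scores and applies coord exactly once.
    // Assumes every sub-scorer matched, so overlap == subScores.length.
    static float combine(float[] subScores) {
        float sum = 0.0f;
        for (float s : subScores) {
            sum += s;
        }
        return sum * coord(subScores.length, subScores.length);
    }

    public static void main(String[] args) {
        // (2.0 + 4.0) * (1 / 2) = 3.0, the mean of the two sub-scores.
        System.out.println(combine(new float[] {2.0f, 4.0f}));
    }
}
```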

After some debugging I found that the coord factor is multiplied into the score twice. Once in BooleanScorer2:
@Override
public float score() throws IOException {
    coordinator.nrMatchers = 0;
    float sum = countingSumScorer.score();
    return sum * coordinator.coordFactors[coordinator.nrMatchers];
}

and then also in ConjunctionScorer:
@Override
public float score() throws IOException {
    float sum = 0.0f;
    for (int i = 0; i < scorers.length; i++) {
        sum += scorers[i].score();
    }
    return sum * coord;
}

However, if I run the query with debugQuery=on to get the explanation, the explanation multiplies the score by the coord factor only once, so the explained score does not match the score in the result list.
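
To make the mismatch concrete, here is a small model of the two behaviours. It is a sketch under the assumption that both multiplications use the same coord value for the same number of matching clauses, as the post describes; DoubleCoordDemo, scoredOnce and scoredTwice are hypothetical names, and this is not the actual Lucene code path.

```java
// Illustrative model only: DoubleCoordDemo and its methods are hypothetical
// and do not reproduce Lucene's scorer classes.
final class DoubleCoordDemo {

    // The custom coord from the post, reduced to the overlap argument.
    static float coord(int overlap) {
        return overlap == 0 ? 0 : 1.0f / overlap;
    }

    static float sum(float[] subScores) {
        float total = 0.0f;
        for (float s : subScores) {
            total += s;
        }
        return total;
    }

    // What the explanation is reported to show: coord applied once.
    static float scoredOnce(float[] subScores) {
        return sum(subScores) * coord(subScores.length);
    }

    // What the result list is reported to show: coord applied twice.
    static float scoredTwice(float[] subScores) {
        float c = coord(subScores.length);
        return sum(subScores) * c * c;
    }

    public static void main(String[] args) {
        float[] subScores = {2.0f, 4.0f};
        System.out.println(scoredOnce(subScores));  // 3.0
        System.out.println(scoredTwice(subScores)); // 1.5
    }
}
```

With two matching clauses the coord is 0.5, so the doubly multiplied score is half of the intended mean.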

To me, multiplying the score by the coord factor twice looks like a bug. Can someone confirm that, or am I wrong?

Pascal



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


rcmuir at gmail

Aug 8, 2012, 6:43 AM

Post #2 of 2
Re: Coord issue

Hi Pascal!

Thanks for reporting this. I'm pretty positive it's a bug in
BooleanScorer2 (I think in the dualConjunctionSumScorer method). I
modified TestBoolean2.testRandomQueries to:
1. sometimes add phrase queries (so we don't get the optimized
ConjunctionTermScorer, but BooleanScorer2 for conjunctions)
2. sometimes use a Similarity whose coord returns overlap /
((float)maxOverlap - 1);

The problem is that this would never be caught with the default similarity:
because of its coord implementation, it would just multiply "1" into
the score twice.
But if you customize coord, then it's wrong.
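
A small sketch of why the default hides the problem, assuming a pure conjunction in which every clause matches (overlap == maxOverlap) and the default coord of overlap / maxOverlap. CoordMaskingDemo and doubleApplicationVisible are hypothetical names used only for this illustration:

```java
// Illustrative sketch only: CoordMaskingDemo is a hypothetical class,
// not part of Lucene's test suite.
final class CoordMaskingDemo {

    // Default-style coord: fraction of clauses that matched.
    static float defaultCoord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // The coord used in the modified test described above.
    // Note: maxOverlap == 1 would divide by zero, so the demo avoids it.
    static float testCoord(int overlap, int maxOverlap) {
        return overlap / ((float) maxOverlap - 1);
    }

    // A second multiplication changes the score only if coord * coord != coord.
    static boolean doubleApplicationVisible(float coord) {
        return coord * coord != coord;
    }

    public static void main(String[] args) {
        int clauses = 3; // conjunction: all three clauses match
        // Default coord is 1.0, and 1.0 * 1.0 == 1.0, so the bug stays hidden.
        System.out.println(doubleApplicationVisible(defaultCoord(clauses, clauses))); // false
        // Test coord is 1.5, and 1.5 * 1.5 == 2.25, so the bug shows up.
        System.out.println(doubleApplicationVisible(testCoord(clauses, clauses)));    // true
    }
}
```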

I've opened https://issues.apache.org/jira/browse/LUCENE-4297




--
lucidimagination.com
