fancyerii at gmail
Apr 26, 2012, 8:17 PM
Re: two fields, the first important than the second
Sorry for some typos.
Original query: +(title:hello desc:hello) +(title:world desc:world)
Boosted one: +(title:hello^2 desc:hello) +(title:world^2 desc:world)
Last one: +(title:hello desc:hello) +(title:world desc:world)
(+title:hello +title:world)^10 (+desc:hello +desc:world)^5
the example has two terms. if it has more terms, the query will become too
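
Because a query of this shape is tedious to write by hand once there are more terms, it can be generated. Below is a minimal Python sketch that only assembles the query string in the form shown above; it does not call Lucene. The name `build_query`, its parameter, and the default boosts of 10 and 5 (copied from the example) are illustrative assumptions. Terms are assumed to be single, already analyzed tokens with no query-syntax characters that would need escaping.

```python
def build_query(terms, field_boosts=(("title", 10), ("desc", 5))):
    """Assemble a Lucene-style query string (hypothetical helper).

    field_boosts: (field name, boost for "all terms in this field") pairs.
    """
    fields = [name for name, _ in field_boosts]

    # One required clause per term: the term must occur in at least one field.
    required = [
        "+(" + " ".join(f"{field}:{term}" for field in fields) + ")"
        for term in terms
    ]

    # One optional boosted clause per field: it only matches (and so only
    # adds score) when every term occurs in that same field.
    optional = [
        "(" + " ".join(f"+{field}:{term}" for term in terms) + f")^{boost}"
        for field, boost in field_boosts
    ]

    return " ".join(required + optional)


print(build_query(["hello", "world"]))
# +(title:hello desc:hello) +(title:world desc:world) (+title:hello +title:world)^10 (+desc:hello +desc:world)^5
```

The number of required clauses grows with the number of terms, while there is always exactly one boosted clause per field.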
On Fri, Apr 27, 2012 at 11:12 AM, Li Li <fancyerii [at] gmail> wrote:
> You should describe your ranking strategy more precisely.
> Suppose the query has two terms, "hello" and "world", and your
> search fields are title and description. There are many possible
> Here is my understanding.
> Both terms should occur in title or desc, so the
> query may be +(title:hello desc:hello) +(title:world desc:world)
> The problem is that title needs to weigh more than desc, so we might
> rewrite it to
> +(title:hello^2 desc:hello) +(title:world^2 desc:world)
> But consider these two scenarios:
> 1. hello hits only in title, world hits only in desc
> 2. hello and world both hit in desc
> Because title is boosted, 1 scores higher than 2.
> But we may think 2 is better than 1 because "hello world" is a phrase.
> We don't want to use a phrase query, though, because it is so strict that
> recall can't meet our needs.
> Our solution is to modify Lucene so that the boolean scorer can tell us which
> terms matched. Then we use our own collector to boost scenario 1. This
> solution requires modifying Lucene (I have posted a mail and you can patch your
> DisjunctionSumScorer with
> Another solution I can come up with is to use a more complicated query:
> +(title:hello desc:hello) +(title:world desc:world)
> (+title:hello +title:world)^10 (+desc:hello +desc:world)^5
> The required (must-occur) condition is the same as before, but if hello and
> world are both in title, we give the document a boost. Similarly, if hello and
> world are both in desc, we also boost it.
> On Fri, Apr 27, 2012 at 3:12 AM, Akos Tajti <akos.tajti [at] gmail> wrote:
>> Dear List,
>> we've been struggling with the following problem for a while:
>> we have two fields: title and description. Title is generated from short
>> summaries while description is generated from long texts. We want to search
>> on both fields at the same time, but we'd like to get all documents in which
>> the title matches the search term before all others. For multi-term queries
>> we want to achieve the following: all documents that contain all terms in
>> their title must come before every other document, no matter how many
>> the description matches the query. Is there a simple way to achieve this?
>> Thanks in advance,
>> Ákos Tajti
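
The requirement in the original question is a strict two-tier ordering rather than a weighted one: documents whose title contains every term come first, whatever the other documents score. As a rough illustration of that ordering only, here is a small Python sketch that sorts an in-memory list. It is not a Lucene API. The `rank` function, the `title` and `score` keys, and the whitespace tokenization standing in for a real analyzer are all assumptions made for the example.

```python
def rank(docs, terms):
    """Order docs so that 'all terms in title' always beats everything else."""

    def all_in_title(doc):
        # Naive stand-in for analysis: lowercase and split on whitespace.
        title_tokens = set(doc["title"].lower().split())
        return all(term in title_tokens for term in terms)

    # Sort key: tier first (False sorts before True, so full title matches
    # lead), then the relevance score in descending order within each tier.
    return sorted(docs, key=lambda d: (not all_in_title(d), -d["score"]))


docs = [
    {"id": 1, "title": "hello there", "score": 9.0},
    {"id": 2, "title": "Hello World news", "score": 1.0},
    {"id": 3, "title": "world hello", "score": 2.0},
]
print([d["id"] for d in rank(docs, ["hello", "world"])])
# [3, 2, 1]
```

Document 1 has the highest score but still comes last, because only one of the two terms appears in its title. A boost-based query like the ones discussed in this thread favors that ordering but does not strictly guarantee it.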