Mailing List Archive: Lucene: Java-User
Re: two fields, the first important than the second
 



fancyerii at gmail

Apr 26, 2012, 8:17 PM



Sorry for the typos in my previous mail. The corrected queries are:
original query: +(title:hello desc:hello) +(title:world desc:world)
boosted one: +(title:hello^2 desc:hello) +(title:world^2 desc:world)
last one: +(title:hello desc:hello) +(title:world desc:world)
(+title:hello +title:world)^10 (+desc:hello +desc:world)^5

The example has two terms. If it has more terms, the query will become too
complicated.
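
One way to keep the longer queries manageable is to generate the query string from the list of terms instead of writing it by hand. The sketch below is a minimal illustration in plain Java, not Lucene code: `BoostedQueryBuilder` and its `build` method are hypothetical names, and it assumes the terms are already analyzed and contain no query-syntax special characters (nothing is escaped). The field names and the boosts 10 and 5 are taken from the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: it only assembles query-syntax text.
final class BoostedQueryBuilder {

    // Builds one required clause per term, "+(title:t desc:t)", followed by two
    // optional groups that add a boost when all terms occur in the same field.
    // Assumption: terms need no escaping.
    public static String build(List<String> terms, int titleBoost, int descBoost) {
        if (terms.isEmpty()) {
            throw new IllegalArgumentException("at least one term is required");
        }
        List<String> clauses = new ArrayList<>();
        for (String term : terms) {
            clauses.add("+(title:" + term + " desc:" + term + ")");
        }
        clauses.add(allInField("title", terms) + "^" + titleBoost);
        clauses.add(allInField("desc", terms) + "^" + descBoost);
        return String.join(" ", clauses);
    }

    // Builds "(+field:t1 +field:t2 ...)": every term is required in the given field.
    private static String allInField(String field, List<String> terms) {
        List<String> required = new ArrayList<>();
        for (String term : terms) {
            required.add("+" + field + ":" + term);
        }
        return "(" + String.join(" ", required) + ")";
    }

    public static void main(String[] args) {
        System.out.println(build(List.of("hello", "world"), 10, 5));
    }
}
```

For hello and world it prints the two-term query with desc:world in the second required clause, followed by the two boosted groups. The string grows linearly with the number of terms, so it could be handed to a query parser for longer queries as well.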

On Fri, Apr 27, 2012 at 11:12 AM, Li Li <fancyerii [at] gmail> wrote:

> You should describe your ranking strategy more precisely.
> If the query has 2 terms, "hello" and "world" for example, and your
> search fields are title and description, there are many possible
> combinations.
> Here is my understanding.
> Both terms should occur, each in either title or desc, so the
> query may be +(title:hello desc:hello) +(title:world desc:hello)
> The problem is that title needs more weight than desc, so maybe we
> rewrite it to
> +(title:hello^2 desc:hello) +(title:world^2 desc:hello)
> But consider these two scenarios:
> 1. hello hits only in title, world hits only in desc
> 2. hello and world both hit in desc
> Because title is boosted, 1 scores higher than 2.
> But we may think 2 is better than 1 because "hello world" is a phrase.
> We don't want to use a phrase query, though, because it is so strict that
> the recall can't meet our needs.
> Our solution is to modify Lucene so the boolean scorer can tell us which
> terms matched; then we use our own collector to boost scenario 2. This
> solution requires modifying Lucene (I have posted a mail, and you can patch
> your DisjunctionSumScorer with
> https://issues.apache.org/jira/browse/LUCENE-2686).
> Another solution I can come up with is to use a more complicated query:
> +(title:hello desc:hello) +(title:world desc:hello)
> (+title:hello +title:world)^10 (+desc:hello +desc:world)^5
> The must-occur conditions are the same as before, but if hello and world
> are both in title, we give the document a boost. Similarly, if hello and
> world are both in desc, we also boost it.
>
>
>
> On Fri, Apr 27, 2012 at 3:12 AM, Akos Tajti <akos.tajti [at] gmail> wrote:
>
>> Dear List,
>>
>> we've been struggling with the following problem for a while:
>> we have two fields: title and description. Title is generated from short
>> summaries while description is generated from long texts. We want to search
>> on both fields at the same time, but we'd like to get all documents in
>> which the title matches the search term before all others. For multi-term
>> queries we want to achieve the following: all documents that contain all
>> terms in their title must come before every other document, no matter how
>> many times the description matches the query. Is there a simple way to
>> achieve this?
>>
>> Thanks in advance,
>> Ákos Tajti
>>
>
>

