
Mailing List Archive: Lucene: Java-User

weightage of each word according to precedence in document

 

 



4azfriend at gmail

Jan 24, 2012, 9:08 AM

Post #1 of 7 (385 views)
Permalink
weightage of each word according to precedence in document

Hi



How can we assign a custom score to each token/word?



For example, I have these documents:



1 pqrst uvwx abcd

2 abcd pqrst uvwx

3 pqrst uvwx lmn

4 pqrst uvwx lmn abcd

5 pqrst abcd uvwx lmn



Now I am searching for: abcd pqrst

So it should give a higher score to the 2nd document than to the 1st document.



What I want is:

document 1: pqrst has more weight than uvwx, and uvwx has more weight than abcd

document 2: abcd has more weight than pqrst, and pqrst has more weight than uvwx


4azfriend at gmail

Jan 24, 2012, 9:12 AM

Post #2 of 7 (365 views)
Permalink
weightage of each word according to precedence in document [In reply to]

Thank you


ian.lea at gmail

Jan 25, 2012, 7:08 AM

Post #3 of 7 (356 views)
Permalink
Re: weightage of each word according to precedence in document [In reply to]

If you want particular search terms to be more important than others
you can use boosting. See
http://lucene.apache.org/java/3_5_0/queryparsersyntax.html#Boosting a Term

If you want the order of matched terms to matter, see PhraseQuery or
SpanQuery. The latter is more flexible. See
http://www.lucidimagination.com/blog/2009/07/18/the-spanquery/ for a
good writeup.

And you can of course use combinations of everything.


--
Ian.




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


4azfriend at gmail

Jan 27, 2012, 8:44 PM

Post #4 of 7 (355 views)
Permalink
Re: weightage of each word according to precedence in document [In reply to]

Hi Ian,

Thanks for your reply.

Even when I boost each term at search time (abcd with a boost factor
of 10 and pqrst with a boost factor of 5), I still get the same score
for all the documents.

Query: content:abcd^10.0 content:pqrst^5.0


title -> pqrst uvwx abcd ::: content -> pqrst uvwx abcd ::: Score -> 0.40883923

title -> abcd pqrst uvwx ::: content -> abcd pqrst uvwx ::: Score -> 0.40883923

title -> pqrst uvwx lmn ::: content -> pqrst uvwx lmn ::: Score -> 0.40883923

title -> pqrst uvwx lmn abcd ::: content -> pqrst uvwx lmn abcd ::: Score -> 0.40883923

title -> pqrst abcd uvwx lmn ::: content -> pqrst abcd uvwx lmn ::: Score -> 0.40883923

Thanks


ian.lea at gmail

Jan 30, 2012, 1:29 AM

Post #5 of 7 (352 views)
Permalink
Re: weightage of each word according to precedence in document [In reply to]

They all give exactly the same score, even the 3rd doc which doesn't
contain abcd at all? Surprising. What does searcher.explain() say?
Is this a simple search with default Similarity or is there stuff
you're not telling us?

--
Ian.




4azfriend at gmail

Feb 4, 2012, 2:11 AM

Post #6 of 7 (348 views)
Permalink
Re: weightage of each word according to precedence in document [In reply to]

Hi Ian,

Sorry for the late reply.

It is a simple search with the default Similarity only. It gives the
same score to every doc that contains both tokens (abcd and pqrst);
a doc in which abcd comes earlier gets no extra weight.

Here is the output with the score and searcher.explain() for each
document:


Query: content:abcd^10.0 content:pqrst^5.0

title -> pqrst uvwx abcd ::: content -> pqrst uvwx abcd ::: Score -> 0.6175326

Searcher.explain -> 0.6175326 = (MATCH) sum of:
  0.46281427 = (MATCH) weight(content:abcd^10.0 in 0), product of:
    0.92562854 = queryWeight(content:abcd^10.0), product of:
      10.0 = boost
      1.0 = idf(docFreq=4, maxDocs=5)
      0.092562854 = queryNorm
    0.5 = (MATCH) fieldWeight(content:abcd in 0), product of:
      1.0 = tf(termFreq(content:abcd)=1)
      1.0 = idf(docFreq=4, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=0)
  0.15471835 = (MATCH) weight(content:pqrst^5.0 in 0), product of:
    0.37843326 = queryWeight(content:pqrst^5.0), product of:
      5.0 = boost
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.092562854 = queryNorm
    0.40883923 = (MATCH) fieldWeight(content:pqrst in 0), product of:
      1.0 = tf(termFreq(content:pqrst)=1)
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=0)

title -> abcd pqrst uvwx ::: content -> abcd pqrst uvwx ::: Score -> 0.6175326

Searcher.explain -> 0.6175326 = (MATCH) sum of:
  0.46281427 = (MATCH) weight(content:abcd^10.0 in 1), product of:
    0.92562854 = queryWeight(content:abcd^10.0), product of:
      10.0 = boost
      1.0 = idf(docFreq=4, maxDocs=5)
      0.092562854 = queryNorm
    0.5 = (MATCH) fieldWeight(content:abcd in 1), product of:
      1.0 = tf(termFreq(content:abcd)=1)
      1.0 = idf(docFreq=4, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=1)
  0.15471835 = (MATCH) weight(content:pqrst^5.0 in 1), product of:
    0.37843326 = queryWeight(content:pqrst^5.0), product of:
      5.0 = boost
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.092562854 = queryNorm
    0.40883923 = (MATCH) fieldWeight(content:pqrst in 1), product of:
      1.0 = tf(termFreq(content:pqrst)=1)
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=1)

title -> pqrst uvwx lmn abcd ::: content -> pqrst uvwx lmn abcd ::: Score -> 0.6175326

Searcher.explain -> 0.6175326 = (MATCH) sum of:
  0.46281427 = (MATCH) weight(content:abcd^10.0 in 3), product of:
    0.92562854 = queryWeight(content:abcd^10.0), product of:
      10.0 = boost
      1.0 = idf(docFreq=4, maxDocs=5)
      0.092562854 = queryNorm
    0.5 = (MATCH) fieldWeight(content:abcd in 3), product of:
      1.0 = tf(termFreq(content:abcd)=1)
      1.0 = idf(docFreq=4, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=3)
  0.15471835 = (MATCH) weight(content:pqrst^5.0 in 3), product of:
    0.37843326 = queryWeight(content:pqrst^5.0), product of:
      5.0 = boost
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.092562854 = queryNorm
    0.40883923 = (MATCH) fieldWeight(content:pqrst in 3), product of:
      1.0 = tf(termFreq(content:pqrst)=1)
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=3)

title -> pqrst abcd uvwx lmn ::: content -> pqrst abcd uvwx lmn ::: Score -> 0.6175326

Searcher.explain -> 0.6175326 = (MATCH) sum of:
  0.46281427 = (MATCH) weight(content:abcd^10.0 in 4), product of:
    0.92562854 = queryWeight(content:abcd^10.0), product of:
      10.0 = boost
      1.0 = idf(docFreq=4, maxDocs=5)
      0.092562854 = queryNorm
    0.5 = (MATCH) fieldWeight(content:abcd in 4), product of:
      1.0 = tf(termFreq(content:abcd)=1)
      1.0 = idf(docFreq=4, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=4)
  0.15471835 = (MATCH) weight(content:pqrst^5.0 in 4), product of:
    0.37843326 = queryWeight(content:pqrst^5.0), product of:
      5.0 = boost
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.092562854 = queryNorm
    0.40883923 = (MATCH) fieldWeight(content:pqrst in 4), product of:
      1.0 = tf(termFreq(content:pqrst)=1)
      0.81767845 = idf(docFreq=5, maxDocs=5)
      0.5 = fieldNorm(field=content, doc=4)

title -> pqrst uvwx lmn ::: content -> pqrst uvwx lmn ::: Score -> 0.07735918

Searcher.explain -> 0.07735918 = (MATCH) product of:
  0.15471835 = (MATCH) sum of:
    0.15471835 = (MATCH) weight(content:pqrst^5.0 in 2), product of:
      0.37843326 = queryWeight(content:pqrst^5.0), product of:
        5.0 = boost
        0.81767845 = idf(docFreq=5, maxDocs=5)
        0.092562854 = queryNorm
      0.40883923 = (MATCH) fieldWeight(content:pqrst in 2), product of:
        1.0 = tf(termFreq(content:pqrst)=1)
        0.81767845 = idf(docFreq=5, maxDocs=5)
        0.5 = fieldNorm(field=content, doc=2)
  0.5 = coord(1/2)
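
For reference, the numbers in the explain output above can be reproduced
by hand. The sketch below is plain Java with no Lucene dependency. It
assumes the classic TF-IDF formulas that the labels in the output suggest:
tf = sqrt(termFreq), idf = 1 + ln(maxDocs / (docFreq + 1)), and
queryNorm = 1 / sqrt(sum of (boost * idf)^2). The class and method names
are made up for illustration. None of the factors depends on where a term
sits in the document, which is consistent with the identical scores.

```java
// Plain-Java re-computation of the explain() arithmetic; this is not Lucene code.
// The formulas are assumed from the labels in the explain output.
class ExplainArithmetic {

    // idf = 1 + ln(maxDocs / (docFreq + 1))
    static double idf(int docFreq, int maxDocs) {
        return 1.0 + Math.log((double) maxDocs / (docFreq + 1));
    }

    // tf = sqrt(termFreq)
    static double tf(int termFreq) {
        return Math.sqrt(termFreq);
    }

    // queryNorm = 1 / sqrt(sum over query terms of (boost * idf)^2)
    static double queryNorm(double[] boosts, double[] idfs) {
        double sumOfSquares = 0.0;
        for (int i = 0; i < boosts.length; i++) {
            double w = boosts[i] * idfs[i];
            sumOfSquares += w * w;
        }
        return 1.0 / Math.sqrt(sumOfSquares);
    }

    // One term's contribution: queryWeight * fieldWeight.
    static double termScore(double boost, double idf, double queryNorm,
                            int termFreq, double fieldNorm) {
        double queryWeight = boost * idf * queryNorm;
        double fieldWeight = tf(termFreq) * idf * fieldNorm;
        return queryWeight * fieldWeight;
    }

    // Score of "content:abcd^10.0 content:pqrst^5.0" for one document,
    // using the docFreq/maxDocs values shown in the explain output.
    static double score(boolean hasAbcd, boolean hasPqrst, double fieldNorm) {
        double idfAbcd = idf(4, 5);   // abcd occurs in 4 of the 5 docs
        double idfPqrst = idf(5, 5);  // pqrst occurs in all 5 docs
        double qn = queryNorm(new double[] {10.0, 5.0},
                              new double[] {idfAbcd, idfPqrst});
        double sum = 0.0;
        int matched = 0;
        if (hasAbcd) {
            sum += termScore(10.0, idfAbcd, qn, 1, fieldNorm);
            matched++;
        }
        if (hasPqrst) {
            sum += termScore(5.0, idfPqrst, qn, 1, fieldNorm);
            matched++;
        }
        double coord = matched / 2.0; // fraction of the 2 query terms matched
        return sum * coord;
    }

    public static void main(String[] args) {
        System.out.println("both terms: " + score(true, true, 0.5));
        System.out.println("pqrst only: " + score(false, true, 0.5));
    }
}
```

The two printed values should agree with 0.6175326 and 0.07735918 from
the output above, up to float rounding.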



ian.lea at gmail

Feb 6, 2012, 3:13 AM

Post #7 of 7 (326 views)
Permalink
Re: weightage of each word according to precedence in document [In reply to]

At least it doesn't give the same score for a doc which doesn't have
all the terms, which I think at one point you claimed.

So to try and simplify this, you've got one field called content and

doc1: pqrst uvwx abcd
doc2: abcd pqrst uvwx

and the query "content:abcd^10.0 content:pqrst^5.0" gives the same
score for doc1 and doc2. That is to be expected since both docs are
the same length and both contain both search terms.
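
On the length point: the explain output reports fieldNorm = 0.5 for the
four-term docs as well as the three-term ones. A likely reason, as far as
I recall, is that the default Similarity in Lucene 3.x computes a length
norm of 1/sqrt(numTerms) and stores it in a single byte, so nearby values
collapse. The sketch below is a from-memory re-implementation of that
one-byte encoding (3 mantissa bits, exponent zero point 15) in plain Java.
Treat it as an assumption about the library's behaviour, not as its code;
the class and method names are illustrative.

```java
// Sketch of a lossy one-byte float encoding (3 mantissa bits, zero exponent 15).
// ASSUMPTION: this mirrors how Lucene 3.x stores field norms; written from memory.
class NormQuantization {

    // Encode a float into a single byte, dropping most of the mantissa.
    static byte floatToByte315(float f) {
        int bits = Float.floatToRawIntBits(f);
        int smallfloat = bits >> (24 - 3);
        if (smallfloat <= ((63 - 15) << 3)) {
            // Too small to represent: 0 for non-positive, smallest code otherwise.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (smallfloat >= ((63 - 15) << 3) + 0x100) {
            return -1; // too large: saturate at the largest code
        }
        return (byte) (smallfloat - ((63 - 15) << 3));
    }

    // Decode the byte back into a float.
    static float byte315ToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Length norm 1/sqrt(numTerms) after a round trip through the byte encoding.
    static float storedNorm(int numTerms) {
        float lengthNorm = (float) (1.0 / Math.sqrt(numTerms));
        return byte315ToFloat(floatToByte315(lengthNorm));
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " terms -> stored norm " + storedNorm(n));
        }
    }
}
```

Under this encoding, docs of three and four terms both end up with a
stored norm of 0.5, so document length does not separate any of the five
docs here either.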

As I said before, if you want the order of matched terms to matter,
see PhraseQuery or SpanQuery.

Or store positional info in a Payload and factor that in somehow.
Powerful but complicated. See
http://www.lucidimagination.com/blog/2010/04/18/refresh-getting-started-with-payloads/
for an example.
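
To make "factor that in somehow" a bit more concrete, here is a toy
model in plain Java. It is not Lucene code and uses no Lucene API. It
assumes a made-up weighting in which a term at 0-based position p
contributes boost / (1 + p). In a real index a weight like this could be
written into a payload at index time and read back by a custom
Similarity. All names below are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Toy position-aware scoring; a hypothetical model, not Lucene's scoring.
class PositionAwareScore {

    // ASSUMPTION: earlier terms count more; weight decays as 1 / (1 + position).
    static double positionWeight(int position) {
        return 1.0 / (1 + position);
    }

    // Sum of boost * positionWeight over the query terms found in the doc.
    static double score(List<String> docTerms, List<String> queryTerms, double[] boosts) {
        double total = 0.0;
        for (int q = 0; q < queryTerms.size(); q++) {
            int pos = docTerms.indexOf(queryTerms.get(q)); // first occurrence, -1 if absent
            if (pos >= 0) {
                total += boosts[q] * positionWeight(pos);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("abcd", "pqrst");
        double[] boosts = {10.0, 5.0};
        List<String> doc1 = Arrays.asList("pqrst", "uvwx", "abcd");
        List<String> doc2 = Arrays.asList("abcd", "pqrst", "uvwx");
        System.out.println("doc1: " + score(doc1, query, boosts));
        System.out.println("doc2: " + score(doc2, query, boosts));
    }
}
```

With this weighting doc2 (abcd first) scores above doc1 (abcd third),
which is the ordering asked for in the original post.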

I can't think of another way to make, in your case, abcd score higher
if it is the first rather than the third term in the doc. I'd try a
SpanQuery with some reasonable slop value and add it as an optional
clause to your query, possibly with a boost.


--
Ian.


