
Mailing List Archive: Lucene: General

Modify Norm

 

 



sendtoprat at yahoo

Feb 11, 2009, 10:40 AM

Post #1 of 8 (1843 views)
Modify Norm

Hi,
How can I set my own norm value at indexing time? From a quick reading
of the code, I believe that norm values are written by the NormsWriter class,
which is called from the final class DocumentsWriter. Norm values are computed
by calling org.apache.lucene.search.Similarity#lengthNorm(String fieldName, int
numTokens). But I need to set the norm from a field value, which is a
float number. I will be using this norm in my retrieval model. I cannot use
document.setBoost(), as I need to boost the fields differently.

thanks
Pratyush
--
View this message in context: http://www.nabble.com/Modify-Norm-tp21959177p21959177.html
Sent from the Lucene - General mailing list archive at Nabble.com.


ragia11 at hotmail

Feb 12, 2009, 1:20 AM

Post #2 of 8 (1760 views)
RE: Modify Norm [In reply to]

http://www.arabtranslators.net/
Ragia


ragia11 at hotmail

Feb 12, 2009, 1:42 AM

Post #3 of 8 (1769 views)
RE: Modify Norm [In reply to]

Sorry, I selected the wrong message to reply to.
Ragia


lucene at mikemccandless

Feb 12, 2009, 2:31 AM

Post #4 of 8 (1759 views)
Re: Modify Norm [In reply to]

[You'd probably get more responses on java-user@ instead of general@]

Could you use per-field boost?

Or, since it sounds like you have a separate field with the boost you want,
maybe you could do this all at search time using a function query?

An advanced possibility is to make your own indexing chain, and use a
different NormsWriter, but that's an extremely big hammer to pull out for
this nail.

Mike
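
To make the per-field boost suggestion concrete: in Lucene releases of this era, the index-time norm for a field is, as far as the Similarity documentation describes it, the product of the document boost, the field boost, and lengthNorm(fieldName, numTokens). The sketch below is plain Java, not Lucene code; the class NormSketch and its norm() helper are hypothetical names used only to illustrate how a boost set per field could yield a different norm for each field of the same document.

```java
// Illustrative sketch only. Assumption: norm = docBoost * fieldBoost * lengthNorm,
// as described for Lucene 2.x-era scoring. None of these names are Lucene API.
final class NormSketch {

    // Mirrors the default length normalization: 1 / sqrt(number of tokens).
    static float lengthNorm(int numTokens) {
        return (float) (1.0 / Math.sqrt(numTokens));
    }

    // Hypothetical helper: combines the three factors into a single norm value.
    static float norm(float docBoost, float fieldBoost, int numTokens) {
        return docBoost * fieldBoost * lengthNorm(numTokens);
    }

    public static void main(String[] args) {
        // Same document (docBoost = 1), two four-token fields boosted differently.
        float titleNorm = norm(1.0f, 3.0f, 4);
        float bodyNorm = norm(1.0f, 0.5f, 4);
        System.out.println("title norm = " + titleNorm);
        System.out.println("body norm  = " + bodyNorm);
    }
}
```

In real indexing code the field boost would be set on the field itself (Field.setBoost(float) in this Lucene generation) before adding the document. Keep in mind that Lucene stores norms in a compact, lossy encoding, so an arbitrary float pushed in through the boost may not come back exactly at search time.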



sendtoprat at yahoo

Feb 12, 2009, 12:46 PM

Post #5 of 8 (1748 views)
Re: Modify Norm [In reply to]

I guess I could use
String[] s = ExtendedFieldCache.EXT_DEFAULT.getStrings(reader, fieldname);
It will consume memory, but that should not be too bad. In my index this field
holds a sortable string value that I could use in my model, and I only need to
do this for one field. Comments?

Pratyush
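
A minimal sketch of how such a cached array might be used, assuming the String[] is indexed by document ID and that documents without the field come back as null entries (both are assumptions to verify against your Lucene version). The class FieldCacheSketch and the toFloats() helper are hypothetical; the array in main() is a stand-in for the result of getStrings(reader, fieldname).

```java
// Illustrative sketch only: converts a per-document String[] (as a field cache
// might return it) into a float[] that can be looked up by document ID.
final class FieldCacheSketch {

    // Parses each cached value; null or unparseable entries get defaultValue.
    static float[] toFloats(String[] cached, float defaultValue) {
        float[] values = new float[cached.length];
        for (int doc = 0; doc < cached.length; doc++) {
            String s = cached[doc];
            try {
                values[doc] = (s == null) ? defaultValue : Float.parseFloat(s);
            } catch (NumberFormatException e) {
                values[doc] = defaultValue;
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Stand-in for the cached field values of four documents.
        String[] cached = {"0.8", null, "2.5", "n/a"};
        float[] values = toFloats(cached, 1.0f);
        for (int doc = 0; doc < values.length; doc++) {
            System.out.println("doc " + doc + " -> " + values[doc]);
        }
    }
}
```

Parsing once up front keeps the per-hit cost at scoring time down to an array lookup, at the price of one float per document in memory.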


--
View this message in context: http://www.nabble.com/Modify-Norm-tp21959177p21984666.html
Sent from the Lucene - General mailing list archive at Nabble.com.


lucene at mikemccandless

Feb 12, 2009, 4:14 PM

Post #6 of 8 (1753 views)
Re: Modify Norm [In reply to]

Can you simply sort by that string field? (Sorting also uses FieldCache
under the hood to get the values.)

Mike



sendtoprat at yahoo

Feb 12, 2009, 4:19 PM

Post #7 of 8 (1750 views)
Re: Modify Norm [In reply to]

Yes. Earlier we were only sorting. But sorting didn't always result in optimal
ranking, as it completely ignores the Lucene score. We do not want that. We
want a combination of different features, not necessarily a sort on one
field's values.

Pratyush



--
View this message in context: http://www.nabble.com/Modify-Norm-tp21959177p21987878.html
Sent from the Lucene - General mailing list archive at Nabble.com.


lucene at mikemccandless

Feb 17, 2009, 3:33 AM

Post #8 of 8 (1710 views)
Re: Modify Norm [In reply to]

It sounds like function queries would be a fit here -- they let you
customize how a document's score is computed, and e.g. you could blend
the "normal" Lucene relevance with your own contribution, per document.

Mike
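
The blending idea can be sketched in plain Java as follows. This is not Lucene code: the class BlendSketch, its methods, and the sample numbers are hypothetical, and the multiplicative blend is just one possible choice. In Lucene itself the org.apache.lucene.search.function package (for example CustomScoreQuery) is the place to look for the real mechanism; check its API against your version.

```java
import java.util.Arrays;

// Illustrative sketch only: ranks documents by a blend of the text relevance
// score and a per-document field value, instead of sorting by either alone.
final class BlendSketch {

    // One possible blend: multiply the relevance score by the field value.
    static float blend(float luceneScore, float fieldValue) {
        return luceneScore * fieldValue;
    }

    // Returns document IDs ordered by descending blended score.
    static Integer[] rank(float[] scores, float[] fieldValues) {
        Integer[] docs = new Integer[scores.length];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = i;
        }
        Arrays.sort(docs, (a, b) -> Float.compare(
                blend(scores[b], fieldValues[b]),
                blend(scores[a], fieldValues[a])));
        return docs;
    }

    public static void main(String[] args) {
        float[] scores = {0.9f, 0.4f, 0.6f};       // hypothetical relevance scores
        float[] fieldValues = {1.0f, 3.0f, 1.2f};  // hypothetical per-document values
        System.out.println(Arrays.toString(rank(scores, fieldValues)));
    }
}
```

With these sample numbers, ordering by score alone would give documents 0, 2, 1 and ordering by the field alone would give 1, 2, 0, while the blend gives 1, 0, 2, which is the kind of combined ranking a pure sort cannot produce.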

