
dameriangr at gmail
Feb 19, 2012, 10:46 AM
Implement a custom similarity
Hello,

I am really new to Lucene. Last week this list helped me find a solution to my problem, and now I have a new question. I am trying to implement a new similarity class that uses the Jaccard coefficient. I have been reading the javadocs and a lot of other web pages on the matter, but I still cannot understand how to do it.

So far I know that I have to subclass DefaultSimilarity and (if I am not wrong) override the built-in methods so that they return the correct score. Since the Jaccard coefficient is the size of the intersection of the query and document term sets divided by the size of their union, I think I only need coord(q,d), and all the other factors in the default similarity can return 1 in the score computation.

My problem is that I cannot find out how to obtain the number of terms that each document has. Also, do you think this approach is correct? I would be grateful if you could give me advice or point me towards a tutorial on the matter, because two days of searching have not turned up any example code.

Thank you in advance.
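For reference, the coefficient described above can be sketched on plain term sets, independent of Lucene. This is only an illustration of the formula, not a Similarity implementation: the class name `JaccardSketch`, the method `jaccard`, and the sample terms are made up for this example, and returning 0.0 for two empty sets is an arbitrary convention. It also shows why the denominator needs the document's distinct term count, which is the quantity the question asks how to obtain.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene: computes |Q n D| / |Q u D| on term sets.
public class JaccardSketch {

    public static double jaccard(Set<String> queryTerms, Set<String> docTerms) {
        // Convention chosen for this sketch: two empty sets score 0.0.
        if (queryTerms.isEmpty() && docTerms.isEmpty()) {
            return 0.0;
        }
        // Intersection: terms that appear in both the query and the document.
        Set<String> intersection = new HashSet<>(queryTerms);
        intersection.retainAll(docTerms);
        // |Q u D| = |Q| + |D| - |Q n D|, so the document's distinct term count is required.
        int unionSize = queryTerms.size() + docTerms.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        // Sample data invented for illustration.
        Set<String> query = Set.of("custom", "similarity", "lucene");
        Set<String> doc = Set.of("lucene", "similarity", "scoring", "tutorial");
        System.out.println(jaccard(query, doc));
    }
}
```

In the sample, the query and document share 2 terms out of 5 distinct terms overall, so the coefficient is 2/5.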