
Mailing List Archive: Lucene: Java-User
Grouping Based on Multiple Fields Similarity
 



java.phi at phi-integration

May 21, 2012, 7:31 AM



Hi Everyone,

I'm quite new to Lucene and would like to ask whether the case below can be
solved with Lucene.

Let's say I have 200,000 rows from a relational table with multiple fields,
and I index them with Lucene. After indexing, I'd like to group / cluster
the rows based on similarity between four of five fields.

The end result would be something like this:

- Grouping 1, count : 3
  - row id = 1
  - row id = 23
  - row id = 100
- Grouping 2
  - row id = 1
  - row id = 23
  - ...
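
To make the intended result more concrete, here is a minimal plain-Java sketch of the kind of grouping I have in mind. It does not use Lucene at all. The class name `FieldSimilarityGrouping`, the exact-match similarity, the threshold and the sample rows are all hypothetical stand-ins; a real solution would presumably replace `similarity` with a proper text-similarity score.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: greedy grouping of rows by per-field similarity. */
class FieldSimilarityGrouping {

    /** Fraction of the compared fields whose values match (case-insensitive). */
    static double similarity(String[] a, String[] b, int[] fields) {
        int matches = 0;
        for (int f : fields) {
            if (a[f].equalsIgnoreCase(b[f])) {
                matches++;
            }
        }
        return (double) matches / fields.length;
    }

    /**
     * Puts each row into the first group whose representative (the group's
     * first row) is at least `threshold` similar; otherwise starts a new group.
     * Returns representative row id -> member row ids, in insertion order.
     */
    static Map<Integer, List<Integer>> group(Map<Integer, String[]> rows,
                                             int[] fields, double threshold) {
        Map<Integer, List<Integer>> groups = new LinkedHashMap<>();
        for (Map.Entry<Integer, String[]> row : rows.entrySet()) {
            Integer home = null;
            for (Integer rep : groups.keySet()) {
                if (similarity(rows.get(rep), row.getValue(), fields) >= threshold) {
                    home = rep;
                    break;
                }
            }
            if (home == null) {
                // No existing group is similar enough: this row starts a new one.
                home = row.getKey();
                groups.put(home, new ArrayList<>());
            }
            groups.get(home).add(row.getKey());
        }
        return groups;
    }

    public static void main(String[] args) {
        // Made-up sample rows with five fields each; only fields 0..3 are compared.
        Map<Integer, String[]> rows = new LinkedHashMap<>();
        rows.put(1, new String[] {"john", "smith", "jakarta", "12345", "a"});
        rows.put(2, new String[] {"mary", "jones", "bandung", "99999", "b"});
        rows.put(23, new String[] {"John", "Smith", "Jakarta", "12345", "c"});

        int n = 1;
        for (List<Integer> members : group(rows, new int[] {0, 1, 2, 3}, 1.0).values()) {
            System.out.println("- Grouping " + n++ + ", count : " + members.size());
            for (Integer id : members) {
                System.out.println("  - row id = " + id);
            }
        }
    }
}
```

In this sketch each row lands in exactly one group, and every row is compared against every group representative, so it is only meant to illustrate the output shape, not to scale to 200,000 rows.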

I have done some research, and the MoreLikeThis class can be used for this:

http://lucene.apache.org/core/old_versioned_docs/versions/3_0_0/api/contrib-queries/org/apache/lucene/search/similar/MoreLikeThis.html

I'm still learning how to use this class. If anyone can confirm the
approach and perhaps offer some guidance, it would be much appreciated.

Many thanks in advance.

Regards,

Robby

