Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Lucene: Java-User

Lucene indexes and relationship

 

 

Lucene java-user RSS feed   Index | Next | Previous | View Threaded


mnrz57 at gmail

Sep 14, 2007, 10:15 PM

Post #1 of 2 (1015 views)
Permalink
Lucene indexes and relationship

Hello,
In our application, we have many categories (indexes) in which different
kind of information have been indexed. we provided a facility for our users
to opt their category to search and we also provided a way that they select
more than one category to search, afterwards, we must return back the result
from selected categories in which they are related to each other and discard
other documents which their IDs are not common with the results from other
categories, in other words, an intersection of all results

currently, each category has an ID field and this help us to find which
document to select and which one to discard, in order to implement this, we
defined a HitCollector in which, we collect all the Lucene's document IDs to
load the document later as well as all category ID to find its relationship
with other result. we store these IDs in a java.util.BitSet structure to
save the memory and also for performance issues.
now this works fine, but it's messy, further more, to find all common
documents we have to compare all the results two by two and this is really
exhaustive and eat up the CPU resources and this deprived us from the
Lucene's speed of searching.

In my opinion, if this simple relationship could be embedded inside the
Lucene Api at low level sections of loading documents, this process would be
done very fast, and I think this feature will make many Lucene's user happy

I want to know, whether this is possible for Lucene developers or not, or it
is in their TO DO list or they are going to provide it in future.

thank you all Lucene developers and producers.

--
Regards,
Mohammad Norouzi
--------------------------
see my blog: http://brainable.blogspot.com/
another in Persian: http://fekre-motefavet.blogspot.com/


gsingers at apache

Sep 15, 2007, 4:30 AM

Post #2 of 2 (941 views)
Permalink
Re: Lucene indexes and relationship [In reply to]

Sounds like faceting, I think. Have you looked at Solr?

-Grant

On Sep 15, 2007, at 1:15 AM, Mohammad Norouzi wrote:

> Hello,
> In our application, we have many categories (indexes) in which
> different
> kind of information have been indexed. we provided a facility for
> our users
> to opt their category to search and we also provided a way that
> they select
> more than one category to search, afterwards, we must return back
> the result
> from selected categories in which they are related to each other
> and discard
> other documents which their IDs are not common with the results
> from other
> categories, in other words, an intersection of all results
>
> currently, each category has an ID field and this help us to find
> which
> document to select and which one to discard, in order to implement
> this, we
> defined a HitCollector in which, we collect all the Lucene's
> document IDs to
> load the document later as well as all category ID to find its
> relationship
> with other result. we store these IDs in a java.util.BitSet
> structure to
> save the memory and also for performance issues.
> now this works fine, but it's messy, further more, to find all common
> documents we have to compare all the results two by two and this is
> really
> exhaustive and eat up the CPU resources and this deprived us from the
> Lucene's speed of searching.
>
> In my opinion, if this simple relationship could be embedded inside
> the
> Lucene Api at low level sections of loading documents, this process
> would be
> done very fast, and I think this feature will make many Lucene's
> user happy
>
> I want to know, whether this is possible for Lucene developers or
> not, or it
> is in their TO DO list or they are going to provide it in future.
>
> thank you all Lucene developers and producers.
>
> --
> Regards,
> Mohammad Norouzi
> --------------------------
> see my blog: http://brainable.blogspot.com/
> another in Persian: http://fekre-motefavet.blogspot.com/

--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene

Lucene java-user RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.