
Mailing List Archive: Lucene: Java-User

Too many boolean clauses

 

 



skonopinsky at blueprint

Sep 20, 2004, 9:27 AM

Post #1 of 7 (310 views)
Permalink
Too many boolean clauses

Hello There,

Due to the fact that the [# TO #] range search works lexicographically, I am
forced to build a rather large boolean query to get range data from my
index.

I have an ID field that contains about 500,000 unique ids. If I want to
query all records with ids [1-2000], I build a boolean query containing all
the numbers in the range, e.g. id:(1 2 3 ... 1999 2000)
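
A minimal sketch of why a lexicographic range misbehaves for unpadded numeric ids, using plain Java string comparison as a stand-in for how terms are ordered. The class and method names are hypothetical and nothing here calls Lucene.

```java
class LexRangeDemo {
    // True when the term sorts between lower and upper (inclusive)
    // under plain string comparison, not numeric comparison.
    static boolean inLexRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        // 3 is numerically inside 1..2000, but "3" sorts after "2000".
        System.out.println(inLexRange("3", "1", "2000"));     // prints false
        // 10000 is numerically outside 1..2000, but "10000" sorts before "2000".
        System.out.println(inLexRange("10000", "1", "2000")); // prints true
    }
}
```

So a range written as [1 TO 2000] over such terms would miss some wanted ids and pick up unwanted ones, which is what pushes the query towards one clause per id.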

The problem with this is that I get the following error:
org.apache.lucene.queryParser.ParseException: Too many boolean clauses

Any ideas on how I might circumvent this issue by either finding a way to
rewrite the query, or avoid the error?

Thanks in advance,
Shawn.


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe [at] jakarta
For additional commands, e-mail: lucene-user-help [at] jakarta


paul.elschot at xs4all

Sep 20, 2004, 9:50 AM

Post #2 of 7 (298 views)
Permalink
Re: Too many boolean clauses [In reply to]

On Monday 20 September 2004 18:27, Shawn Konopinsky wrote:
> Hello There,
>
> Due to the fact that the [# TO #] range search works lexicographically, I am
> forced to build a rather large boolean query to get range data from my
> index.
>
> I have an ID field that contains about 500,000 unique ids. If I want to
> query all records with ids [1-2000], I build a boolean query containing
> all the numbers in the range. eg. id:(1 2 3 ... 1999 2000)
>
> The problem with this is that I get the following error :
> org.apache.lucene.queryParser.ParseException: Too many boolean clauses
>
> Any ideas on how I might circumvent this issue by either finding a way to
> rewrite the query, or avoid the error?

You can use this as an example:

http://cvs.apache.org/viewcvs.cgi/jakarta-lucene/src/java/org/apache/lucene/search/DateFilter.java

(Just click view on the latest version to see the code).

and iterate over your doc ids instead of over dates.
This will give you a filter for the doc ids you want to query.
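
A rough sketch of the idea, assuming a toy in-memory map from id term to document numbers in place of the index's term lookup. The names IdRangeBits, bitsForRange and postings are hypothetical; a real filter would read the document numbers from the index reader instead.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

class IdRangeBits {
    // postings: id term -> document numbers containing that term
    // (a toy stand-in for looking the term up in the index).
    static BitSet bitsForRange(Map<String, int[]> postings, int from, int to, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int id = from; id <= to; id++) {
            int[] docs = postings.get(Integer.toString(id));
            if (docs == null) {
                continue; // no document carries this id
            }
            for (int doc : docs) {
                bits.set(doc); // mark the document as allowed
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        Map<String, int[]> postings = new HashMap<>();
        postings.put("1", new int[] {0});
        postings.put("2", new int[] {3});
        postings.put("5000", new int[] {1});
        System.out.println(bitsForRange(postings, 1, 2000, 4)); // prints {0, 3}
    }
}
```

The loop runs once per id in the range, but the result is a single bit set rather than one boolean clause per id, so the clause limit never comes into play.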

Regards,
Paul Elschot




skonopinsky at blueprint

Sep 20, 2004, 11:54 AM

Post #3 of 7 (296 views)
Permalink
RE: Too many boolean clauses [In reply to]

Hey Paul,

Thanks for the quick reply. Excuse my ignorance, but what do I do with the
generated BitSet?

Also - we are using a pooling feature which contains a pool of
IndexSearchers that are used and tossed back each time we need to search.
I'd hate to have to work around this and open up an IndexReader for this
particular search, where all other searches use the pool. Suggestions?

Thanks,
Shawn.



paul.elschot at xs4all

Sep 20, 2004, 12:05 PM

Post #4 of 7 (300 views)
Permalink
Re: Too many boolean clauses [In reply to]

On Monday 20 September 2004 20:54, Shawn Konopinsky wrote:
> Hey Paul,
>
> Thanks for the quick reply. Excuse my ignorance, but what do I do with the
> generated BitSet?

You can return it in the bits() method of the object implementing your
org.apache.lucene.search.Filter (http://jakarta.apache.org/lucene/docs/api/index.html).
Then pass the Filter to IndexSearcher.search() with the query.
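
Conceptually, the filter's bit set restricts the query's hits to documents whose bit is set. The following is only a toy model of that effect with java.util.BitSet, under the assumption that both sets are indexed by document number; the class and method names are hypothetical and this is not how the searcher is implemented internally.

```java
import java.util.BitSet;

class FilterApply {
    // Keep only the query hits whose document number is also set in the filter bits.
    static BitSet filteredHits(BitSet queryHits, BitSet filterBits) {
        BitSet result = (BitSet) queryHits.clone(); // leave the inputs untouched
        result.and(filterBits);
        return result;
    }

    public static void main(String[] args) {
        BitSet queryHits = new BitSet();
        queryHits.set(1);
        queryHits.set(3);
        queryHits.set(5);

        BitSet filterBits = new BitSet();
        filterBits.set(0);
        filterBits.set(3);
        filterBits.set(5);
        filterBits.set(7);

        System.out.println(filteredHits(queryHits, filterBits)); // prints {3, 5}
    }
}
```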

Regards,
Paul




paul.elschot at xs4all

Sep 20, 2004, 1:22 PM

Post #5 of 7 (301 views)
Permalink
Re: Too many boolean clauses [In reply to]

On Monday 20 September 2004 20:54, Shawn Konopinsky wrote:
> Hey Paul,
>
...
>
> Also - we are using a pooling feature which contains a pool of
> IndexSearchers that are used and tossed back each time we need to search.
> I'd hate to have to work around this and open up an IndexReader for this
> particular search, where all other searches use the pool. Suggestions?

You could use a map from the IndexSearcher back to the IndexReader that was
used to create it. (It's a bit of a waste because the IndexSearcher has a reader
attribute internally.)
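
One way such a map could look, as a sketch only: the pool registers each searcher together with the reader it was built from, and the filter code looks the reader up later. SearcherRegistry and its methods are hypothetical names, and the type parameters S and R stand in for the searcher and reader classes so the example runs without Lucene.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class SearcherRegistry<S, R> {
    // Keyed by object identity, on the assumption that pooled searchers
    // are distinct instances that do not define their own equals().
    private final Map<S, R> readerBySearcher = new IdentityHashMap<>();

    // Call this where the pool creates a searcher from a reader.
    synchronized void register(S searcher, R reader) {
        readerBySearcher.put(searcher, reader);
    }

    // Returns the reader the searcher was created from, or null if unknown.
    synchronized R readerFor(S searcher) {
        return readerBySearcher.get(searcher);
    }

    public static void main(String[] args) {
        SearcherRegistry<Object, String> registry = new SearcherRegistry<>();
        Object searcher = new Object();
        registry.register(searcher, "reader-1");
        System.out.println(registry.readerFor(searcher)); // prints reader-1
    }
}
```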

Regards,
Paul Elschot




skonopinsky at blueprint

Sep 20, 2004, 1:30 PM

Post #6 of 7 (297 views)
Permalink
RE: Too many boolean clauses [In reply to]

Sounds good. Thanks for all the help.

Shawn.



uwe at thetaphi

Feb 1, 2012, 12:26 AM

Post #7 of 7 (92 views)
Permalink
RE: too many boolean clauses [In reply to]

I would recommend using TermsFilter (http://goo.gl/BC9eQ, possibly wrapped
by a ConstantScoreQuery). You must build the query by hand; the query
*parser* cannot do that:

TermsFilter tf = new TermsFilter(); // it is in lucene-queries.jar
tf.addTerm(new Term("id", val1));
tf.addTerm(new Term("id", val2));
tf.addTerm(new Term("id", val3));
tf.addTerm(new Term("id", val4));
// if you need a query and don't want to use a Filter:
Query wrappedQ = new ConstantScoreQuery(tf);

You can execute the Filter as an add-on to your already prepared query:

searcher.search(queryParser.parse("content: (hello world)"), tf,...);

Or you can use the ConstantScoreQuery wrapper and combine it with the query:

BooleanQuery bq = new BooleanQuery();
bq.add(queryParser.parse("content: (hello world)"), BooleanClause.Occur.MUST);
bq.add(wrappedQ, BooleanClause.Occur.MUST);
searcher.search(bq,...);

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe [at] thetaphi

> -----Original Message-----
> From: Praveen Yarlagadda [mailto:praveen.yarlagadda [at] gmail]
> Sent: Wednesday, February 01, 2012 8:51 AM
> To: java-user [at] lucene
> Subject: too many boolean clauses
>
> Hi all,
>
> I have been using lucene with Hibernate to index the data. Each document is
> indexed with two fields: id and content. Each document corresponds to a record
> in the database. In my usecase, search needs to work like this:
>
> 1. Fetch records from the database based on some criteria
> 2. Search for the keywords only in the records found above
>
> I am preparing the search query like this: +(content: (hello world)) +(id:
> (234 235 899 534 345 898))
>
> If the number of documents (in the identifier field) reaches more than 1024,
> search fails with "too many boolean clauses". I can't use range query.
>
> Is there any other way to prepare the search query? How do I search for
> keywords in select documents?
>
> If you have any suggestions, please let me know.
>
> Thanks,
> Praveen


