
Mailing List Archive: Lucene: Java-User

WildCardQuery and TooManyClauses

 

 



chose77 at gmail

Apr 10, 2008, 5:53 AM

Post #1 of 10 (1383 views)
Permalink
WildCardQuery and TooManyClauses

Hello everybody,
I know a lot has been written about this issue, but I'm still not
clear about it.

These are my constraints:

1. My query is always one letter followed by *, e.g. M*
2. I always want at most 200 results, no more!
3. I don't want to fix this by raising maxClauseCount

I just don't see an easy way to get my results. Did I miss something?

From what I've read here, I should probably play with filters or
with WildcardTermEnum, but why?
I simply want the equivalent of this:
SELECT * FROM XXX WHERE XXX.name LIKE 'M%' LIMIT 200;

(there is no filtering in this query except the wildcard itself)

What is the easiest way to achieve this?

Thanks in advance,
Chose


gresh at us

Apr 10, 2008, 6:02 AM

Post #2 of 10 (1314 views)
Permalink
Re: WildCardQuery and TooManyClauses [In reply to]

Doesn't the following do what you want, with maxnumhits = 200?
TopDocs td = indexSearcher.search(query, filter, maxnumhits);
where filter can be null.



Donna L. Gresh
Services Research, Mathematical Sciences Department
IBM T.J. Watson Research Center
(914) 945-2472
http://www.research.ibm.com/people/g/donnagresh
gresh [at] us


chose77 at gmail

Apr 10, 2008, 6:24 AM

Post #3 of 10 (1301 views)
Permalink
Re: WildCardQuery and TooManyClauses [In reply to]

Hi Donna,
thanks for the reply!

I haven't tried it yet, but you are probably right that this should work for me.
The filter parameter, and the fact that TopDocs has no getter for
scoreDocs, confused me.

Thanks a lot,
Chose




chose77 at gmail

Apr 10, 2008, 8:45 AM

Post #4 of 10 (1289 views)
Permalink
Re: WildCardQuery and TooManyClauses [In reply to]

Donna,
this doesn't work: search internally calls MultiTermQuery.rewrite,
which throws the TooManyClauses exception anyway, even if maxnumhits
is set to 200!

So I am lost again...

Chose
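
A note on why the hit limit does not help here: the limit applies when documents are collected, but the wildcard is expanded into one clause per matching term earlier, during rewrite. The sketch below is a toy model in plain Java, not Lucene code. The class WildcardRewriteModel, its rewrite method, and the use of IllegalStateException as a stand-in for TooManyClauses are all hypothetical; the only value taken from Lucene is the default clause limit of 1024.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model, not Lucene code: mimics how a wildcard/prefix query is rewritten
// into one clause per matching term before any document is collected.
class WildcardRewriteModel {
    // Lucene's default BooleanQuery clause limit.
    static final int MAX_CLAUSE_COUNT = 1024;

    // Expand "prefix*" into one clause per indexed term that starts with prefix.
    static List<String> rewrite(SortedSet<String> terms, String prefix) {
        List<String> clauses = new ArrayList<>();
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // terms are sorted, so we are past the prefix range
            }
            if (clauses.size() >= MAX_CLAUSE_COUNT) {
                // Hypothetical stand-in for BooleanQuery.TooManyClauses.
                throw new IllegalStateException("too many clauses");
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>();
        for (int i = 0; i < 5000; i++) {
            terms.add("m" + i); // 5000 distinct terms starting with "m"
        }
        try {
            rewrite(terms, "m");
        } catch (IllegalStateException e) {
            // Reached no matter how few hits the caller asked for.
            System.out.println("rewrite failed: " + e.getMessage());
        }
    }
}
```

In this model the failure depends only on how many distinct terms match the pattern, never on the number of hits requested, which matches the behaviour described in the post above.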




Brian.Beard at mybir

Apr 14, 2008, 11:58 AM

Post #5 of 10 (1270 views)
Permalink
RE: WildCardQuery and TooManyClauses [In reply to]

You can use your approach with or without the filter argument:
> td = indexSearcher.search(query, filter, maxnumhits);

What you need is a filter for the wildcard that is built into the
query itself.

1) Extend QueryParser and override the getWildcardQuery method.
(If you don't use QueryParser, just use the query API and
combine the ConstantScoreQuery from #2 with your own query.)
2) Inside getWildcardQuery, return
new ConstantScoreQuery(new WildcardFilter(new Term(field, termStr)))
3) The first execution will take longer to initialize, but subsequent
searches are fairly fast.
4) Someone posted a WildcardFilter a while back; it is reproduced below.
5) Now you can plug the resulting query into the TopDocs search above.

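import java.io.IOException;
import java.util.BitSet;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.WildcardTermEnum;

public class WildcardFilter extends Filter {

    ...

    // Sets one bit per document that contains any term matching the wildcard.
    public BitSet bits(IndexReader reader) throws IOException {
        BitSet bits = new BitSet(reader.maxDoc());
        WildcardTermEnum enumerator = new WildcardTermEnum(reader, term);
        TermDocs termDocs = reader.termDocs();

        try {
            do {
                // Current matching term; null once the enumeration is exhausted.
                Term term = enumerator.term();

                if (term != null) {
                    termDocs.seek(term);

                    // Mark every document that contains this term.
                    while (termDocs.next()) {
                        bits.set(termDocs.doc());
                    }
                } else {
                    break;
                }
            } while (enumerator.next());
        } finally {
            termDocs.close();
            enumerator.close();
        }

        return bits;
    }
}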
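
To illustrate why the filter route avoids the clause limit, here is a minimal sketch in plain Java, not Lucene code. It assumes a sorted term dictionary that maps each term to the ids of the documents containing it; the class PrefixFilterModel and its methods bits and firstHits are hypothetical names. The point is that matching terms are folded into a single bit set, one bit per document, so no per-term clause is ever created, and the 200-hit cap is applied afterwards.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model, not Lucene code: a sorted term dictionary mapping term -> doc ids.
class PrefixFilterModel {
    // Mark every document containing any term that starts with prefix.
    static BitSet bits(SortedMap<String, int[]> postings, String prefix, int maxDoc) {
        BitSet result = new BitSet(maxDoc);
        for (Map.Entry<String, int[]> e : postings.tailMap(prefix).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // sorted keys: nothing further can match
            }
            for (int doc : e.getValue()) {
                result.set(doc);
            }
        }
        return result;
    }

    // Return at most limit doc ids in doc-id order (all matches score equally).
    static List<Integer> firstHits(BitSet matches, int limit) {
        List<Integer> hits = new ArrayList<>();
        for (int doc = matches.nextSetBit(0);
             doc >= 0 && hits.size() < limit;
             doc = matches.nextSetBit(doc + 1)) {
            hits.add(doc);
        }
        return hits;
    }

    public static void main(String[] args) {
        SortedMap<String, int[]> postings = new TreeMap<>();
        for (int i = 0; i < 5000; i++) {
            postings.put("m" + i, new int[] {i}); // one document per term
        }
        postings.put("n0", new int[] {5000});     // does not match "m*"
        BitSet matches = bits(postings, "m", 5001);
        System.out.println(matches.cardinality());          // 5000
        System.out.println(firstHits(matches, 200).size()); // 200
    }
}
```

The trade-off in this model is the same one the post mentions: every matching term is still visited once, so the first pass costs time proportional to the number of matching postings, but memory stays at one bit per document.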



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


chose77 at gmail

Apr 18, 2008, 9:25 AM

Post #6 of 10 (1223 views)
Permalink
Re: WildCardQuery and TooManyClauses [In reply to]

Thank you very much Brian,
this works for me!

Chose




injecteer at yahoo

Sep 18, 2008, 8:35 AM

Post #7 of 10 (863 views)
Permalink
RE: WildCardQuery and TooManyClauses [In reply to]

Beard, Brian wrote:
>
> 1) Extend QueryParser to override the getWildcardQuery method.
>

Kinda late :), but I still have another question:

Who calls that getWildcardQuery() method?

I subclassed QueryParser, but that method never gets invoked, even
if the query contains *.

Should I override some other method? Or should I call the method directly?

Thanx


lucene at mikemccandless

Sep 18, 2008, 8:45 AM

Post #8 of 10 (882 views)
Permalink
Re: WildCardQuery and TooManyClauses [In reply to]

It's only with the trunk version of Lucene that QueryParser calls
getWildcardQuery on parsing a wildcard string from the user's query.

Mike



injecteer at yahoo

Sep 18, 2008, 10:31 AM

Post #9 of 10 (879 views)
Permalink
Re: WildCardQuery and TooManyClauses [In reply to]

Michael McCandless-2 wrote:
>
>
> It's only with the trunk version of Lucene that QueryParser calls
> getWildcardQuery on parsing a wildcard string from the user's query.
>
I see..

So, how can I plug the WildcardFilter in to prevent TooManyClauses? Are
there other ways than using the trunk?


injecteer at yahoo

Sep 19, 2008, 3:50 AM

Post #10 of 10 (872 views)
Permalink
Re: WildCardQuery and TooManyClauses [In reply to]

Konstantyn Smirnov wrote:
>
> So, how can I plug the WildcardFilter in to prevent TooManyClauses? Are
> there other ways than using the trunk?
>

I ended up also overriding the QueryParser.getPrefixQuery() method, returning
a ConstantScoreQuery built from a PrefixFilter. The TooManyClauses exception
is gone, but is this the right way?
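
One possible reason overriding getPrefixQuery() made the difference: as far as I know, QueryParser treats a term whose only wildcard is a single trailing * (such as M*) as a prefix term and routes it to getPrefixQuery(), while terms with ? or an inner * go to getWildcardQuery(). The sketch below models that dispatch in plain Java. It is a simplification under that assumption, not the parser's actual grammar, and TermKindModel, Kind, and classify are hypothetical names.

```java
// Toy model, not Lucene code: a simplified guess at how a query term is
// dispatched, assuming a lone trailing '*' means "prefix".
class TermKindModel {
    enum Kind { PLAIN, PREFIX, WILDCARD }

    static Kind classify(String term) {
        // A trailing '*' after at least one other character.
        boolean trailingStar = term.length() > 1 && term.endsWith("*");
        String body = trailingStar ? term.substring(0, term.length() - 1) : term;
        // Any remaining wildcard character makes it a general wildcard term.
        if (body.indexOf('*') >= 0 || body.indexOf('?') >= 0) {
            return Kind.WILDCARD;
        }
        return trailingStar ? Kind.PREFIX : Kind.PLAIN;
    }

    public static void main(String[] args) {
        System.out.println(classify("M*"));   // PREFIX
        System.out.println(classify("M?ke")); // WILDCARD
        System.out.println(classify("M*e"));  // WILDCARD
        System.out.println(classify("Mike")); // PLAIN
    }
}
```

Under that assumption, a parser subclass that must never hit the clause limit would need to override both hooks, since one-letter queries like M* would only ever reach the prefix one.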