
Mailing List Archive: Lucene: General

question about wildcard like search

 

 



chadmichaeldavis at gmail

Oct 16, 2008, 11:55 AM

Post #1 of 2 (139 views)
question about wildcard like search

I need to do a query where I'm looking for strings that are embedded in a
single word in one of the fields. In other words, a field may have a phrase
like:

Bob,Tom,Kevin,Jeff

or

Tom,Doug,Steven,Bob


I would like to be able to use the wildcard query to search for any document
that has the name "Tom" embedded, in any fashion, in this field.

I would like to build a WildcardQuery like "*Tom*", but it doesn't
accept * as the first character, for performance reasons that the
documentation explains.

So, how do I do such a query? I'm looking into FuzzyQuery right now.


john.byrne at therogueprocess

Oct 17, 2008, 12:42 AM

Post #2 of 2 (126 views)
Re: question about wildcard like search

Hi,

I think you'll normally get quicker answers for this type of question on
the main Lucene users mailing list: java-user[at]lucene.apache.org

Anyway, you just need to call the QueryParser method
'setAllowLeadingWildcard(true)' to allow leading wildcards.

However, once you do that, your leading wildcard queries will probably
expand into many terms, so you should also call
'BooleanQuery.setMaxClauseCount' with a large number. You could use
Integer.MAX_VALUE, but as far as I know this can cause a problem if you
use FuzzyQuery. The number of terms in your index is the largest value
you'll need, but presumably that can grow.

I would just set it to 1 million or something like that. You're probably
never going to have a million terms.

-John
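
To illustrate why a leading wildcard is expensive and why the clause limit
matters, here is a minimal sketch in plain Java. It does not use Lucene at
all: the sorted set stands in for the index's term dictionary, and the class
name, the 'expand' method and the 'maxClauses' parameter are made up for
this example. It only assumes that a wildcard query is expanded into one
clause per matching term, as described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical stand-in for wildcard term expansion; not Lucene code.
public class WildcardExpansionSketch {

    // Translate a wildcard pattern ('*' = any run of chars, '?' = one char) into a regex.
    public static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    // Index of the first wildcard character, or the pattern length if there is none.
    static int firstWildcardIndex(String wildcard) {
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '*' || c == '?') {
                return i;
            }
        }
        return wildcard.length();
    }

    // Collect every term matching the pattern; fail once more than maxClauses terms match.
    public static List<String> expand(SortedSet<String> terms, String wildcard, int maxClauses) {
        String prefix = wildcard.substring(0, firstWildcardIndex(wildcard));
        // With a literal prefix we can jump into the sorted terms and stop early.
        // With a leading wildcard the prefix is empty, so every term is visited.
        SortedSet<String> candidates = prefix.isEmpty() ? terms : terms.tailSet(prefix);
        Pattern regex = toRegex(wildcard);
        List<String> matched = new ArrayList<>();
        for (String term : candidates) {
            if (!term.startsWith(prefix)) {
                break; // walked past the range of terms sharing the prefix
            }
            if (regex.matcher(term).matches()) {
                if (matched.size() == maxClauses) {
                    throw new IllegalStateException("more than " + maxClauses + " matching terms");
                }
                matched.add(term);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Each field value is treated as a single term, as in the question.
        SortedSet<String> terms = new TreeSet<>(Arrays.asList(
                "Bob,Tom,Kevin,Jeff", "Tom,Doug,Steven,Bob", "Alice,Carol"));
        System.out.println(expand(terms, "*Tom*", 1024)); // full scan of all terms
        System.out.println(expand(terms, "Tom*", 1024));  // seeks straight to the "Tom" prefix
    }
}
```

With the two sample values from the question, "*Tom*" matches both terms but
has to look at every term to find them, while "Tom*" only visits terms that
start with "Tom". Calling expand(terms, "*Tom*", 1) throws, which is roughly
the situation a higher clause limit is meant to avoid.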

