
Mailing List Archive: Lucene: Java-User
RE: Analyzer on query question
 


Bill.Chesky at learninga-z

Aug 3, 2012, 9:57 AM



Ian,

I gave this method a try, at least the way I understood your suggestion. E.g. to search for the phrase "cells combine" I built up a string like:

title:"cells combine" description:"cells combine" text:"cells combine"

then I passed that to the queryParser.parse() method (where queryParser is an instance of QueryParser constructed using SnowballAnalyzer) and added the result as a MUST clause in my final BooleanQuery.
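The string-building step described above can be sketched in plain Java. This is only an illustration: the class and method names (`MultiFieldPhrase`, `build`, `escapePhrase`) are hypothetical, and the sketch assumes that backslash and double quote are the only characters that need escaping inside a quoted phrase in the classic query syntax. The resulting string would then be handed to `queryParser.parse()` as in the message.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: builds a multi-field phrase query string for QueryParser. */
final class MultiFieldPhrase {

    /** Escapes backslash and double quote so the phrase stays inside its quotes. */
    static String escapePhrase(String phrase) {
        return phrase.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Produces one quoted phrase clause per field, separated by spaces. */
    static String build(List<String> fields, String phrase) {
        String quoted = "\"" + escapePhrase(phrase) + "\"";
        StringBuilder sb = new StringBuilder();
        for (String field : fields) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(field).append(':').append(quoted);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints: title:"cells combine" description:"cells combine" text:"cells combine"
        System.out.println(build(Arrays.asList("title", "description", "text"), "cells combine"));
    }
}
```

Because the phrase is still unanalyzed at this point, stemming happens only when the parser (constructed with the SnowballAnalyzer) processes the string.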

When I print the resulting query out as a string I get:

+(title:"cell combin" description:"cell combin" keywords:"cell combin")

So it looks like the SnowballAnalyzer is doing some stemming for me. But this is the exact same result I'd get doing it the way I described in my original email. I just built the unanalyzed string on my own rather than using the various query classes like PhraseQuery, etc.

So I don't see the advantage to doing it this way over the original method. I just don't know if the original way I described is wrong or will give me bad results.

thanks for the help,

Bill

-----Original Message-----
From: Ian Lea [mailto:ian.lea [at] gmail]
Sent: Friday, August 03, 2012 9:32 AM
To: java-user [at] lucene
Subject: Re: Analyzer on query question

You can add parsed queries to a BooleanQuery. Would that help in this case?

SnowballAnalyzer sba = whatever();
QueryParser qp = new QueryParser(..., sba);
Query q1 = qp.parse("some snowball string");
Query q2 = qp.parse("some other snowball string");

BooleanQuery bq = new BooleanQuery();
bq.add(q1, ...);
bq.add(q2, ...);
bq.add(loads of other stuff);


--
ian.


On Fri, Aug 3, 2012 at 2:19 PM, Bill Chesky <Bill.Chesky [at] learninga-z> wrote:
> Thanks Simon,
>
> Unfortunately, I'm using Lucene 3.0.1 and CharTermAttribute doesn't seem to have been introduced until 3.1.0. Similarly my version of Lucene does not have a BooleanQuery.addClause(BooleanClause) method. Maybe you meant BooleanQuery.add(BooleanClause).
>
> In any case, most of what you're doing there, I'm just not familiar with. Seems very low level. I've never had to use TokenStreams to build a query before and I'm not really sure what is going on there. Also, I don't know what PositionIncrementAttribute is or how it would be used to create a PhraseQuery. The way I'm currently creating PhraseQuerys is very straightforward and intuitive. E.g. to search for the term "foo bar" I'd build the query like this:
>
> PhraseQuery phraseQuery = new PhraseQuery();
> phraseQuery.add(new Term("title", "foo"));
> phraseQuery.add(new Term("title", "bar"));
>
> Is there really no easier way to associate the correct analyzer with these types of queries?
>
> Bill
>
> -----Original Message-----
> From: Simon Willnauer [mailto:simon.willnauer [at] gmail]
> Sent: Friday, August 03, 2012 3:43 AM
> To: java-user [at] lucene; Bill Chesky
> Subject: Re: Analyzer on query question
>
> On Thu, Aug 2, 2012 at 11:09 PM, Bill Chesky
> <Bill.Chesky [at] learninga-z> wrote:
>> Hi,
>>
>> I understand that generally speaking you should use the same analyzer on querying as was used on indexing. In my code I am using the SnowballAnalyzer on index creation. However, on the query side I am building up a complex BooleanQuery from other BooleanQuerys and/or PhraseQuerys on several fields. None of these require specifying an analyzer anywhere. This is causing some odd results, I think, because a different analyzer (or no analyzer?) is being used for the query.
>>
>> Question: how do I build my boolean and phrase queries using the SnowballAnalyzer?
>>
>> One thing I did that seemed to kind of work was to build my complex query normally then build a snowball-analyzed query using a QueryParser instantiated with a SnowballAnalyzer. To do this, I simply pass the string value of the complex query to the QueryParser.parse() method to get the new query. Something like this:
>>
>> // build a complex query from other BooleanQuerys and PhraseQuerys
>> BooleanQuery fullQuery = buildComplexQuery();
>> QueryParser parser = new QueryParser(Version.LUCENE_30, "title", new SnowballAnalyzer(Version.LUCENE_30, "English"));
>> Query snowballAnalyzedQuery = parser.parse(fullQuery.toString());
>>
>> TopScoreDocCollector collector = TopScoreDocCollector.create(10000, true);
>> indexSearcher.search(snowballAnalyzedQuery, collector);
>
> you can just use the analyzer directly like this:
> Analyzer analyzer = new SnowballAnalyzer(Version.LUCENE_30, "English");
>
> TokenStream stream = analyzer.tokenStream("title", new StringReader(fullQuery.toString()));
> CharTermAttribute termAttr = stream.addAttribute(CharTermAttribute.class);
> stream.reset();
> BooleanQuery q = new BooleanQuery();
> while(stream.incrementToken()) {
> q.addClause(new BooleanClause(new TermQuery(new Term("title", termAttr.toString())), Occur.MUST));
> }
>
> you also have access to the token positions if you want to create
> phrase queries etc. just add a PositionIncrementAttribute like this:
> PositionIncrementAttribute posAttr = stream.addAttribute(PositionIncrementAttribute.class);
>
> Please double-check the code; it's straight off the top of my head.
>
> simon
>
>>
>> Like I said, this seems to kind of work but it doesn't feel right. Does this make sense? Is there a better way?
>>
>> thanks in advance,
>>
>> Bill
>
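On the question of how a position increment relates to a PhraseQuery: each token reports how far it sits from the previous token, and a running sum turns those increments into the absolute positions that `PhraseQuery.add(Term, int)` accepts. In Lucene 3.0.x the per-token values would come from `TermAttribute.term()` and `PositionIncrementAttribute.getPositionIncrement()`. The sketch below shows only that arithmetic in plain Java; the `Token` class and the token values are hypothetical stand-ins for real analyzer output, imagined here as if an analyzer had dropped a stop word between the two terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: turns position increments into absolute phrase positions. */
final class PhrasePositions {

    /** Stand-in for one analyzed token: its text and its position increment. */
    static final class Token {
        final String text;
        final int positionIncrement;

        Token(String text, int positionIncrement) {
            this.text = text;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Running sum of increments; a first token with increment 1 lands at position 0. */
    static List<Integer> absolutePositions(List<Token> tokens) {
        List<Integer> positions = new ArrayList<Integer>();
        int position = -1;
        for (Token token : tokens) {
            position += token.positionIncrement;
            positions.add(position);
        }
        return positions;
    }

    public static void main(String[] args) {
        // Assumed output for "cells that combine" if "that" were removed as a stop word.
        List<Token> tokens = Arrays.asList(new Token("cell", 1), new Token("combin", 2));
        // Prints: [0, 2]
        System.out.println(absolutePositions(tokens));
    }
}
```

With real tokens, each text and position pair would be passed to `phraseQuery.add(new Term(field, text), position)`, so a gap left by a removed word is preserved in the phrase.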

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene




