
Mailing List Archive: Lucene: Java-User

Unexpected results searching for phrase with stop words

 

 



simon at thegestalt

Nov 12, 2009, 12:00 PM

Post #1 of 4
Unexpected results searching for phrase with stop words

I have a document with the title "Here, there be dragons" and a body.

When I search for

Here, there be dragons
(no quotes)

with a title boost of 2.0 and a body boost of 0.8

I get the document as the first hit which is what I'd expect.

However, if I change the query to

"Here, there be dragons"
(with quotes)

then I don't get the document at all. Which is not what I'd expect.

I've tried modifying the phrase slop but still don't get any results
back.

Am I doing something wrong? I suspect it's something to do with the
number of stop words in the query. Do I have to have an untokenized copy
of the title field lying around to search on?

Thanks,

Simon


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


erickerickson at gmail

Nov 12, 2009, 12:15 PM

Post #2 of 4
Re: Unexpected results searching for phrase with stop words [In reply to]

Yes, you're doing something wrong <G>. What, you may ask?
Well, it's kind of hard to say without knowing what analyzers
you use at index AND query time and what the query you're
submitting looks like...

But the very first thing I'd try is to get a copy of Luke and peek at your
index to see if what you *think* is in there actually is.

Then, I'd run some queries through Luke (you can choose various
analyzers) and see what the query looks like after it's parsed.

In parallel, try query.toString() to see what the parsed query looks like;
it might surprise you.

Best
Erick




uwe at thetaphi

Nov 12, 2009, 12:20 PM

Post #3 of 4
RE: Unexpected results searching for phrase with stop words [In reply to]

Which version of Lucene are you using, and which Version constant do you pass
to the Analyzer and the Query Parser? In 2.9.0 there was a bug (an incorrect
setting) between the query parser and the Version.LUCENE_CURRENT /
Version.LUCENE_29 setting: if you did not enable position increments in the
query parser, that setting wouldn't work. This is fixed in 2.9.1, provided you
pass the same Version constant to both the analyzer and the query parser.

Also, the analyzers on the indexing and query sides must be identical or
compatible.
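
To illustrate why a position-increment mismatch can make a quoted phrase miss,
here is a minimal sketch in plain Java. It does not use Lucene at all: the
class PhrasePositionsSketch, its methods, and the two-word stop list are
hypothetical names made up for this example, and the matching is a simplified
exact-phrase check (slop 0), not Lucene's actual PhraseQuery implementation.
The idea is only that if the index side leaves a gap where a stop word was
removed and the query side does not (or the other way around), the term
positions no longer line up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class PhrasePositionsSketch {

    // Assumption: a tiny stop list chosen for this example only.
    public static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("there", "be"));

    /** A token that survived stop word removal, with its assigned position. */
    public static final class Token {
        public final String term;
        public final int position;

        Token(String term, int position) {
            this.term = term;
            this.position = position;
        }
    }

    /**
     * Lowercases, splits on non-alphanumerics and drops stop words.
     * If keepGaps is true, a removed stop word still consumes a position
     * (this models "position increments enabled").
     */
    public static List<Token> analyze(String text, boolean keepGaps) {
        List<Token> tokens = new ArrayList<>();
        int position = 0;
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (raw.isEmpty()) {
                continue;
            }
            if (STOP_WORDS.contains(raw)) {
                if (keepGaps) {
                    position++; // leave a hole where the stop word was
                }
                continue;
            }
            tokens.add(new Token(raw, position));
            position++;
        }
        return tokens;
    }

    /** Exact phrase check: query terms must occur at the same relative positions. */
    public static boolean phraseMatches(List<Token> doc, List<Token> query) {
        if (query.isEmpty()) {
            return false;
        }
        Token first = query.get(0);
        for (Token start : doc) {
            if (!start.term.equals(first.term)) {
                continue;
            }
            int offset = start.position - first.position;
            boolean all = true;
            for (Token q : query) {
                if (!contains(doc, q.term, q.position + offset)) {
                    all = false;
                    break;
                }
            }
            if (all) {
                return true;
            }
        }
        return false;
    }

    private static boolean contains(List<Token> doc, String term, int position) {
        for (Token t : doc) {
            if (t.position == position && t.term.equals(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String title = "Here, there be dragons";
        List<Token> indexed = analyze(title, true); // index side keeps gaps
        System.out.println("query without gaps: "
                + phraseMatches(indexed, analyze(title, false)));
        System.out.println("query with gaps:    "
                + phraseMatches(indexed, analyze(title, true)));
    }
}
```

In this sketch the indexed title becomes here@0, dragons@3, while a query
analyzed without gaps becomes here@0, dragons@1, so the exact phrase check
fails. Analyzing both sides the same way makes it succeed.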

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe [at] thetaphi




simon at thegestalt

Nov 12, 2009, 2:39 PM

Post #4 of 4
Re: Unexpected results searching for phrase with stop words [In reply to]

On Thu, Nov 12, 2009 at 09:20:30PM +0100, Uwe Schindler said:
> Which version of Lucene are you using and which Version constant do you pass
> to Analyzer and Query Parser? In 2.9.0 there was a bug/incorrect setting
> between the query parser and the Version.LUCENE_CURRENT / Version.LUCENE_29
> setting. If you did not enable position increments in query parser, that
> setting wouldn't work. In 2.9.1 it is fixed, if you pass the same version
> constant to both analyzer and query parser.

This looks like it could be the problem - thanks for the pointers.


