
Mailing List Archive: Lucene: Java-User

Unexpected Query Results

 

 



jamie at stimulussoft

Feb 3, 2010, 11:39 PM

Post #1 of 5
Unexpected Query Results

Hi

I have some unexpected query results.

When attempting two queries:

1) All fields, exact phrase query returns 48 hits

(priority:"было время" attach:"было время" score:"было время" size:"было
время" sentdate:"было время" archivedate:"было время" receiveddate:"было
время" from:"было время" to:"было время" subject:"было время" cc:"было
время" bcc:"было время" deliveredto:"было время" flag:"было время"
sensitivity:"было время" sender:"было время" recipient:"было время"
body:"было время" attachments:"было время" attachname:"было время"
memberof:"было время")

2) All fields, all words query returns 0 hits

((priority:было attach:было score:было size:было sentdate:было
archivedate:было receiveddate:было from:было to:было subject:было
cc:было bcc:было deliveredto:было flag:было sensitivity:было sender:было
recipient:было body:было attachments:было attachname:было memberof:было
) AND (priority:время attach:время score:время size:время sentdate:время
archivedate:время receiveddate:время from:время to:время subject:время
cc:время bcc:время deliveredto:время flag:время sensitivity:время
sender:время recipient:время body:время attachments:время
attachname:время memberof:время ))

I am not sure why query 2 returns 0 hits. In my mind it should return 48
hits as in query (1).

I am using Lucene 3.0. Is there a pseudo field for all search terms?

Thanks

Jamie





---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


ian.lea at gmail

Feb 4, 2010, 3:43 AM

Post #2 of 5
Re: Unexpected Query Results

There is no pseudo field for all search terms. Two common practices are
to use MultiFieldQueryParser or to add a catch-all field. I tend to
do the latter.
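
For readers unfamiliar with the catch-all idea, here is a minimal sketch in plain Java. It is not Lucene API: the class name CatchAllSketch, the field name "all", and the naive whitespace "analysis" are all assumptions made for illustration. In a real index you would add one extra analyzed field to each document and point the query parser at it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model of a "catch-all" field. Plain collections only, no Lucene classes.
class CatchAllSketch {
    // Hypothetical name for the extra field.
    static final String ALL = "all";

    // Returns a copy of the document with every field value also
    // concatenated into the catch-all field.
    static Map<String, String> withCatchAll(Map<String, String> doc) {
        Map<String, String> copy = new LinkedHashMap<String, String>(doc);
        StringBuilder all = new StringBuilder();
        for (String value : doc.values()) {
            if (all.length() > 0) {
                all.append(' ');
            }
            all.append(value);
        }
        copy.put(ALL, all.toString());
        return copy;
    }

    // Naive stand-in for analysis: lowercase, then split on whitespace.
    static Set<String> terms(String text) {
        String[] parts = text.toLowerCase(Locale.ROOT).trim().split("\\s+");
        return new HashSet<String>(Arrays.asList(parts));
    }

    // "All words" search against the catch-all field only.
    static boolean matchesAllWords(Map<String, String> doc, String... words) {
        Set<String> indexed = terms(doc.get(ALL));
        for (String word : words) {
            if (!indexed.contains(word.toLowerCase(Locale.ROOT))) {
                return false;
            }
        }
        return true;
    }
}
```

With one field holding the text of all the others, the "all words" query shrinks to one clause per word against that single field, and the words may come from different source fields.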

At a glance I'd agree that the second query should also return 48
hits. Maybe a small self-contained test case or standalone program
would be easier to debug.


--
Ian.




sarowe at syr

Feb 4, 2010, 9:28 AM

Post #3 of 5
RE: Unexpected Query Results

Hi Jamie,

Since phrase query terms aren't analyzed, you're getting exact matches for terms "было" and "время", but when you search for them individually, they are analyzed, and it is the analyzed query terms that fail to match against the indexed terms. Sounds to me like your index-time and query-time analyzers are not the same. Maybe you have a stemming filter for query-time, but not for index-time?

Steve



hossman_lucene at fucit

Feb 4, 2010, 12:24 PM

Post #4 of 5
RE: Unexpected Query Results

: Since phrase query terms aren't analyzed, you're getting exact matches

Quoted phrases passed to the QueryParser are analyzed -- but they are
analyzed as complete strings, so Analyzers that treat whitespace specially
may produce different Terms than if the individual "words" were analyzed
individually (which is what happens when QueryParser is given multiple
"words" that aren't in a quoted phrase).



-Hoss
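
The difference described above can be sketched in plain Java. This is a toy, not Lucene code: the class WholeStringVsWords and its methods are hypothetical, and the analyzer stand-in keeps its whole input as a single token (keyword-style), which is only one example of an analyzer that treats whitespace specially. The thread does not say which analyzer Jamie actually uses.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy illustration: the same analyzer yields different terms depending on
// whether it sees the whole quoted string or each word separately.
class WholeStringVsWords {
    // Assumed analyzer behaviour: the entire input becomes one lowercased token.
    static List<String> analyze(String text) {
        return Collections.singletonList(text.toLowerCase(Locale.ROOT));
    }

    // Quoted phrase: the whole string is handed to the analyzer once.
    static List<String> termsForPhrase(String phrase) {
        return analyze(phrase);
    }

    // Unquoted words: the query text is split on whitespace first,
    // then each word is analyzed on its own.
    static List<String> termsForWords(String query) {
        List<String> terms = new ArrayList<String>();
        for (String word : query.trim().split("\\s+")) {
            terms.addAll(analyze(word));
        }
        return terms;
    }
}
```

Under this assumption the phrase yields the single term "было время", while the unquoted query yields the two terms "было" and "время". If the index holds only the former, the phrase query matches and the individual word queries do not.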




sarowe at syr

Feb 4, 2010, 1:00 PM

Post #5 of 5
RE: Unexpected Query Results

On 02/04/2010 at 3:24 PM, Chris Hostetter wrote:
> : Since phrase query terms aren't analyzed, you're getting exact
> : matches
>
> Quoted phrases passed to the QueryParser are analyzed -- but they are
> analyzed as complete strings, so Analyzers that treat whitespace
> specially may produce different Terms than if the individual "words"
> were analyzed individually (which is what happens when QueryParser
> is given multiple "words" that aren't in a quoted phrase).

Yikes, you're right, of course (I just checked the code) - I was thinking of contrib/misc/AnalyzingQueryParser, which adds analysis to fuzzy, prefix, range, and wildcard queries, since *those* are not analyzed by QueryParser, and had added phrases to that list in my model of reality...


