Mailing List Archive: Lucene: Java-User

Query not finding indexed data


adb at teamware

Oct 15, 2006, 8:08 PM

Post #1 of 4
Query not finding indexed data

Hi,

I have a field "attname" that is indexed with Field.Store.YES,
Field.Index.UN_TOKENIZED. I have a document with the attname of
"IqTstAdminGuide2.pdf".

QueryParser parser = new QueryParser("body", new StandardAnalyzer());
Query query = parser.parse("attname:IqTstAdminGuide2.pdf");

fails to find the Document, which I guess is because of StandardAnalyzer
lowercasing the filename.

How can one instruct the QueryParser only to use the Analyzer to analyse fields
in an expression that were tokenized during the indexing process and to not
analyse those that were UN_TOKENIZED?

Regards
Antony



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


DORONC at il

Oct 15, 2006, 9:11 PM

Post #2 of 4
Re: Query not finding indexed data

Hi Antony, you cannot instruct the query parser to do that. Note that an
application can add both tokenized and un_tokenized data under the same
field name, so it is up to the application logic to know that a certain
query is not to be tokenized. In this case you could create your query with:
query = new TermQuery(new Term(fieldName, "IqTstAdminGuide2.pdf"));

Hope this helps,
Doron
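
The mismatch described in this thread can be modelled in a few lines of
plain Java. This is only a sketch, not Lucene code: the class
ExactTermModel and its methods are hypothetical names, and it assumes (as
the original poster guessed) that the analyzer lowercases the query text
while the UN_TOKENIZED field keeps the value exactly as given. An exact
term lookup, which is the role TermQuery plays above, skips that
lowercasing step.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Plain-Java model (NOT Lucene): maps exact "attname" terms to document ids.
class ExactTermModel {
    // term -> ids of documents whose attname is exactly that term
    private final Map<String, Set<Integer>> attnameTerms = new HashMap<>();

    // Models UN_TOKENIZED indexing: the whole value becomes one term, unchanged.
    void indexUntokenized(int docId, String value) {
        attnameTerms.computeIfAbsent(value, k -> new HashSet<>()).add(docId);
    }

    // Models an exact term lookup: the query text is not analyzed at all.
    Set<Integer> lookupExact(String term) {
        return attnameTerms.getOrDefault(term, Collections.<Integer>emptySet());
    }

    // Models a query whose text is lowercased (assumed analyzer behaviour)
    // before the term lookup happens.
    Set<Integer> lookupLowercased(String text) {
        return lookupExact(text.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        ExactTermModel index = new ExactTermModel();
        index.indexUntokenized(1, "IqTstAdminGuide2.pdf");
        // Lowercased query term differs from the stored mixed-case term.
        System.out.println(index.lookupLowercased("IqTstAdminGuide2.pdf"));
        // Exact lookup uses the stored term as-is and finds document 1.
        System.out.println(index.lookupExact("IqTstAdminGuide2.pdf"));
    }
}
```

In the model, the lowercased lookup misses because the stored term keeps
its capital letters, while the exact lookup finds the document.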



adb at teamware

Oct 15, 2006, 11:44 PM

Post #3 of 4
Re: Query not finding indexed data

Doron Cohen wrote:
> Hi Antony, you cannot instruct the query parser to do that. Note that an

Thanks, I suspected as much. I've changed it to make the field tokenized.

> field name. It is up to the application logic to know that a certain
> query is not to be tokenized. In this case you could create your query with:
> query = new TermQuery(new Term(fieldName, "IqTstAdminGuide2.pdf"));

The query is user-driven, so I can't know without parsing it whether it should
be tokenized or not. I would have to extend the parser to make use of TermQuery.
It's easier just to tokenize the field, now that I understand Lucene's behaviour.

Regards
Antony




erik at ehatchersolutions

Oct 16, 2006, 1:24 AM

Post #4 of 4
Re: Query not finding indexed data

You can also use PerFieldAnalyzerWrapper as the analyzer for
QueryParser, and for all your untokenized fields, specify a
KeywordAnalyzer. That will keep untokenized fields from being split
(as best it can given QueryParser meta-syntax).

Erik
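
The per-field idea can be sketched in plain Java. This is a simplified
model, not Lucene code: the class PerFieldAnalysisModel and its methods
are hypothetical names, and analysis is reduced to a string-to-string
function, whereas a real analyzer produces a stream of tokens. The default
function stands in for a lowercasing analyzer and the identity function
stands in for KeywordAnalyzer on the "attname" field.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

// Plain-Java model (NOT Lucene) of choosing the analysis step per field.
class PerFieldAnalysisModel {
    private final UnaryOperator<String> defaultAnalysis;
    private final Map<String, UnaryOperator<String>> perField = new HashMap<>();

    PerFieldAnalysisModel(UnaryOperator<String> defaultAnalysis) {
        this.defaultAnalysis = defaultAnalysis;
    }

    // Registers a field-specific analysis that overrides the default.
    void addAnalysis(String field, UnaryOperator<String> analysis) {
        perField.put(field, analysis);
    }

    // Applies the field's own analysis if registered, else the default.
    String analyze(String field, String text) {
        return perField.getOrDefault(field, defaultAnalysis).apply(text);
    }

    public static void main(String[] args) {
        // Default: lowercase everything (stand-in for a lowercasing analyzer).
        PerFieldAnalysisModel model =
                new PerFieldAnalysisModel(s -> s.toLowerCase(Locale.ROOT));
        // "attname": leave the text untouched (stand-in for keyword analysis).
        model.addAnalysis("attname", UnaryOperator.identity());

        System.out.println(model.analyze("attname", "IqTstAdminGuide2.pdf"));
        System.out.println(model.analyze("body", "Admin Guide"));
    }
}
```

With this arrangement the user's query can still go through a single
parser, and only the fields registered for keyword treatment keep their
original case.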


