
Mailing List Archive: Lucene: Java-Dev

[jira] [Commented] (SOLR-3047) DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses KeywordTokenizer instead of StandardTokenizer or Whitespace

 

 



jira at apache

Feb 2, 2012, 10:45 AM

Post #1 of 5 (32 views)
[jira] [Commented] (SOLR-3047) DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses KeywordTokenizer instead of StandardTokenizer or Whitespace

[ https://issues.apache.org/jira/browse/SOLR-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199076#comment-13199076 ]

Hoss Man commented on SOLR-3047:
--------------------------------

I can't make heads or tails of this bug report ... at a minimum we need to see...

* what the full request params look like for an example request
* what the debugQuery output looks like for an example request (including the echoParams and query parsing info)
* how the requesthandler in use is configured
* the field and fieldtype information for every field used by dismax (i.e., mentioned in the request params or request handler defaults)
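
As a rough sketch of how the first two items could be gathered, the snippet below builds a dismax request URL with `debugQuery=true` and `echoParams=all` so the response carries both the echoed params and the parsed query. The host, handler path, and the field name `my_keyword_field` are placeholders assumed for illustration, not values from the report.

```python
from urllib.parse import urlencode

# Assumed for illustration: a local Solr instance and the default /select handler.
BASE_URL = "http://localhost:8983/solr/select"


def build_debug_url(query, qf, pf):
    """Return a dismax request URL that asks Solr for debug and param echo output."""
    params = {
        "q": query,
        "defType": "dismax",
        "qf": qf,                # query fields
        "pf": pf,                # phrase fields (the list under suspicion)
        "debugQuery": "true",    # include parsed query and scoring details
        "echoParams": "all",     # echo explicit params plus handler defaults
    }
    return BASE_URL + "?" + urlencode(params)


# "my_keyword_field" is a hypothetical field name standing in for the reporter's field.
url = build_debug_url("foo bar baz", "name", "my_keyword_field")
print(url)
```

Pasting the resulting URL and the response it produces into the issue would cover the request params and the debugQuery output in one go.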

> DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses KeywordTokenizer instead of StandardTokenizer or Whitespace
> ---------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-3047
> URL: https://issues.apache.org/jira/browse/SOLR-3047
> Project: Solr
> Issue Type: Bug
> Reporter: Antony Stubbs
>
> Has this got something to do with the minimum clause = 2 part in the code? It drops it without warning - IMO it should error out if the field isn't compatible.
> If it is on purpose - I don't see why. I split with the ngram token filter, so there is definitely more than 1 clause in the indexed field.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe [at] lucene
For additional commands, e-mail: dev-help [at] lucene


jira at apache

Feb 2, 2012, 11:12 AM

Post #2 of 5 (32 views)
[jira] [Commented] (SOLR-3047) DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses KeywordTokenizer instead of StandardTokenizer or Whitespace [In reply to]

[ https://issues.apache.org/jira/browse/SOLR-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199111#comment-13199111 ]

Antony Stubbs commented on SOLR-3047:
-------------------------------------

Ok, I'll try to reproduce it in a simple setup.



jira at apache

Feb 2, 2012, 11:18 AM

Post #3 of 5 (31 views)
[jira] [Commented] (SOLR-3047) DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses KeywordTokenizer instead of StandardTokenizer or Whitespace [In reply to]

[ https://issues.apache.org/jira/browse/SOLR-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199119#comment-13199119 ]

Antony Stubbs commented on SOLR-3047:
-------------------------------------

Hoss, is there a way I can send you the example privately?



jira at apache

Feb 2, 2012, 11:22 AM

Post #4 of 5 (31 views)
[jira] [Commented] (SOLR-3047) DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses KeywordTokenizer instead of StandardTokenizer or Whitespace [In reply to]

[ https://issues.apache.org/jira/browse/SOLR-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199125#comment-13199125 ]

Antony Stubbs commented on SOLR-3047:
-------------------------------------

It just seems that if the field's tokenizer produces only a single token (as KeywordTokenizer does), the field gets silently dropped from the phrase field (pf) queries, even though the ngram filter produces multiple tokens in the end. If I just change the tokenizer to StandardTokenizer, it works as expected.
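
To make the distinction in this comment concrete, here is a small pure-Python illustration of token counts at the two stages: after the tokenizer alone, and after an n-gram filter. It is only a toy model; the function names and gram sizes are assumptions, it does not call Lucene, and it says nothing about what dismax actually checks.

```python
def keyword_tokenize(text):
    """Toy stand-in for a keyword tokenizer: the whole input is one token."""
    return [text]


def whitespace_tokenize(text):
    """Toy stand-in for a whitespace tokenizer: split on runs of whitespace."""
    return text.split()


def ngram_filter(tokens, min_gram=2, max_gram=3):
    """Expand each token into its character n-grams (gram sizes are assumed)."""
    grams = []
    for token in tokens:
        for n in range(min_gram, max_gram + 1):
            for i in range(len(token) - n + 1):
                grams.append(token[i:i + n])
    return grams


text = "ab cd"
keyword_tokens = keyword_tokenize(text)        # 1 token before filtering
keyword_grams = ngram_filter(keyword_tokens)   # several tokens after filtering
whitespace_tokens = whitespace_tokenize(text)  # 2 tokens before filtering

print(len(keyword_tokens), len(keyword_grams), len(whitespace_tokens))
```

In this toy model the keyword-style chain yields one token before the filter and several after it, which is the situation the comment describes. The emission order of the grams is not meant to match any particular Lucene version.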



jira at apache

Feb 3, 2012, 2:58 PM

Post #5 of 5 (25 views)
[jira] [Commented] (SOLR-3047) DisMaxQParserPlugin drops my field in the phrase field list (pf) if it uses KeywordTokenizer instead of StandardTokenizer or Whitespace [In reply to]

[ https://issues.apache.org/jira/browse/SOLR-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200130#comment-13200130 ]

Hoss Man commented on SOLR-3047:
--------------------------------

bq. Hoss, is there a way I can send you the example privately?

[I'd rather not|https://people.apache.org/~hossman/#private_q]

if you can't share the configs you are using, can't you at least add a quick example of something demonstrating your problem to the example schema.xml and post that?

I just tried this example from Solr 3.5.0 (alphaNameSort uses KeywordTokenizer) and got exactly what I expected...

{code}
http://localhost:8983/solr/select?debugQuery=true&defType=dismax&qf=name&pf=alphaNameSort&q=foo%20bar%20baz

+((DisjunctionMaxQuery((name:foo))
DisjunctionMaxQuery((name:bar))
DisjunctionMaxQuery((name:baz))
)~3
)
DisjunctionMaxQuery((alphaNameSort:foobarbaz))
{code}
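
For readers wondering why the phrase clause above is the single term `alphaNameSort:foobarbaz`, the sketch below mimics the analysis chain that, as far as I recall, the stock example schema assigns to `alphaNameSort` (keyword tokenizer, lowercase, trim, then removal of every character outside a-z). That chain is an assumption on my part and is not quoted in this thread; the code is a plain-Python imitation, not Solr's implementation.

```python
import re


def alpha_only_sort_analyze(text):
    """Imitate the assumed analysis chain for alphaNameSort; returns a token list."""
    token = text                          # keyword tokenizer: whole input, one token
    token = token.lower()                 # lowercase filter
    token = token.strip()                 # trim filter
    token = re.sub(r"[^a-z]", "", token)  # pattern replace: drop non a-z characters
    return [token]


tokens = alpha_only_sort_analyze("foo bar baz")
print(tokens)
```

Under that assumption the query string collapses to one token, so a single-term clause on the pf field is the expected outcome rather than a dropped field.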

