Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Lucene: Java-Dev

[jira] [Updated] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

 

 

Lucene java-dev RSS feed   Index | Next | Previous | View Threaded


jira at apache

Feb 29, 2012, 1:03 PM

Post #1 of 3 (73 views)
Permalink
[jira] [Updated] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

[ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Cavanna updated SOLR-2719:
-------------------------------

Attachment: SOLR-2719.patch

My patch isn't a fix but just a starting point: it adds a ReturnFieldsTest class which tests some of the new fl features. Some tests are of course failing. The biggest problem is the hyphen within the field name, which I guess is widely used. This could be corrected as suggested by Nik, but we have problems with other characters, even if less used within field names.

Solr doesn't validate field names, but now a lot of potential field names can't actually be used within the fl parameter, or even worse they break the query. Some of my tests method are intentionally weird, like the ~idtest or id$test, but those field names are both allowed by Solr. I'm afraid we might have the same problem with sorting since the QueryParsing#parseSort uses the same StrParser#getId method.

The main rule to identify the end of a field name seems to be the following:
{code}
if (!Character.isJavaIdentifierPart(ch) && ch != '.')
break;
{code}

In my opinion, the point here is not just correct the hyphen regression. I think we should introduce consistency between fl and decide which characters Solr accepts within a field name.

What are your thoughts guys?

> REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases
> -------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-2719
> URL: https://issues.apache.org/jira/browse/SOLR-2719
> Project: Solr
> Issue Type: Bug
> Components: search
> Affects Versions: 4.0
> Reporter: Nik V. Babichev
> Priority: Blocker
> Labels: field, fl, query, queryparser
> Fix For: 4.0
>
> Attachments: SOLR-2719.patch
>
>
> fl=my-hyphen-field in query params parsed as "my" instead of "my-hyphen-field".
> OAS.search.ReturnFields use method getId() from OAS.search.QueryParsing
> in which check chars "if (!Character.isJavaIdentifierPart(ch) && ch != '.')"
> Hyphen is not JavaIdentifierPart and this check break when first "-" is found.
> This problem solve by passing '-' to check:
> if (!Character.isJavaIdentifierPart(ch) && ch != '.' && ch != '-') break;
> But I don't know how it can affect on whole project.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe [at] lucene
For additional commands, e-mail: dev-help [at] lucene


jira at apache

Mar 6, 2012, 5:58 AM

Post #2 of 3 (68 views)
Permalink
[jira] [Updated] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases [In reply to]

[ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Cavanna updated SOLR-2719:
-------------------------------

Attachment: SOLR-2719.patch

Right Yonik, I've been misled by the
{code}
String field = sp.getId(null);
{code}
at the beginning of QueryParsing#parseSort, while the method to look at was getSimpleName. Sorting is ok (but I don't completely understand the sp.getId at the beginning).

I attached a new patch: I added to StrParser a getFieldName method and added to getId the possibility to pass a vararg parameter of allowed trailing chars.
I made also some changes to ReturnFields to make the code a little bit more readable to me, hope is the same for you guys. Tests work.
I removed the weird tests about trailing tilde and so on, kept just the trailing dash test which now works (removed @Ignore).

Let me know what you think.

I'm going to open a new issue to add field names validation.

> REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases
> -------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-2719
> URL: https://issues.apache.org/jira/browse/SOLR-2719
> Project: Solr
> Issue Type: Bug
> Components: search
> Affects Versions: 4.0
> Reporter: Nik V. Babichev
> Priority: Blocker
> Labels: field, fl, query, queryparser
> Fix For: 4.0
>
> Attachments: SOLR-2719.patch, SOLR-2719.patch
>
>
> fl=my-hyphen-field in query params parsed as "my" instead of "my-hyphen-field".
> OAS.search.ReturnFields use method getId() from OAS.search.QueryParsing
> in which check chars "if (!Character.isJavaIdentifierPart(ch) && ch != '.')"
> Hyphen is not JavaIdentifierPart and this check break when first "-" is found.
> This problem solve by passing '-' to check:
> if (!Character.isJavaIdentifierPart(ch) && ch != '.' && ch != '-') break;
> But I don't know how it can affect on whole project.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe [at] lucene
For additional commands, e-mail: dev-help [at] lucene


jira at apache

Mar 12, 2012, 6:48 PM

Post #3 of 3 (67 views)
Permalink
[jira] [Updated] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases [In reply to]

[ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-2719:
-------------------------------

Attachment: SOLR-2719.patch

Here's a simpler patch that tries to change less (only that first getId call).

I didn't go with the varargs version because it creates objects on every call.

> REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases
> -------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-2719
> URL: https://issues.apache.org/jira/browse/SOLR-2719
> Project: Solr
> Issue Type: Bug
> Components: search
> Affects Versions: 4.0
> Reporter: Nik V. Babichev
> Assignee: Yonik Seeley
> Priority: Blocker
> Labels: field, fl, query, queryparser
> Fix For: 4.0
>
> Attachments: SOLR-2719.patch, SOLR-2719.patch, SOLR-2719.patch
>
>
> fl=my-hyphen-field in query params parsed as "my" instead of "my-hyphen-field".
> OAS.search.ReturnFields use method getId() from OAS.search.QueryParsing
> in which check chars "if (!Character.isJavaIdentifierPart(ch) && ch != '.')"
> Hyphen is not JavaIdentifierPart and this check break when first "-" is found.
> This problem solve by passing '-' to check:
> if (!Character.isJavaIdentifierPart(ch) && ch != '.' && ch != '-') break;
> But I don't know how it can affect on whole project.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe [at] lucene
For additional commands, e-mail: dev-help [at] lucene

Lucene java-dev RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.