
Mailing List Archive: Lucene: Java-Dev

[jira] [Commented] (SOLR-3026) eDismax: Locking down which fields can be explicitly queried (user fields aka uf)

 

 



jira at apache

Jan 20, 2012, 7:44 PM

Post #1 of 11 (54 views)

[ https://issues.apache.org/jira/browse/SOLR-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190304#comment-13190304 ]

Hoss Man commented on SOLR-3026:
--------------------------------

It's been so long, i don't really remember what i envisioned.

I haven't had a chance to review the patches, but your description of the usecases looks great -- my personal preference would be for an empty uf to default to not allowing any explicit fields, but i know i'm in the minority on that opinion, and your "-*" exclusion syntax makes it so easy to do i have absolutely no complaints.

as for field name aliasing / virtual fields (ie: SOLR-3045) ... as i remember it, the underlying "Alias" feature of the DisjunctionMaxQueryParser (i think that's what it's called) should work well for that -- assuming the edismax usage of that underlying QueryParser doesn't circumvent it too much.

As far as user syntax goes, i would suggest that the "per-field override" param syntax on the "qf" param would probably make the most sense here instead of using colons (and wouldn't require the special comma syntax you suggest in SOLR-3045 to specify multiple fields, which would prevent the general change yonik seems to want)

ie...

{noformat}
q=elephant title:dumbo who:george
&qf=title^3 firstname lastname^2 description^2 catchall
&uf=title^5 who^2 *
&f.who.qf=firstname lastname^10
{noformat}

...would cause "elephant" to be searched in all the "qf" fields with the specified boosts; "dumbo" to be searched only against the title field (with a boost of 5 since the user asked for that field explicitly); and "george" will get a DisjunctionMaxQuery with a boost of 2, containing two clauses: firstname (default boost of 1) and lastname (boost of 10).

Basically: when parsing the "uf" look for a "f.${uf}.qf" param, and if it exists parse it and add the appropriate Alias. (fingers crossed it will be that easy: if it isn't, it's probably a feature!)
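
The lookup described above can be sketched roughly as follows. This is only a toy model under stated assumptions, not the patch attached to this issue: the class UfAliasSketch and its methods are hypothetical names, request params are modelled as a plain Map, and no Solr or Lucene classes are involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of the "f.${uf}.qf" lookup; hypothetical, not Solr's implementation. */
final class UfAliasSketch {

    /** Parses a qf-style string such as "firstname lastname^10" into field -> boost (default 1.0). */
    static Map<String, Float> parseFieldBoosts(String spec) {
        Map<String, Float> boosts = new LinkedHashMap<String, Float>();
        for (String token : spec.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty spec yields a single empty token
            }
            int caret = token.indexOf('^');
            if (caret < 0) {
                boosts.put(token, 1.0f);
            } else {
                boosts.put(token.substring(0, caret),
                           Float.parseFloat(token.substring(caret + 1)));
            }
        }
        return boosts;
    }

    /**
     * For every field named in "uf", looks for an "f.<field>.qf" param.
     * Returns alias -> (target field -> boost) for the fields that have one.
     */
    static Map<String, Map<String, Float>> resolveAliases(Map<String, String> params) {
        Map<String, Map<String, Float>> aliases =
                new LinkedHashMap<String, Map<String, Float>>();
        String uf = params.get("uf");
        if (uf == null) {
            return aliases;
        }
        for (String field : parseFieldBoosts(uf).keySet()) {
            String aliasSpec = params.get("f." + field + ".qf");
            if (aliasSpec != null) {
                aliases.put(field, parseFieldBoosts(aliasSpec));
            }
        }
        return aliases;
    }
}
```

With uf=title^5 who^2 * and f.who.qf=firstname lastname^10, only "who" ends up as an alias, pointing at firstname (boost 1) and lastname (boost 10); "title" and "*" have no f.*.qf param and are left alone.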

> eDismax: Locking down which fields can be explicitly queried (user fields aka uf)
> ---------------------------------------------------------------------------------
>
> Key: SOLR-3026
> URL: https://issues.apache.org/jira/browse/SOLR-3026
> Project: Solr
> Issue Type: Improvement
> Components: search
> Affects Versions: 3.1, 3.2, 3.3, 3.4, 3.5
> Reporter: Jan Høydahl
> Assignee: Jan Høydahl
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-3026.patch
>
>
> We need a way to specify exactly what fields should be available to the end user as fielded search.
> In the original SOLR-1553, there's a patch implementing "user fields", but it was never committed even if that issue was closed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe [at] lucene
For additional commands, e-mail: dev-help [at] lucene


jira at apache

Jan 21, 2012, 3:14 AM

Post #2 of 11 (47 views)

[ https://issues.apache.org/jira/browse/SOLR-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190369#comment-13190369 ]

Jan Høydahl commented on SOLR-3026:
-----------------------------------

I like the f.who.qf style, and the fact that you can then boost the DMQ clause as a whole. I'll add that to SOLR-3045 as a suggestion.

But it's a bit overkill to spin a DMQ for simple single-field-aliasing, i.e. my example &uf=title:searchable_title_t.
Ideally such a simple field name aliasing should be supported on the Lucene parser level.
Alternatively it could be another per-field param
{noformat}
&f.title.fmap=searchable_title_t
{noformat}

I'm still not sure how to use the built-in aliasing to implement this.



jira at apache

Jan 24, 2012, 8:10 AM

Post #3 of 11 (47 views)

[ https://issues.apache.org/jira/browse/SOLR-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192238#comment-13192238 ]

Jan Høydahl commented on SOLR-3026:
-----------------------------------

I think this is a good enough first step for the user fields and aliasing features. If we then want to take it further, such as recursive aliasing from inside qf params, we'll open new issues.

I'm not sure I got full test coverage for all combinations of fields, default boosts etc.

Does anyone want to review it? There are bound to be some bugs with such string manipulations...



jira at apache

Jan 24, 2012, 8:20 AM

Post #4 of 11 (48 views)

[ https://issues.apache.org/jira/browse/SOLR-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192250#comment-13192250 ]

Jan Høydahl commented on SOLR-3026:
-----------------------------------

One more thing we should probably fix. For schemas relying heavily on <dynamicField>s, it could be handy to allow/deny wildcard field names. Imagine:
{noformat}
<dynamicField name="*name" type="string" />
<dynamicField name="secret*" type="string" />
{noformat}

With today's patch you'd have to explicitly allow and disallow full field names:
{noformat}
&uf=firstname middlename lastname companyname -secrettext -secretsalary -secretfoo ....
{noformat}

Better would be:
{noformat}
&uf=*name -secret*
{noformat}
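
A rough sketch of how such wildcard allow/deny matching could behave is shown below. It is an illustration only, not the code in the patch: the class name UfFilterSketch is hypothetical, and the semantics are assumptions (a field is allowed if it matches at least one allow pattern and no deny pattern, deny wins on conflict, and an empty or missing uf is treated as "*").

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Toy allow/deny filter for "uf" values such as "*name -secret*"; hypothetical sketch. */
final class UfFilterSketch {
    private final List<Pattern> allow = new ArrayList<Pattern>();
    private final List<Pattern> deny = new ArrayList<Pattern>();

    UfFilterSketch(String uf) {
        // Assumption: an empty or missing uf behaves like "*" (allow everything).
        String spec = (uf == null || uf.trim().isEmpty()) ? "*" : uf.trim();
        for (String token : spec.split("\\s+")) {
            // Drop an optional boost suffix such as "^5"; boosts do not matter for filtering.
            int caret = token.indexOf('^');
            String name = caret >= 0 ? token.substring(0, caret) : token;
            if (name.startsWith("-")) {
                deny.add(globToRegex(name.substring(1)));
            } else {
                allow.add(globToRegex(name));
            }
        }
    }

    /** Converts a glob where '*' matches any run of characters into a regex. */
    private static Pattern globToRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            if (c == '*') {
                sb.append(".*");
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    /** Deny patterns are checked first, so they win over allow patterns. */
    boolean isAllowed(String field) {
        for (Pattern p : deny) {
            if (p.matcher(field).matches()) {
                return false;
            }
        }
        for (Pattern p : allow) {
            if (p.matcher(field).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions, uf=*name -secret* allows firstname and companyname, rejects secretsalary, and also rejects a field like secretname because the deny pattern takes precedence.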



jira at apache

Jan 27, 2012, 1:19 PM

Post #5 of 11 (44 views)

[ https://issues.apache.org/jira/browse/SOLR-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195107#comment-13195107 ]

Tomás Fernández Löbbe commented on SOLR-3026:
---------------------------------------------

Jan, I have a patch that fixes an issue with this implementation and adds some more test cases. It also addresses SOLR-3045. Should I add it here, or should I try to separate the two patches? What I did for SOLR-3045 depends strongly on your code for this issue.
The issue is for a case like:
myalias:(Zapp Obnoxious)
This query is parsed as
myalias:Zapp default_field:Obnoxious
instead of
myalias:Zapp myalias:Obnoxious

The specific tests I added are:

assertQ(req("defType","edismax", "uf", "myalias", "q","myalias:(Zapp Obnoxious)", "f.myalias.qf","name^2.0 mytrait_ss^5.0", "mm", "50%"), oner);

assertQ(req("defType","edismax", "uf","who", "q","who:(Zapp Obnoxious)", "f.who.qf", "name^2.0 trait_ss^5.0", "qf", "id"), twor);
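
To make the intended behaviour concrete, here is a small stand-alone illustration of distributing a field name over every term of a parenthesised group. It is not how the edismax parser is implemented; GroupedFieldSketch is a hypothetical helper that only handles the simple, un-nested "field:(a b)" form.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy expansion of "field:(a b)" into per-term fielded clauses; hypothetical sketch. */
final class GroupedFieldSketch {
    // Matches a field name followed by one un-nested parenthesised group.
    private static final Pattern FIELDED_GROUP = Pattern.compile("(\\w+):\\(([^()]*)\\)");

    static List<String> expand(String clause) {
        List<String> out = new ArrayList<String>();
        String trimmed = clause.trim();
        Matcher m = FIELDED_GROUP.matcher(trimmed);
        if (!m.matches()) {
            out.add(trimmed); // not a fielded group: leave the clause as it is
            return out;
        }
        String field = m.group(1);
        for (String term : m.group(2).trim().split("\\s+")) {
            if (!term.isEmpty()) {
                out.add(field + ":" + term); // every term keeps the explicit field
            }
        }
        return out;
    }
}
```

For the query above, expand("myalias:(Zapp Obnoxious)") yields the clauses myalias:Zapp and myalias:Obnoxious, which is the parse the fix aims for.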






jira at apache

Jan 27, 2012, 4:11 PM

Post #6 of 11 (44 views)

[ https://issues.apache.org/jira/browse/SOLR-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195257#comment-13195257 ]

Jan Høydahl commented on SOLR-3026:
-----------------------------------

Hi Tomás. Thanks for the involvement!

I have a feeling that we'll close SOLR-3045 and do everything in this issue.

Please upload your improvements here, with same patch name, and we'll continue from there.



jira at apache

Jan 30, 2012, 3:03 PM

Post #7 of 11 (40 views)

[ https://issues.apache.org/jira/browse/SOLR-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196516#comment-13196516 ]

Jan Høydahl commented on SOLR-3026:
-----------------------------------

Hi, could you re-upload your patch without unrelated changes? Your patch includes several whitespace/indenting/reformatting changes which are unrelated. This makes it hard to read the patch and see what's new.

See http://wiki.apache.org/solr/HowToContribute#Creating_the_patch_file under "Please do not".

How do you trigger an infinite loop?



jira at apache

Feb 2, 2012, 1:28 PM

Post #8 of 11 (39 views)

[ https://issues.apache.org/jira/browse/SOLR-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199240#comment-13199240 ]

Jan Høydahl commented on SOLR-3026:
-----------------------------------

Super duper. I tested it and it works great! Strange, I could not get aliasing for qf fields to work before now. Now it works like a charm.

A possible optimization would be to detect if f.who.qf= contains just a single field, and create a simple TermQuery(?) instead of a DisjunctionMaxQuery in that case. But it might not be important for performance.
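
The single-field shortcut could look roughly like this. The sketch assumes queries are represented as plain strings so that it stays free of Lucene dependencies; AliasQuerySketch and its "dismax(...)" output format are made up for illustration and are not Lucene's query syntax or the patch's code.

```java
import java.util.Map;

/** Toy builder: one aliased field gives a plain clause, several give a dismax group. */
final class AliasQuerySketch {

    static String build(Map<String, Float> fields, String term) {
        if (fields.isEmpty()) {
            throw new IllegalArgumentException("alias must map to at least one field");
        }
        if (fields.size() == 1) {
            // Shortcut: no need to wrap a single clause in a disjunction.
            return clause(fields.entrySet().iterator().next(), term);
        }
        StringBuilder sb = new StringBuilder("dismax(");
        boolean first = true;
        for (Map.Entry<String, Float> e : fields.entrySet()) {
            if (!first) {
                sb.append(" | ");
            }
            sb.append(clause(e, term));
            first = false;
        }
        return sb.append(")").toString();
    }

    /** Renders "field:term", adding "^boost" only when the boost is not 1. */
    private static String clause(Map.Entry<String, Float> e, String term) {
        String c = e.getKey() + ":" + term;
        return e.getValue() == 1.0f ? c : c + "^" + e.getValue();
    }
}
```

Whether the shortcut pays off in practice would need measuring; the sketch only shows where the branch would sit.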

Another thing to re-consider is whether the default should be {{uf=*}} or {{uf=-*}}. If we aim to let edismax replace dismax, people may want it to behave like dismax out of the box. But if it won't replace dismax it's better to stick with the current defaults which people already are used to. Note that since eDismax is still @lucene.experimental we should not be afraid to change defaults.



jira at apache

Feb 3, 2012, 2:51 AM

Post #9 of 11 (39 views)

[ https://issues.apache.org/jira/browse/SOLR-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199636#comment-13199636 ]

Tomás Fernández Löbbe commented on SOLR-3026:
---------------------------------------------

I think the default should be uf=*, otherwise it will be confusing. I think "field" search together with "dismax" search will be one of the main reasons why people will move from other query parsers to edismax, and with uf=-* they will not get that behavior until they explicitly change it. I bet that if we use uf=-* we'll get many questions related to this on the mailing list.

About the optimization, I think it's a good idea; however, it should be a different Jira. The optimization could be applied to f.who.qf as well as to qf=, right?



jira at apache

Feb 3, 2012, 1:34 PM

Post #10 of 11 (41 views)

[ https://issues.apache.org/jira/browse/SOLR-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200067#comment-13200067 ]

Hoss Man commented on SOLR-3026:
--------------------------------

bq. If we aim to let edismax replace dismax, people may want it to behave like dismax out of the box

I don't think that should be the goal. plenty of people are using "edismax" already because they like the fact that it is a superset of the dismax & lucene features, and the defaults for "edismax" should embrace that.

if/when EDisMaxQParser reaches the point that it can be configured to work exactly the same as DisMaxQParser, then it may be worth considering defaulting "dismax" => an EDisMaxQParser instance configured that way, but that doesn't mean "edismax" shouldn't expose all of its bells and whistles by default.

uf=* as a default should be fine -- the only reason to question it would be if it was hard to disable, but the "-*" syntax is so easy it's not worth worrying about it.



jira at apache

Feb 13, 2012, 11:50 AM

Post #11 of 11 (29 views)

[ https://issues.apache.org/jira/browse/SOLR-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207137#comment-13207137 ]

Tomás Fernández Löbbe commented on SOLR-3026:
---------------------------------------------

Is there anything else to do in order to get this committed?
