
jira at apache
Feb 4, 2012, 8:52 AM
[jira] [Commented] (SOLR-2649) MM ignored in edismax queries with operators
[ https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200476#comment-13200476 ]

Brian Carver commented on SOLR-2649:
------------------------------------

I'm new to Solr, so I have a tenuous grasp on some of these issues, but I've understood boolean logic for a couple of decades and it seems to me like Solr's current behavior is thwarting the expectations of those who understand what they want and explicitly ask for it. Mike's example above is what troubles me.

Principles:

1. The maintainer sets whitespace to be interpreted as AND or OR and Solr should do nothing to change that in particular instances.
2. Where a user inputs an ambiguous query, a default rule about how operator scope will work is needed and that also should not be changed in particular instances.

So, Mike says he sets whitespace to AND, users know this, and then a user enters:

Example 1: (A or B or C) "D E"

Given the above assumptions, the only reasonable interpretation of this is:

(A or B or C) AND "D E"

which is a conjunction with two conjuncts, both of which must be satisfied for a result to be produced, yet Mike/the user gets results that only satisfy one of the conjuncts. That shouldn't happen.

I'd agree though that how to understand/apply mm in some of the examples above creates hard questions, but that is why many search engines provide two interfaces, one "natural language" interface and one that requires strict use of boolean syntax. Allowing people to enter some boolean operators (which they're going to expect will be respected no matter what) and simultaneously interpreting their query using mm handlers intended for a more rough-and-ready approach is just going to lead to confused end users most of the time.

So, in some ways, ignoring mm when operators are used is a feature, not a bug, but that seems orthogonal to the completely unacceptable outcome Mike described: whatever is causing THAT is a bug.
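
To make the expectation in Example 1 concrete, here is a minimal sketch in plain Java. It is not Solr code: the class `ConjunctionSketch` and its helpers are hypothetical, and documents are modeled as whitespace-separated tokens. It only illustrates what "both conjuncts must be satisfied" means for (A or B or C) AND "D E".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names come from Solr.
public class ConjunctionSketch {

    // First conjunct: true if any of the given terms occurs as a token.
    static boolean anyTerm(List<String> tokens, String... terms) {
        for (String term : terms) {
            if (tokens.contains(term)) {
                return true;
            }
        }
        return false;
    }

    // Second conjunct: true if the phrase tokens occur consecutively, in order.
    static boolean phrase(List<String> tokens, String... phraseTokens) {
        return Collections.indexOfSubList(tokens, Arrays.asList(phraseTokens)) >= 0;
    }

    // (A or B or C) AND "D E": a document matches only if BOTH conjuncts hold.
    static boolean matches(String document) {
        List<String> tokens = Arrays.asList(document.split("\\s+"));
        return anyTerm(tokens, "A", "B", "C") && phrase(tokens, "D", "E");
    }

    public static void main(String[] args) {
        System.out.println(matches("A D E")); // true: both conjuncts satisfied
        System.out.println(matches("A X"));   // false: only the first conjunct
        System.out.println(matches("X D E")); // false: only the second conjunct
    }
}
```

Under this reading, a result such as "A X" or "X D E" is exactly the kind of hit that should not come back when whitespace means AND.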
> MM ignored in edismax queries with operators
> --------------------------------------------
>
> Key: SOLR-2649
> URL: https://issues.apache.org/jira/browse/SOLR-2649
> Project: Solr
> Issue Type: Bug
> Components: search
> Affects Versions: 3.3
> Reporter: Magnus Bergmark
> Priority: Minor
>
> Hypothetical scenario:
> 1. User searches for "stocks oil gold" with MM set to "50%"
> 2. User adds "-stockings" to the query: "stocks oil gold -stockings"
> 3. User gets no hits since MM was ignored and all terms were AND-ed together
>
> The behavior seems to be intentional, although the reason why is never explained:
>
>   // For correct lucene queries, turn off mm processing if there
>   // were explicit operators (except for AND).
>   boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0;
>
> (lines 232-234 taken from tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java)
>
> This makes edismax unsuitable as a replacement for dismax; mm is one of the primary features of dismax.
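
For anyone following along, the reporter's scenario can be walked through with a small standalone sketch. Only the `doMinMatched` expression is taken from the snippet quoted in the issue; the class `MinMatchSketch`, the method wrappers, and the operator counts passed in are hypothetical. The percentage calculation assumes that a percentage mm is rounded down, which is my understanding of the documented behavior rather than something stated in this thread.

```java
// Hypothetical illustration; only the doMinMatched expression is from the quoted source.
public class MinMatchSketch {

    // Same condition as the quoted line from ExtendedDismaxQParserPlugin (3.3):
    // mm is applied only when no explicit OR, NOT, '+' or '-' operators were seen.
    static boolean doMinMatched(int numOR, int numNOT, int numPluses, int numMinuses) {
        return (numOR + numNOT + numPluses + numMinuses) == 0;
    }

    // Assumption: a percentage mm is rounded down (integer division here).
    static int minShouldMatch(int optionalClauses, int percent) {
        return optionalClauses * percent / 100;
    }

    public static void main(String[] args) {
        // Step 1: "stocks oil gold" with mm=50% has no operators, so mm applies.
        System.out.println(doMinMatched(0, 0, 0, 0)); // true
        System.out.println(minShouldMatch(3, 50));    // 1 of the 3 terms must match

        // Step 2: "stocks oil gold -stockings" has one '-' operator, so mm is skipped.
        System.out.println(doMinMatched(0, 0, 0, 1)); // false
    }
}
```

With mm skipped in step 2, the query falls back to whatever the default operator handling does, which per the reporter means all terms are AND-ed together.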