Mailing List Archive: Lucene: Java-Dev

[jira] [Updated] (SOLR-3017) Allow edismax stopword filter factory implementation to be specified

 

 


jira at apache

Feb 1, 2012, 6:23 PM

Post #1 of 2 (33 views)
[jira] [Updated] (SOLR-3017) Allow edismax stopword filter factory implementation to be specified

[ https://issues.apache.org/jira/browse/SOLR-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Dodsworth updated SOLR-3017:
------------------------------------

Attachment: SOLR-3017-without-guava-alternative.patch

Alternative patch with the Guava calls replaced by StringUtils calls.

> Allow edismax stopword filter factory implementation to be specified
> --------------------------------------------------------------------
>
> Key: SOLR-3017
> URL: https://issues.apache.org/jira/browse/SOLR-3017
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 4.0
> Reporter: Michael Dodsworth
> Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-3017-without-guava-alternative.patch, SOLR-3017.patch, edismax_stop_filter_factory.patch
>
>
> Currently, the edismax query parser assumes that stopword filtering is being done by StopFilter: the removal of the stop filter is performed by looking for an instance of 'StopFilterFactory' (hard-coded) within the associated field's analysis chain.
> We'd like to be able to use our own stop filters whilst keeping the edismax stopword removal goodness. The supplied patch allows the stopword filter factory class to be supplied as a param, "stopwordFilterClassName". If no value is given, the default (StopFilterFactory) is used.
> Another option I looked into was to extend StopFilterFactory to create our own filter. Unfortunately, StopFilterFactory's 'create' method returns StopFilter, not TokenStream. StopFilter is also final.
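
To illustrate the idea described above, here is a minimal sketch of stripping a configurable stop filter factory from an analysis chain. It is not the code from the attached patches: the interface and factory classes below are stand-ins rather than the real Solr classes, `withoutStopFilter` and `CustomStopFilterFactory` are hypothetical names, and matching by exact class name is an assumption about how the lookup might work.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins for Solr's analysis factories; these are NOT the real Solr classes.
interface TokenFilterFactory {}
class LowerCaseFilterFactory implements TokenFilterFactory {}
class StopFilterFactory implements TokenFilterFactory {}
class CustomStopFilterFactory implements TokenFilterFactory {} // hypothetical user-supplied factory

class StopwordChainSketch {
    // Default used when no "stopwordFilterClassName" value is given.
    static final String DEFAULT_STOP_FACTORY = StopFilterFactory.class.getName();

    /**
     * Returns a copy of the chain with every factory of the named class removed.
     * A null or empty class name falls back to the default stop filter factory.
     */
    static List<TokenFilterFactory> withoutStopFilter(List<TokenFilterFactory> chain,
                                                      String stopFactoryClassName) {
        String target = (stopFactoryClassName == null || stopFactoryClassName.isEmpty())
                ? DEFAULT_STOP_FACTORY
                : stopFactoryClassName;
        List<TokenFilterFactory> result = new ArrayList<>();
        for (TokenFilterFactory factory : chain) {
            // Assumption: match on the exact class name instead of a hard-coded instanceof check.
            if (!factory.getClass().getName().equals(target)) {
                result.add(factory);
            }
        }
        return result;
    }
}
```

Calling `withoutStopFilter(chain, null)` drops only the default `StopFilterFactory` stand-in, while passing the name of `CustomStopFilterFactory` drops that factory and leaves the default one in place.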

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe [at] lucene
For additional commands, e-mail: dev-help [at] lucene


jira at apache

Feb 3, 2012, 9:37 AM

Post #2 of 2 (27 views)
[jira] [Updated] (SOLR-3017) Allow edismax stopword filter factory implementation to be specified [In reply to]

[ https://issues.apache.org/jira/browse/SOLR-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-3017:
---------------------------------

Attachment: SOLR-3017.patch

New version that:
1> Removes the new schema file and just modifies schema12 instead. All tests pass with this change.

2> Adds a null check inside setStopwordFilterFactoryClass rather than at the place where it's called.

I guess theoretically someone could subclass this class, override setStopwordFilterFactoryClass, and set the member var to null, then encounter an NPE in noStopwordFilterAnalyzer that they couldn't fix due to scope issues. But that doesn't sound like something we need to guard against at this point.
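
For readers following along, a minimal sketch of a setter-side null check might look like the following. This is not the patch itself: the class name `EdismaxStopwordConfigSketch`, the getter, and the placeholder default value are hypothetical, and treating null as "keep the current value" is an assumption about what the check does.

```java
class EdismaxStopwordConfigSketch {
    // Placeholder default; the real fully qualified class name is not shown in this thread.
    static final String DEFAULT_FACTORY_CLASS = "StopFilterFactory";

    private String stopwordFilterFactoryClass = DEFAULT_FACTORY_CLASS;

    // The null check lives in the setter, so callers do not each need their own guard.
    void setStopwordFilterFactoryClass(String className) {
        if (className != null) {
            this.stopwordFilterFactoryClass = className;
        }
        // Assumption: a null argument is ignored and the current value is kept.
    }

    String getStopwordFilterFactoryClass() {
        return stopwordFilterFactoryClass;
    }
}
```

With the guard in the setter, the field can only become null if a subclass overrides the setter and bypasses the check, which is the corner case described above.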

If nobody objects, I'll commit this over the weekend or early next week.

> Allow edismax stopword filter factory implementation to be specified
> --------------------------------------------------------------------
>
> Key: SOLR-3017
> URL: https://issues.apache.org/jira/browse/SOLR-3017
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 4.0
> Reporter: Michael Dodsworth
> Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-3017-without-guava-alternative.patch, SOLR-3017.patch, SOLR-3017.patch, edismax_stop_filter_factory.patch
>
>
> Currently, the edismax query parser assumes that stopword filtering is being done by StopFilter: the removal of the stop filter is performed by looking for an instance of 'StopFilterFactory' (hard-coded) within the associated field's analysis chain.
> We'd like to be able to use our own stop filters whilst keeping the edismax stopword removal goodness. The supplied patch allows the stopword filter factory class to be supplied as a param, "stopwordFilterClassName". If no value is given, the default (StopFilterFactory) is used.
> Another option I looked into was to extend StopFilterFactory to create our own filter. Unfortunately, StopFilterFactory's 'create' method returns StopFilter, not TokenStream. StopFilter is also final.

