Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Lucene: Java-Dev

[jira] Updated: (LUCENE-2062) Bulgarian Analyzer

 

 

Lucene java-dev RSS feed   Index | Next | Previous | View Threaded


jira at apache

Nov 29, 2009, 11:17 PM

Post #1 of 4 (309 views)
Permalink
[jira] Updated: (LUCENE-2062) Bulgarian Analyzer

[ https://issues.apache.org/jira/browse/LUCENE-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2062:
--------------------------------

Attachment: LUCENE-2062.patch

some improvements on the previous patch, mostly changing the test to work in a similar way to TestCzechStemmer, refining stopwords list, javadocs, etc.

I think this one is ready. I'll commit in a few days if no one objects.


> Bulgarian Analyzer
> ------------------
>
> Key: LUCENE-2062
> URL: https://issues.apache.org/jira/browse/LUCENE-2062
> Project: Lucene - Java
> Issue Type: New Feature
> Components: contrib/analyzers
> Reporter: Robert Muir
> Assignee: Robert Muir
> Priority: Minor
> Fix For: 3.1
>
> Attachments: LUCENE-2062.patch, LUCENE-2062.patch
>
>
> someone asked about bulgarian analysis on solr-user today... http://www.lucidimagination.com/search/document/e1e7a5636edb1db2/non_english_languages
> I was surprised we did not have anything.
> This analyzer implements the algorithm specified here, http://members.unine.ch/jacques.savoy/Papers/BUIR.pdf
> In the measurements there, this improves MAP approx 34%

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Dec 1, 2009, 9:22 AM

Post #2 of 4 (271 views)
Permalink
[jira] Updated: (LUCENE-2062) Bulgarian Analyzer [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-2062:
------------------------------------

Attachment: LUCENE-2062.patch

I updated the patch to the current trunk and added some final keywords here and there.
IMO, BulgarianStemmer should be final but that is not much of a deal. We might do that with all the stemmers to be consistent.

the stemmer looks good to me - all test pass

> Bulgarian Analyzer
> ------------------
>
> Key: LUCENE-2062
> URL: https://issues.apache.org/jira/browse/LUCENE-2062
> Project: Lucene - Java
> Issue Type: New Feature
> Components: contrib/analyzers
> Reporter: Robert Muir
> Assignee: Robert Muir
> Priority: Minor
> Fix For: 3.1
>
> Attachments: LUCENE-2062.patch, LUCENE-2062.patch, LUCENE-2062.patch
>
>
> someone asked about bulgarian analysis on solr-user today... http://www.lucidimagination.com/search/document/e1e7a5636edb1db2/non_english_languages
> I was surprised we did not have anything.
> This analyzer implements the algorithm specified here, http://members.unine.ch/jacques.savoy/Papers/BUIR.pdf
> In the measurements there, this improves MAP approx 34%

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Dec 1, 2009, 9:24 AM

Post #3 of 4 (270 views)
Permalink
[jira] Updated: (LUCENE-2062) Bulgarian Analyzer [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-2062:
------------------------------------

Attachment: LUCENE-2062.patch

... and forgot the resource part.

sorry about that

> Bulgarian Analyzer
> ------------------
>
> Key: LUCENE-2062
> URL: https://issues.apache.org/jira/browse/LUCENE-2062
> Project: Lucene - Java
> Issue Type: New Feature
> Components: contrib/analyzers
> Reporter: Robert Muir
> Assignee: Robert Muir
> Priority: Minor
> Fix For: 3.1
>
> Attachments: LUCENE-2062.patch, LUCENE-2062.patch, LUCENE-2062.patch, LUCENE-2062.patch
>
>
> someone asked about bulgarian analysis on solr-user today... http://www.lucidimagination.com/search/document/e1e7a5636edb1db2/non_english_languages
> I was surprised we did not have anything.
> This analyzer implements the algorithm specified here, http://members.unine.ch/jacques.savoy/Papers/BUIR.pdf
> In the measurements there, this improves MAP approx 34%

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene


jira at apache

Dec 1, 2009, 10:02 AM

Post #4 of 4 (272 views)
Permalink
[jira] Updated: (LUCENE-2062) Bulgarian Analyzer [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2062:
--------------------------------

Attachment: LUCENE-2062.patch

Simon, thanks! here is patch with NOTICE, etc.

If no one objects I will commit it tomorrow.

> Bulgarian Analyzer
> ------------------
>
> Key: LUCENE-2062
> URL: https://issues.apache.org/jira/browse/LUCENE-2062
> Project: Lucene - Java
> Issue Type: New Feature
> Components: contrib/analyzers
> Reporter: Robert Muir
> Assignee: Robert Muir
> Priority: Minor
> Fix For: 3.1
>
> Attachments: LUCENE-2062.patch, LUCENE-2062.patch, LUCENE-2062.patch, LUCENE-2062.patch, LUCENE-2062.patch
>
>
> someone asked about bulgarian analysis on solr-user today... http://www.lucidimagination.com/search/document/e1e7a5636edb1db2/non_english_languages
> I was surprised we did not have anything.
> This analyzer implements the algorithm specified here, http://members.unine.ch/jacques.savoy/Papers/BUIR.pdf
> In the measurements there, this improves MAP approx 34%

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe [at] lucene
For additional commands, e-mail: java-dev-help [at] lucene

Lucene java-dev RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.