
Mailing List Archive: Lucene: Java-Dev

[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module

 

 



jira at apache

Apr 11, 2012, 8:46 PM

Post #1 of 34 (180 views)

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252172#comment-13252172 ]

Robert Muir commented on LUCENE-2510:
-------------------------------------

Hmm, looking at this ResourceLoader, what is the purpose of the newInstance method?

It seems unrelated to resource loading; I think it should live somewhere else?

Separately, do we have even a vague plan for how WordlistLoader could implement this interface?
I don't think we have to do that immediately to proceed, but long term I think it would make sense,
since currently we have some duplicate code between Lucene and Solr here:
* take a look at WordlistLoader
* take a look at the protected methods in BaseTokenStreamFactory.
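
As a rough illustration of the duplication in question, word-list parsing can be written once against a small resource-opening interface, so that the same code serves both a plain classpath loader and a Solr-style loader. This is only a sketch: ResourceOpener, loadWords and inMemory are hypothetical names, and the comment/blank-line handling is an assumption, not a description of what WordlistLoader or BaseTokenStreamFactory actually do.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class WordlistSketch {

    /** Hypothetical stand-in for the resource-opening half of a ResourceLoader. */
    interface ResourceOpener {
        InputStream openResource(String name) throws IOException;
    }

    /** Reads one word per line, skipping blank lines and lines starting with '#'. */
    static Set<String> loadWords(ResourceOpener loader, String resource) throws IOException {
        Set<String> words = new LinkedHashSet<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(loader.openResource(resource), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty() && !word.startsWith("#")) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    /** In-memory opener used only for the demo; a real one would read files or the classpath. */
    static ResourceOpener inMemory(Map<String, String> files) {
        return name -> {
            String content = files.get(name);
            if (content == null) {
                throw new IOException("no such resource: " + name);
            }
            return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
        };
    }

    public static void main(String[] args) throws IOException {
        ResourceOpener loader = inMemory(Map.of("stopwords.txt", "# comment\nthe\n\nand \nthe\n"));
        System.out.println(loadWords(loader, "stopwords.txt"));
    }
}
```

The parsing logic depends only on the ability to open a named resource, which is why a single shared implementation looks feasible in principle.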

> migrate solr analysis factories to analyzers module
> ---------------------------------------------------
>
> Key: LUCENE-2510
> URL: https://issues.apache.org/jira/browse/LUCENE-2510
> Project: Lucene - Java
> Issue Type: Task
> Components: modules/analysis
> Affects Versions: 4.0
> Reporter: Robert Muir
> Fix For: 4.0
>
> Attachments: LUCENE-2510.patch
>
>
> In LUCENE-2413 all TokenStreams were consolidated into the analyzers module.
> This is a good step, but I think the next step is to put the Solr factories into the analyzers module, too.
> This would make analyzers artifacts plugins to both lucene and solr, with benefits such as:
> * users could use the old analyzers module with solr, too. This is a good step to use real library versions instead of Version for backwards compat.
> * analyzers modules such as smartcn and icu, that aren't currently available to solr users due to large file sizes or dependencies, would be simple optional plugins to solr and easily available to users that want them.
> Rough sketch in this thread: http://www.lucidimagination.com/search/document/3465a0e55ba94d58/solr_and_analyzers_module
> Practically, I havent looked much and don't really have a plan for how this will work yet, so ideas are very welcome.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe [at] lucene
For additional commands, e-mail: dev-help [at] lucene


jira at apache

Apr 11, 2012, 9:02 PM

Post #2 of 34 (174 views)

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252179#comment-13252179 ]

Chris Male commented on LUCENE-2510:
------------------------------------

bq. what is the purpose of the newInstance method?

If you take a look at {{org.apache.solr.analysis.DelimitedPayloadTokenFilterFactory}} you'll see an example of how it's used.

Looking at the implementation in SolrResourceLoader, it seems to facilitate two things:

- The use of simplified {{solr.*}} package names
- In {{FSTSynonymFilterFactory}} for example, newInstance is used to load other components. Consequently SolrResourceLoader adds the instantiated classes to its tracking of SolrCoreAware, ResourceLoaderAware, etc.

With all that said, it's only used in 3 factories (but in a lot of other Solr code). Perhaps we can break it out somehow.
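
To make the two roles concrete, here is a minimal sketch of a loader whose newInstance both expands a short {{solr.}} prefix against a list of packages and remembers instances that implement a marker interface. It is not the SolrResourceLoader implementation: ShortNameLoaderSketch, LoaderAware, Demo and trackedCount are hypothetical names, and JDK packages stand in for the real search path.

```java
import java.util.ArrayList;
import java.util.List;

class ShortNameLoaderSketch {

    /** Hypothetical marker, standing in for interfaces such as ResourceLoaderAware. */
    interface LoaderAware { }

    /** Demo class that the loader should track after instantiating it. */
    static class Demo implements LoaderAware { }

    private static final String SHORT_PREFIX = "solr.";
    private final String[] searchPackages;
    private final List<LoaderAware> tracked = new ArrayList<>();

    ShortNameLoaderSketch(String... searchPackages) {
        this.searchPackages = searchPackages;
    }

    /** Instantiates the named class; note the untyped Object return value. */
    Object newInstance(String cname) {
        try {
            Object instance = resolve(cname).getDeclaredConstructor().newInstance();
            if (instance instanceof LoaderAware) {
                tracked.add((LoaderAware) instance); // remembered for later callbacks
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + cname, e);
        }
    }

    /** Expands "solr.Foo" by trying each search package; other names are used as-is. */
    private Class<?> resolve(String cname) throws ClassNotFoundException {
        if (!cname.startsWith(SHORT_PREFIX)) {
            return Class.forName(cname);
        }
        String simpleName = cname.substring(SHORT_PREFIX.length());
        for (String pkg : searchPackages) {
            try {
                return Class.forName(pkg + "." + simpleName);
            } catch (ClassNotFoundException tryNextPackage) {
                // not in this package, keep looking
            }
        }
        throw new ClassNotFoundException(cname);
    }

    int trackedCount() {
        return tracked.size();
    }

    public static void main(String[] args) {
        ShortNameLoaderSketch loader = new ShortNameLoaderSketch("java.util", "java.lang");
        System.out.println(loader.newInstance("solr.ArrayList").getClass().getName());
        loader.newInstance(Demo.class.getName());
        System.out.println(loader.trackedCount());
    }
}
```

Neither responsibility has anything to do with opening resources, which is the argument for splitting newInstance out.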

bq. Separately, do we have any vague idea of a plan of how WordListLoader can implement this interface?

I don't at this stage, but you're right, there is duplication. Off the top of my head I think we'd want to move everything over to using ResourceLoader, but somehow incorporate the WordlistLoader logic somewhere.



jira at apache

Apr 11, 2012, 9:36 PM

Post #3 of 34 (174 views)

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252191#comment-13252191 ]

Robert Muir commented on LUCENE-2510:
-------------------------------------

{quote}
With all that said, its only used in 3 Factories (but a lot of other Solr code). Perhaps we can break it out somehow.
{quote}

I guess my main problem with it is the generics (it returns Object).
Seems like the generics could be fixed so it's parameterized to return ? extends X.
If we add generics violations to the analyzers module, Uwe will not be happy :)
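
One possible shape for such a signature, sketched here with hypothetical names (TypedNewInstanceSketch is not part of any patch on this issue), takes the expected type as a parameter so that the caller gets a checked result instead of casting from Object:

```java
class TypedNewInstanceSketch {

    /**
     * Instantiates the named class and returns it typed as T.
     * A class that is not a subtype of expectedType is rejected up front.
     */
    static <T> T newInstance(String cname, Class<T> expectedType) {
        try {
            // asSubclass performs the type check once, on the Class object.
            Class<? extends T> clazz = Class.forName(cname).asSubclass(expectedType);
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ClassCastException e) {
            throw new IllegalArgumentException(cname + " is not a " + expectedType.getName(), e);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + cname, e);
        }
    }

    public static void main(String[] args) {
        // No cast needed at the call site.
        CharSequence text = newInstance("java.lang.StringBuilder", CharSequence.class);
        System.out.println(text.length());
    }
}
```

With this shape a wrong class name in configuration fails at load time with a clear message rather than as a ClassCastException somewhere in the caller.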

{quote}
I don't at this stage, but you're right, there is duplication. Off the top of my head I think we'd want to move everything over to using ResourceLoader, but somehow incorporate the WordlistLoader logic somewhere.
{quote}

Right, I was just thinking that this stuff should really be mostly in one place. I think
it's a little better now, but there is still some stuff in both places. I guess I can let
that go, but it would be cool to have some sort of plan here, and if we don't tackle
it, at least open up a followup issue, since we are talking about an interface here:
we won't be able to easily fix it without hard API breaks if we need to.

Don't get me wrong: when interfaces are the right choice, we should use them without fear!
I think we just need to be extra careful up-front since we really should not break
them across minor releases.




jira at apache

Apr 11, 2012, 9:45 PM

Post #4 of 34 (175 views)

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252194#comment-13252194 ]

Chris Male commented on LUCENE-2510:
------------------------------------

bq. I guess my main problem with it is the generics (it returns Object).
Seems like the generics could be fixed so its parameterized to return ? extends X.
If we add generics violations to the analyzers module, Uwe will not be happy

+1

I thought along the same lines so we can definitely clean it up. Wouldn't want to get a ticket from the policeman.

bq. Right I was just thinking really this stuff should be mostly in one place. I think
its a little better now but there is some stuff in both places. I guess I can let
that go, but it would be cool to have some sort of plan here, and if we don't tackle
it, at least open up a followup issue since we are talking about an interface here:
we won't be able to easy fix it without hard API breaks if we need.

I'll think on it a bit and see if anybody else has any opinions. I agree that we need to be extra careful here.



jira at apache

Apr 16, 2012, 7:28 PM

Post #5 of 34 (168 views)

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255261#comment-13255261 ]

Chris Male commented on LUCENE-2510:
------------------------------------

Err... patch is broken, will fix.

> Attachments: LUCENE-2510.patch, LUCENE-2510.patch


jira at apache

Apr 16, 2012, 7:32 PM

Post #6 of 34 (169 views)

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255262#comment-13255262 ]

Chris Male commented on LUCENE-2510:
------------------------------------

Actually it works, just needed to rebuild analysis/common.



jira at apache

Apr 29, 2012, 8:23 PM

Post #7 of 34 (158 views)

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264666#comment-13264666 ]

Chris Male commented on LUCENE-2510:
------------------------------------

Also need to fix the relationship between the Solr UIMA contrib and analyzers-common in the appropriate configuration files.

> Attachments: LUCENE-2510-parent-classes.patch, LUCENE-2510.patch, LUCENE-2510.patch, LUCENE-2510.patch


jira at apache

May 1, 2012, 6:22 PM

Post #8 of 34 (157 views)

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266280#comment-13266280 ]

Chris Male commented on LUCENE-2510:
------------------------------------

Actually I think I've been a little naive about backwards compatibility here. I think I need to ensure that any user-created Factory implementations continue to work and existing schemas continue to load. Otherwise upgrading to Solr 4 is going to be an epic hassle.



jira at apache

May 4, 2012, 7:56 AM

Post #9 of 34 (155 views)

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268413#comment-13268413 ]

Steven Rowe commented on LUCENE-2510:
-------------------------------------

Chris, before the changes made here, solrj did not have a dependency on lucene-core, but now it does. Is it possible to move the dependency to solr-core?

The Maven build failed today because of this <https://builds.apache.org/job/Lucene-Solr-Maven-trunk/476/consoleText>:

{quote}
[INFO] Error for project: Apache Solr Solrj (during install)
[INFO] ------------------------------------------------------------------------
[INFO] Compilation failure
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/checkout/solr/solrj/src/java/org/apache/solr/common/ResourceLoader.java:[25,71] package org.apache.lucene.analysis.util does not exist
{quote}

The short-term fix for the Maven build would be just adding a lucene-core dependency to the solrj POM.

> Attachments: LUCENE-2510-parent-classes.patch, LUCENE-2510-resourceloader-bw.patch, LUCENE-2510.patch, LUCENE-2510.patch, LUCENE-2510.patch


jira at apache

May 4, 2012, 8:08 AM

Post #10 of 34 (152 views)

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268424#comment-13268424 ]

Yonik Seeley commented on LUCENE-2510:
--------------------------------------

bq. before the changes made here, solrj did not have a dependency on lucene-core, but now it does
Ugh... can we avoid this extra dependency? SolrJ is just a client and is not supposed to depend on lucene or solr core.



jira at apache

May 4, 2012, 10:40 AM

Post #11 of 34 (156 views)

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268565#comment-13268565 ]

Ryan McKinley commented on LUCENE-2510:
---------------------------------------

The only reason solrj has that dependency is for the deprecated interface:
{code:java}
public interface ResourceLoader extends org.apache.lucene.analysis.util.ResourceLoader
{code}

I vote we just drop ResourceLoader from the solrj client API at 4.0 rather than 5.0.

Alternatively, we could put the deprecated interface in solr-core, but that makes a mess of OSGi bundles (I think).
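
For readers following along, the compatibility pattern under discussion looks roughly like the sketch below: the old interface is kept as an empty, deprecated sub-interface of the new one, so user classes written against the old name still satisfy code that only knows the new type. All names here are hypothetical stand-ins, and the single openResource method is an assumption chosen to keep the example small.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class DeprecatedBridgeSketch {

    /** Stand-in for the new interface that lives in the analysis module. */
    interface NewResourceLoader {
        InputStream openResource(String resource) throws IOException;
    }

    /** Stand-in for the old interface, kept only so existing user code compiles. */
    @Deprecated
    interface OldResourceLoader extends NewResourceLoader {
    }

    /** A user class written against the old interface. */
    @SuppressWarnings("deprecation")
    static class LegacyUserLoader implements OldResourceLoader {
        @Override
        public InputStream openResource(String resource) {
            // Demo only: the "resource" is just the bytes of its own name.
            return new ByteArrayInputStream(resource.getBytes(StandardCharsets.UTF_8));
        }
    }

    /** New-style code that accepts only the new interface. */
    static int firstByte(NewResourceLoader loader, String resource) throws IOException {
        try (InputStream in = loader.openResource(resource)) {
            return in.read();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(firstByte(new LegacyUserLoader(), "abc")); // prints 97
    }
}
```

The declaration of the old interface names the new type, so whichever module hosts the old interface must be able to see the module that hosts the new one. That is where the extra dependency comes from.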




jira at apache

May 4, 2012, 5:31 PM

Post #12 of 34 (149 views)

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268825#comment-13268825 ]

Chris Male commented on LUCENE-2510:
------------------------------------

Ah sorry guys, I went to a lot of effort to avoid this dependency, then went and added it myself.

I still think just dropping the interface at this stage is going to make upgrading to Solr 4 a hassle, so I think the best option is to move the interface into solr-core under the same package name. I don't see why that would hurt OSGi?



jira at apache

May 4, 2012, 5:43 PM

Post #13 of 34 (152 views)
Permalink
[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268827#comment-13268827 ]

Chris Male commented on LUCENE-2510:
------------------------------------

Tests pass, so I've moved it into solr-core to remove this dependency issue. If there is some problem related to OSGI, we can then decide if we really do want to drop the interface at this stage.



jira at apache

May 4, 2012, 5:55 PM

Post #14 of 34 (150 views)
Permalink
[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268830#comment-13268830 ]

Ryan McKinley commented on LUCENE-2510:
---------------------------------------

bq. I don't see why that would hurt OSGI

I think OSGI gets upset when different .jar files have classes in the same package -- I don't really know or care though.

------

Shouldn't SolrResourceLoader depend on org.apache.lucene.analysis.util.ResourceLoader rather than the deprecated one?



jira at apache

May 4, 2012, 6:11 PM

Post #15 of 34 (149 views)
Permalink
[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268841#comment-13268841 ]

Chris Male commented on LUCENE-2510:
------------------------------------

bq. Shouldn't SolrResourceLoader depend on org.apache.lucene.analysis.util.ResourceLoader rather than the deprecated one?

No, that would prevent SolrResourceLoader from being able to be used with classes that still use the deprecated ResourceLoader. This way an analysis Factory which relies on the deprecated ResourceLoader can be loaded into Solr 4 without error.
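
One way to picture this arrangement is the sketch below. It is only an illustration under stated assumptions: all type names are hypothetical stand-ins, and it assumes the deprecated interface extends the new one, which this thread does not confirm. The point it shows is that a loader implementing the deprecated type can be handed to both old and new factories.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the new interface in the analyzers module.
interface NewResourceLoader {
    InputStream openResource(String resource) throws IOException;
}

// Hypothetical stand-in for the deprecated interface kept for old plugins.
// Assumption: it extends the new interface and adds no methods of its own.
@Deprecated
interface LegacyResourceLoader extends NewResourceLoader {
}

// Stand-in for SolrResourceLoader: implementing the deprecated type
// means the object is also an instance of the new type.
@SuppressWarnings("deprecation")
class DemoResourceLoader implements LegacyResourceLoader {
    @Override
    public InputStream openResource(String resource) {
        // Returns the resource name as bytes; a real loader would open a file.
        return new ByteArrayInputStream(resource.getBytes(StandardCharsets.UTF_8));
    }
}

// A factory written against the old interface still accepts the loader.
class LegacyFactory {
    @SuppressWarnings("deprecation")
    boolean inform(LegacyResourceLoader loader) {
        return loader != null;
    }
}

// A factory written against the new interface accepts the same loader.
class ModernFactory {
    boolean inform(NewResourceLoader loader) {
        return loader != null;
    }
}
```

Had DemoResourceLoader implemented only NewResourceLoader, the call into LegacyFactory would not compile, which is the incompatibility described above.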



jira at apache

May 7, 2012, 4:29 AM

Post #16 of 34 (139 views)
Permalink
[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269534#comment-13269534 ]

Robert Muir commented on LUCENE-2510:
-------------------------------------

I don't think back compat is really important?

No one's code is going to work without changes for 4.0 anyway...
I don't think we should add a bunch of dead weight classes for no reason.

If we remove these, then do we need a .factories package? Can't we just
put these couple classes in .util?



jira at apache

May 7, 2012, 4:35 AM

Post #17 of 34 (138 views)
Permalink
[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269537#comment-13269537 ]

Chris Male commented on LUCENE-2510:
------------------------------------

bq. No one's code is going to work without changes for 4.0 anyway...

Fair point

{quote}
If we remove these, then do we need a .factories package? Can't we just
put these couple classes in .util?
{quote}

Yeah I can make that change. I'll update the patch.



jira at apache

May 7, 2012, 4:35 AM

Post #18 of 34 (137 views)
Permalink
[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269536#comment-13269536 ]

Robert Muir commented on LUCENE-2510:
-------------------------------------

I think the interfaces are also unnecessary. Why not something like:

* AnalysisFactory (has init(Args), luceneMatchVersion, etc etc)
** TokenizerFactory
** TokenFilterFactory
** CharFilterFactory
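
A minimal sketch of that hierarchy, assuming abstract classes rather than interfaces. Everything beyond the names in the list above is an assumption made for illustration: token streams are replaced by plain strings and lists so the example has no Lucene dependency, the create signatures are invented, and luceneMatchVersion is kept as a String.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Shared base class: holds what every factory needs (args, match version).
abstract class AnalysisFactory {
    private Map<String, String> args = Collections.emptyMap();
    private String luceneMatchVersion; // a plain String in this sketch

    // Common initialization that each factory type would otherwise duplicate.
    public void init(Map<String, String> args) {
        this.args = Collections.unmodifiableMap(new HashMap<>(args));
        this.luceneMatchVersion = args.get("luceneMatchVersion");
    }

    public Map<String, String> getArgs() {
        return args;
    }

    public String getLuceneMatchVersion() {
        return luceneMatchVersion;
    }
}

// The three factory kinds become abstract classes, not interfaces.
abstract class TokenizerFactory extends AnalysisFactory {
    public abstract List<String> create(String text);
}

abstract class TokenFilterFactory extends AnalysisFactory {
    public abstract List<String> create(List<String> tokens);
}

abstract class CharFilterFactory extends AnalysisFactory {
    public abstract String create(String text);
}

// A concrete factory extends its kind directly; no separate Base*Factory is needed.
class WhitespaceTokenizerFactory extends TokenizerFactory {
    @Override
    public List<String> create(String text) {
        // Splits on runs of whitespace after trimming the ends.
        return Arrays.asList(text.trim().split("\\s+"));
    }
}
```

With this shape, init and the match version live in one place, and each concrete factory only supplies its create method.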





jira at apache

May 7, 2012, 4:37 AM

Post #19 of 34 (138 views)
Permalink
[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269539#comment-13269539 ]

Chris Male commented on LUCENE-2510:
------------------------------------

{quote}
TokenizerFactory
TokenFilterFactory
CharFilterFactory
{quote}

As classes instead of interfaces?



jira at apache

May 7, 2012, 4:39 AM

Post #20 of 34 (137 views)
Permalink
[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269540#comment-13269540 ]

Robert Muir commented on LUCENE-2510:
-------------------------------------

Yes. Practically, all of the impls 'extend UselessBaseXXXFactory' today anyway :)



jira at apache

May 7, 2012, 4:41 AM

Post #21 of 34 (138 views)
Permalink
[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269541#comment-13269541 ]

Chris Male commented on LUCENE-2510:
------------------------------------

Okay, good idea.



jira at apache

May 7, 2012, 4:59 AM

Post #22 of 34 (142 views)
Permalink
[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269551#comment-13269551 ]

Robert Muir commented on LUCENE-2510:
-------------------------------------

re: what is the purpose of the newInstance method?

{quote}
If you take a look at org.apache.solr.analysis.DelimitedPayloadTokenFilterFactory you'll see an example of how it's used.

Looking at the implementation in SolrResourceLoader, it seems to facilitate two things:

* The use of simplified solr.* package names
* In FSTSynonymFilterFactory for example, newInstance is used to load other components. Consequently SolrResourceLoader adds the instantiated classes to its tracking of SolrCoreAware, ResourceLoaderAware, etc.

With all that said, it's only used in 3 Factories (but a lot of other Solr code). Perhaps we can break it out somehow.
{quote}

I think we should revisit this. I don't like placing this into the analyzers module when not many factories actually use it and a lot of unrelated code in Solr does. I think this could cause a mess.

On the other hand, both the things this provides can be achieved in other ways. For example, if we use NamedSPILoader instead to allow components such as factories to be found by name, then we can support "solr.WhitespaceTokenizerFactory" because TokenizerFactory.forName("WhitespaceTokenizerFactory") works. Using the SPI mechanism would give us completely pluggable analysis modules, and operations like listAll() would work in case you want to enumerate what is available (imagine someone who doesn't want an XML configuration but wants to configure things through a GUI or something like that instead). We also keep sane packaging within the analysis modules and keep type safety, and Solr still keeps its solr.XXX syntax without reflecting over a zillion packages or other crazy things.
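
To make the name-based lookup concrete, here is a rough sketch. It is not NamedSPILoader: it is a hypothetical in-memory registry with invented names (NamedTokenizerFactory, register, WhitespaceFactory), and registration is done by hand where a real SPI setup would discover implementations from the jars on the classpath. It only shows how forName, listAll() and the solr.XXX shorthand could fit together.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical registry-backed base class; NOT Lucene's NamedSPILoader.
abstract class NamedTokenizerFactory {
    private static final Map<String, Supplier<NamedTokenizerFactory>> REGISTRY =
            new LinkedHashMap<>();

    // Manual registration; an SPI-based version would discover these instead.
    static void register(String name, Supplier<NamedTokenizerFactory> constructor) {
        REGISTRY.put(name, constructor);
    }

    // Looks a factory up by simple name, accepting the "solr." shorthand.
    static NamedTokenizerFactory forName(String name) {
        String key = name.startsWith("solr.") ? name.substring("solr.".length()) : name;
        Supplier<NamedTokenizerFactory> constructor = REGISTRY.get(key);
        if (constructor == null) {
            throw new IllegalArgumentException(
                    "Unknown factory '" + name + "', available: " + REGISTRY.keySet());
        }
        return constructor.get();
    }

    // Enumerates registered names, e.g. to populate a GUI.
    static Set<String> listAll() {
        return Collections.unmodifiableSet(REGISTRY.keySet());
    }
}

// Example implementation living in whatever package its author likes.
class WhitespaceFactory extends NamedTokenizerFactory {
}
```

Because lookup goes through a name table rather than a package scan, the caller gets back a typed factory and never needs to know which package the implementation lives in.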







jira at apache

May 7, 2012, 6:24 AM

Post #23 of 34 (138 views)
Permalink
[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269598#comment-13269598 ]

Chris Male commented on LUCENE-2510:
------------------------------------

{quote}
I think we should revisit this. I don't like placing this into the analyzers module when not many factories actually use it and a lot of unrelated code in Solr does. I think this could cause a mess.
{quote}

I agree. It feels messy where it is currently.

{quote}
For example, if we use NamedSPILoader instead to allow components such as factories to be found by name, then we can support "solr.WhitespaceTokenizerFactory" because TokenizerFactory.forName("WhitespaceTokenizerFactory") works.
{quote}

I don't really know much about NamedSPILoader but I think I see what you're suggesting. How would we support Factories loading unrelated classes like they can through ResourceLoader now? Assume they're on the classpath and use Class.forName?



jira at apache

May 7, 2012, 6:36 AM

Post #24 of 34 (143 views)
Permalink
[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269610#comment-13269610 ]

Robert Muir commented on LUCENE-2510:
-------------------------------------

{quote}
I don't really know much about NamedSPILoader but I think I see what you're suggesting. How would we support Factories loading unrelated classes like they can through ResourceLoader now? Assume they're on the classpath and use Class.forName?
{quote}

It needs more discussion (and input from Uwe would help!), but it works like Charset.forName("ASCII") etc. We use this already for codecs and postingsformats (Codec.forName, Codec.listAllCodecs, ...).

Have a look at lucene/core/src/resources/META-INF/services for the idea. Basically you "register" your classes in
your jar file this way: additional jar files (e.g. look at lucene/test-framework/src/resources/META-INF) can load more classes.

So this could support some idea like TokenizerFactory.forName("Whitespace") or something simple like that. So someone would not need to use the org.apache.solr.analysis.xxx namespace to be able to load their analyzer stuff easily; they use whatever package they want and register in their META-INF. And added jar files (other analysis jars) are automatically available this way.
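
For readers unfamiliar with the JDK side of this, the sketch below uses only standard library calls (Charset.forName, Charset.availableCharsets, java.util.ServiceLoader) to show the same pattern: lookup by name, enumeration, and discovery of providers registered by jars. The class name SpiLookupDemo is made up for the example. With ServiceLoader, a jar registers an implementation by shipping a file under META-INF/services named after the service type, with one implementation class name per line.

```java
import java.nio.charset.Charset;
import java.nio.charset.spi.CharsetProvider;
import java.util.ServiceLoader;

class SpiLookupDemo {
    // Lookup by name: the JDK resolves the alias "ASCII" to its canonical charset.
    static String canonicalName(String alias) {
        return Charset.forName(alias).name();
    }

    // Enumeration: checks whether a charset name is among all available ones.
    static boolean isAvailable(String name) {
        return Charset.availableCharsets().containsKey(name);
    }

    // Discovery: counts the CharsetProvider implementations that ServiceLoader
    // can find. The count depends on the runtime and classpath.
    static int countProviders() {
        int count = 0;
        for (CharsetProvider provider : ServiceLoader.load(CharsetProvider.class)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(canonicalName("ASCII"));
        System.out.println(isAvailable("UTF-8"));
        System.out.println("providers found: " + countProviders());
    }
}
```

A TokenizerFactory.forName along these lines would play the role of Charset.forName here, with analysis jars taking the place of charset providers.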

I think Uwe mentioned this idea before, though I think he had Analyzers in mind (e.g. provide language code and get back analyzer or something). Anyway that's for another issue :)

Just something worth considering if we want to make these modules really pluggable. On the other hand we shouldn't use anything overkill if it's not the right fit...




jira at apache

May 7, 2012, 7:12 AM

Post #25 of 34 (145 views)
[jira] [Commented] (LUCENE-2510) migrate solr analysis factories to analyzers module [In reply to]

[ https://issues.apache.org/jira/browse/LUCENE-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269636#comment-13269636 ]

Robert Muir commented on LUCENE-2510:
-------------------------------------

{quote}
How would we support Factories loading unrelated classes like they can through ResourceLoader now? Assume they're on the classpath and use Class.forName?
{quote}

I think there are only a few situations like this? Like your payload example? If PayloadEncoder really needs to be
pluggable by class, then you could always put it under SPI too (PayloadEncoder.forName).

In general, if we decide on the SPI approach, I think it would be useful to think about improving the solr config too,
because the current configuration is so verbose and redundant.
E.g. for backwards compat we could support:

{noformat}
<charFilter class="solr.HtmlStripCharFilterFactory"/>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
{noformat}

but going forward this would be cleaner IMO, just use the SPI name directly:

{noformat}
<charFilter name="HtmlStrip"/>
<tokenizer name="Standard"/>
<filter name="LowerCase"/>
<filter name="PorterStem"/>
{noformat}
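
One way the two forms could coexist is sketched below, under the assumption that a legacy class attribute is mapped to its SPI name by dropping the package prefix and a per-element suffix. This is a hypothetical helper written for illustration, not how Solr's schema parsing actually works; the suffix table and method names are made up.

```java
// Hypothetical helper; the element names come from the config snippets above,
// everything else is an assumption for the sake of the sketch.
final class AnalysisNameSketch {
    // Assumed suffix convention per config element.
    static String suffixFor(String element) {
        switch (element) {
            case "charFilter": return "CharFilterFactory";
            case "tokenizer":  return "TokenizerFactory";
            case "filter":     return "FilterFactory";
            default: throw new IllegalArgumentException("Unknown element: " + element);
        }
    }

    // Prefer the new name="..." attribute; otherwise derive the name from the
    // legacy class="solr.XxxFactory" attribute.
    static String spiName(String element, String nameAttr, String classAttr) {
        if (nameAttr != null) {
            return nameAttr;
        }
        if (classAttr == null) {
            throw new IllegalArgumentException("Need either name or class on <" + element + ">");
        }
        String simple = classAttr.substring(classAttr.lastIndexOf('.') + 1);
        String suffix = suffixFor(element);
        if (!simple.endsWith(suffix) || simple.length() == suffix.length()) {
            throw new IllegalArgumentException("Cannot derive a name from " + classAttr);
        }
        return simple.substring(0, simple.length() - suffix.length());
    }

    public static void main(String[] args) {
        System.out.println(spiName("charFilter", null, "solr.HtmlStripCharFilterFactory"));
        System.out.println(spiName("filter", "LowerCase", null));
    }
}
```

Note that the suffix has to be chosen per element: a char filter class name also ends in "FilterFactory", so stripping only that would leave a trailing "Char".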



