
Mailing List Archive: Lucene: General

Custom Analyzer Strategy?

 

 



leegee at gmail

Aug 1, 2012, 4:16 AM

Post #1 of 4
Custom Analyzer Strategy?

Hi

New to Lucene development, though I have been an indexing user for some
years, I find a need to develop an analyzer that reads a bespoke-format
(binary) file. I was wondering:

* Are there tutorials on analyzer development, or (ideally) an example
custom simple analyzer?

* Is it possible to send the output of one analyzer to another, and if
so, is it possible to have that chain defined in the configuration of
Lucene (or Solr...), or would it need to be hard-coded?

Thank you very much
Lee


rcmuir at gmail

Aug 3, 2012, 5:56 AM

Post #2 of 4
Re: Custom Analyzer Strategy? [In reply to]

On Wed, Aug 1, 2012 at 7:16 AM, Lee Goddard <leegee [at] gmail> wrote:
> Hi
>
> New to Lucene development, though I have been an indexing user for some
> years, I find a need to develop an analyzer that reads a bespoke-format
> (binary) file. I was wondering:

Hello: usually you would not process such a binary file with an
analyzer. Instead, you would parse the binary file into the Fields you
care about and then add them to your Document.

The analyzer is separate from that "parsing": it's the way you specify
text preprocessing, such as lowercasing and stemming, at both index and
query time.
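
To make that separation concrete, here is a minimal sketch in plain Java.
It deliberately uses no Lucene classes, so it is an illustration of the
idea rather than Lucene's API. The record layout (a 4-byte id followed by
a length-prefixed UTF title) and the names parseRecord, analyze and
sampleRecord are assumptions made up for this example; a real bespoke
format would need its own parser.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ParseThenAnalyze {

    // Step 1, parsing: turn the binary record into named text fields.
    // Hypothetical layout: int id, then a string written with writeUTF.
    static Map<String, String> parseRecord(byte[] raw) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            Map<String, String> fields = new LinkedHashMap<>();
            fields.put("id", Integer.toString(in.readInt()));
            fields.put("title", in.readUTF());
            return fields;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Step 2, analysis: text preprocessing applied to one field's text.
    // Here: lowercase, then split on anything that is not a letter or digit.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Builds a sample record in the hypothetical layout, for demonstration only.
    static byte[] sampleRecord(int id, String title) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(id);
            out.writeUTF(title);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, String> fields = parseRecord(sampleRecord(7, "Custom Analyzer Strategy"));
        System.out.println(fields);                       // {id=7, title=Custom Analyzer Strategy}
        System.out.println(analyze(fields.get("title"))); // [custom, analyzer, strategy]
    }
}
```

In a real application, the map produced by the parsing step would become
the Fields of a Document, and the analysis step would be whatever
analyzer is configured for each field.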

>
> * Are there tutorials on analyzer development, or (ideally) an example
> custom simple analyzer?

Start with http://lucene.apache.org/core/3_6_1/api/core/org/apache/lucene/analysis/package-summary.html#package_description

>
> * Is it possible to send the output of one analyzer to another, and if so,
> is it possible to have that chain defined in the configuration of Lucene (or
> Solr...), or would it need to be hard-coded?

You can configure your analysis chain declaratively in Solr in a
configuration file.
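
As a rough illustration of what "declarative" means here, the sketch
below builds a filter chain from a configuration string instead of
hard-coding the order of steps. It is plain Java and is not Solr's
configuration format or Lucene's API; the filter names "lowercase" and
"strip_plural", the comma-separated syntax, and the class ConfiguredChain
are all assumptions invented for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

public class ConfiguredChain {

    // Registry of available token filters, keyed by the name used in configuration.
    static final Map<String, UnaryOperator<String>> FILTERS = new LinkedHashMap<>();
    static {
        FILTERS.put("lowercase", t -> t.toLowerCase(Locale.ROOT));
        // Toy stand-in for a stemmer: drops one trailing 's'. Not a real stemming algorithm.
        FILTERS.put("strip_plural",
                t -> t.length() > 3 && t.endsWith("s") ? t.substring(0, t.length() - 1) : t);
    }

    // Turns a comma-separated list of filter names into an ordered chain.
    static List<UnaryOperator<String>> buildChain(String config) {
        List<UnaryOperator<String>> chain = new ArrayList<>();
        for (String name : config.split(",")) {
            UnaryOperator<String> filter = FILTERS.get(name.trim());
            if (filter == null) {
                throw new IllegalArgumentException("Unknown filter: " + name.trim());
            }
            chain.add(filter);
        }
        return chain;
    }

    // Tokenizes on whitespace, then passes each token through every filter in order.
    static List<String> analyze(String text, List<UnaryOperator<String>> chain) {
        List<String> out = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            for (UnaryOperator<String> filter : chain) {
                token = filter.apply(token);
            }
            out.add(token);
        }
        return out;
    }

    public static void main(String[] args) {
        // The chain is described by data, so it can change without recompiling.
        System.out.println(analyze("Custom Analyzers", buildChain("lowercase, strip_plural")));
        // prints [custom, analyzer]
    }
}
```

Note that the chain is one tokenizer followed by per-token filters,
rather than one complete analyzer feeding another.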

--
lucidimagination.com


leegee at gmail

Aug 3, 2012, 6:37 AM

Post #3 of 4
Re: Custom Analyzer Strategy? [In reply to]

On 03/08/2012 14:56, Robert Muir wrote:
> On Wed, Aug 1, 2012 at 7:16 AM, Lee Goddard <leegee [at] gmail> wrote:
>> New to Lucene development, though I have been an indexing user for some
>> years, I find a need to develop an analyzer that reads a bespoke-format
>> (binary) file. I was wondering:
> Hello: usually you would not process such a binary file with an
> analyzer. Instead, you would parse the binary file into the Fields you
> care about and then add them to your Document.
>
> The analyzer is separate from that "parsing": it's the way you specify
> text preprocessing, such as lowercasing and stemming, at both index and
> query time.
>
>> * Are there tutorials on analyzer development, or (ideally) an example
>> custom simple analyzer?
> Start with http://lucene.apache.org/core/3_6_1/api/core/org/apache/lucene/analysis/package-summary.html#package_description
>
>> * Is it possible to send the output of one analyzer to another, and if so,
>> is it possible to have that chain defined in the configuration of Lucene (or
>> Solr...), or would it need to be hard-coded?
> You can configure your analysis chain declaratively in Solr in a
> configuration file.

Thanks very much, Robert. And now I see the package summary JavaDoc you
pointed to, I feel quite silly.

Cheers
Lee


rcmuir at gmail

Aug 3, 2012, 7:29 AM

Post #4 of 4
Re: Custom Analyzer Strategy? [In reply to]

On Fri, Aug 3, 2012 at 9:37 AM, Lee Goddard <leegee [at] gmail> wrote:
>
>
> Thanks very much, Robert. And now I see the package summary JavaDoc you
> pointed to, I feel quite silly.
>

Don't feel bad, these documents are somewhat buried. For the next
release, we are trying to make them more prominent in the
documentation by listing them under Getting Started/Reference
Documents sections: http://lucene.apache.org/core/4_0_0-ALPHA/

--
lucidimagination.com
