
jason.rutherglen at gmail
Feb 4, 2010, 3:00 PM
Re: Analyzer for stripping non alpha-numeric characters?
Answering my own question... PatternReplaceFilter doesn't output multiple tokens... Which means messing with capture state...

On Thu, Feb 4, 2010 at 2:16 PM, Jason Rutherglen <jason.rutherglen [at] gmail> wrote:
> Transferred partially to solr-user...
>
> Steven, thanks for the reply!
>
> I wonder if PatternReplaceFilter can output multiple tokens? I'd like
> to progressively strip the non-alphanums, for example output:
>
> apple!&*
> apple!&
> apple!
> apple
>
> On Thu, Feb 4, 2010 at 12:18 PM, Steven A Rowe <sarowe [at] syr> wrote:
>> Hi Jason,
>>
>> Solr's PatternReplaceFilter(ts, "\\P{Alnum}+$", "", false) should work, chained after an appropriate tokenizer.
>>
>> Steve
>>
>> On 02/04/2010 at 12:18 PM, Jason Rutherglen wrote:
>>> Is there an analyzer that easily strips non alpha-numeric from the end
>>> of a token?
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
>>> For additional commands, e-mail: java-user-help [at] lucene
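For reference, the progressive stripping asked about in this thread can be sketched in plain Java, independent of Lucene and Solr. This is only an illustration of the string logic, not a TokenFilter: the class name ProgressiveStrip and the method variants are made up for this example. It also uses Character.isLetterOrDigit, which accepts any Unicode letter or digit and is therefore broader than the ASCII-only \p{Alnum} class in the regex suggested above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene or Solr.
final class ProgressiveStrip {

    // Returns the token followed by each shorter form obtained by
    // removing one trailing non-alphanumeric character at a time.
    static List<String> variants(String token) {
        List<String> out = new ArrayList<>();
        out.add(token); // always emit the original token first
        int end = token.length();
        while (end > 0) {
            int cp = token.codePointBefore(end);
            if (Character.isLetterOrDigit(cp)) {
                break; // reached the alphanumeric part, stop stripping
            }
            end -= Character.charCount(cp); // drop one trailing code point
            if (end > 0) {
                out.add(token.substring(0, end)); // skip the empty string
            }
        }
        return out;
    }
}
```

Calling ProgressiveStrip.variants("apple!&*") yields apple!&*, apple!&, apple!, apple, in that order. Inside an actual Lucene TokenFilter, the extra variants would presumably be emitted as additional tokens stacked at the same position, which is where saving and restoring attribute state (the "capture state" mentioned above) comes in.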