
Mailing List Archive: Lucene: Java-User

Punctuation in Whitespace Analyzer

 

 



ihasmax at gmail

Jul 3, 2009, 9:05 AM


Hello,
I am having an issue with analyzers. Right now, when I do a search, I am
searching for a whole name. For example, if I have a document like this:

"This is the document text. John Smith is mentioned right here, he is in
the john. Smith is his last name. His full name is John Smith."

If I search this document for the phrase "John Smith" I want to get the hits
(I'm using highlighting) only for the full names without punctuation inside
of them. For example, I don't want "john. Smith" to be highlighted.
However, I DO want to get the hit for "John Smith." with a period or comma
allowed after the *last name* only.
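
To make the rule concrete, here is a minimal sketch in plain Java. It does not use Lucene at all; it only splits on whitespace, which is roughly what a whitespace analyzer does, and then applies the rule described above: the first name must match exactly, and the last name may carry one trailing period or comma. The class and method names (NameMatchSketch, findFullName) are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: it only illustrates the matching rule.
final class NameMatchSketch {

    // Returns the index of every whitespace-separated token at which the full name starts.
    static List<Integer> findFullName(String text, String first, String last) {
        String[] tokens = text.trim().split("\\s+");
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.length; i++) {
            // The first name must match exactly, so "john." is rejected.
            boolean firstOk = tokens[i].equals(first);
            // The last name may be followed by a single period or comma.
            String t = tokens[i + 1];
            boolean lastOk = t.equals(last) || t.equals(last + ".") || t.equals(last + ",");
            if (firstOk && lastOk) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String doc = "This is the document text. John Smith is mentioned right here, he is in "
                   + "the john. Smith is his last name. His full name is John Smith.";
        System.out.println(findFullName(doc, "John", "Smith")); // prints [5, 25]
    }
}
```

The comparison is case-sensitive, as it would be with a whitespace-only analyzer, so "john. Smith" is rejected both for its case and for its period.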

What is the best analyzer to use for this? Or is there a different way to
approach this? Right now my whitespace analyzer does not match the "John
Smith." case. Maybe I should just add a few more queries to handle
punctuation at the end of the last name?
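
The "few more queries" idea could look roughly like the sketch below. It only builds strings: one quoted phrase per allowed trailing character, joined with OR. The names (PhraseVariantsSketch, phraseVariants, toQueryString) are hypothetical, and the sketch assumes the resulting string is later parsed with the same whitespace analyzer that was used at index time, so that "Smith." stays a single token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: it only builds the query text.
final class PhraseVariantsSketch {

    // The bare name plus one variant per allowed trailing punctuation character.
    static List<String> phraseVariants(String fullName, char... trailing) {
        List<String> variants = new ArrayList<>();
        variants.add(fullName);
        for (char c : trailing) {
            variants.add(fullName + c);
        }
        return variants;
    }

    // Quotes each variant as a phrase and joins the phrases with OR.
    static String toQueryString(String fullName, char... trailing) {
        List<String> quoted = new ArrayList<>();
        for (String v : phraseVariants(fullName, trailing)) {
            quoted.add("\"" + v + "\"");
        }
        return String.join(" OR ", quoted);
    }

    public static void main(String[] args) {
        // prints "John Smith" OR "John Smith." OR "John Smith,"
        System.out.println(toQueryString("John Smith", '.', ','));
    }
}
```

Because the punctuation is only ever appended to the end of the phrase, "john. Smith" is never generated as a variant.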

Thanks,
Max

 
 

