
markharw00d at yahoo
Jan 4, 2008, 4:06 PM
Post #3 of 3
(1421 views)
Permalink
|
> > Let's say for the query algorithm, the word algorith is also a match, > how do the highlighter know that it should also highlight > occurrences of the word algorith? (I am not sure it does this anyway) The highlighter knows to highlight stemmed words because both the query terms and the document content are fed through (hopefully) the same analyzer so that "algorithmic", "algorithm", "algorithms" etc become stemmed to the same root form in both query and doc content. The tokens produced by analyzers include the byte offsets of the *original* full word, not just the stemmed form, so the highlighter knows the full extent of what to highlight in text. Cheers Mark --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscribe [at] lucene For additional commands, e-mail: java-user-help [at] lucene
|