Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Lucene: Java-User

Problems with fragments size on highlight.

 

 

Lucene java-user RSS feed   Index | Next | Previous | View Threaded


felipe at jusbrasil

Nov 14, 2009, 5:21 PM

Post #1 of 2 (605 views)
Permalink
Problems with fragments size on highlight.

Hi, i'm having some problems with the size of the fragmentes when i'm doing
the highlight. I pass on the constructor of the SimpleSpanFragmenter a size
of 80, but i'm having some fragments of 500.
But it only happen when i do searches with 3 or more words.
What i notice is that he tries to put the highlights in the same fragments,
but makes it too big.

code:
highlighter.setTextFragmenter(new
SimpleSpanFragmenter(scorer,FRAGMENT_SIZE));//FRAGMENT_SIZE=80

fieldName = "Body";
text =
CompressionTools.decompressString(result.getField(fieldName).getBinaryValue());

tokenStream = analyzer.tokenStream(fieldName, new
StringReader(text));

markedResult = null;

try{
markedResult = highlighter.getBestFragments(tokenStream, text,
NUM_FRAGMENTS, TOKEN_DELIMITER).trim();// NUM_FRAGMENTS=3 and
TOKEN_DELIMITER="..."
}
catch (Exception ex){
logger.error(ex.toString());
}

i'm using brazilian analyser with lucene core 2.9.1 and highlighter and
memory 2.9.0 .

thanks.

--
Felipe Lobo
www.jusbrasil.com.br


markharw00d at yahoo

Nov 18, 2009, 10:31 AM

Post #2 of 2 (530 views)
Permalink
Re: Problems with fragments size on highlight. [In reply to]

It could be the "merge contiguous fragments" feature that attempts to
do exactly this to improve readability

It's an option you can turn off.

On 15 Nov 2009, at 01:21, Felipe Lobo wrote:

> Hi, i'm having some problems with the size of the fragmentes when
> i'm doing
> the highlight. I pass on the constructor of the SimpleSpanFragmenter
> a size
> of 80, but i'm having some fragments of 500.
> But it only happen when i do searches with 3 or more words.
> What i notice is that he tries to put the highlights in the same
> fragments,
> but makes it too big.
>
> code:
> highlighter.setTextFragmenter(new
> SimpleSpanFragmenter(scorer,FRAGMENT_SIZE));//FRAGMENT_SIZE=80
>
> fieldName = "Body";
> text =
> CompressionTools.decompressString(result.getField
> (fieldName).getBinaryValue());
>
> tokenStream = analyzer.tokenStream(fieldName, new
> StringReader(text));
>
> markedResult = null;
>
> try{
> markedResult = highlighter.getBestFragments(tokenStream,
> text,
> NUM_FRAGMENTS, TOKEN_DELIMITER).trim();// NUM_FRAGMENTS=3 and
> TOKEN_DELIMITER="..."
> }
> catch (Exception ex){
> logger.error(ex.toString());
> }
>
> i'm using brazilian analyser with lucene core 2.9.1 and highlighter
> and
> memory 2.9.0 .
>
> thanks.
>
> --
> Felipe Lobo
> www.jusbrasil.com.br


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene

Lucene java-user RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.