
Mailing List Archive: Lucene: Java-User

SimpleFragmenter docs

 

 



gsingers at apache

Jan 14, 2008, 2:05 PM

SimpleFragmenter docs

I was looking at the SimpleFragmenter in contrib/Highlighter and was
wondering about the fragmentSize value. It says the value is the
number of bytes, but looking at the code it's using the String offset,
right? So it should be the number of characters, right?
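To illustrate the distinction being asked about, here is a minimal sketch in plain Java (no Lucene classes involved; the class and method names are made up for this example). It compares a String's length, which is what String offsets are measured against, with the size of the same text encoded as UTF-8 bytes. Note that String.length() counts UTF-16 code units, so "characters" here means Java chars rather than Unicode code points.

```java
import java.nio.charset.StandardCharsets;

public class CharsVersusBytes {
    // Number of Java chars (UTF-16 code units); String offsets index into this.
    static int charCount(String s) {
        return s.length();
    }

    // Number of bytes once the same text is encoded as UTF-8.
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String text = "r\u00e9sum\u00e9"; // "resume" with two accented e's
        System.out.println(charCount(text));     // prints 6
        System.out.println(utf8ByteCount(text)); // prints 8
    }
}
```

For pure ASCII text the two numbers agree, which may be why the two units are easy to mix up in documentation.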

I can fix it, just wanted to confirm my understanding.

Thanks,
Grant

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


markrmiller at gmail

Jan 14, 2008, 2:33 PM

Re: SimpleFragmenter docs

I think you're right, and that's not the only place: the whole handling of
maxDocBytesToAnalyze in the main Highlighter class shares this issue. I
guess the idea is an ASCII holdover, where one byte equals one char? I am sure
Mark H can clear it up, but don't forget the maxDocBytesToAnalyze part
as well when it's corrected.
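As a rough sketch of why the name matters, the snippet below applies a limit to String offsets, the way this thread describes the Highlighter code as doing. It is not the Highlighter's actual implementation; analyzedPrefix and its maxOffset parameter are hypothetical stand-ins written in plain Java.

```java
import java.nio.charset.StandardCharsets;

public class AnalysisLimitSketch {
    // Hypothetical stand-in: keep only the text below a String-offset limit.
    static String analyzedPrefix(String text, int maxOffset) {
        return text.substring(0, Math.min(maxOffset, text.length()));
    }

    public static void main(String[] args) {
        // Six accented e's; each is one Java char but two bytes in UTF-8.
        String text = "\u00e9\u00e9\u00e9\u00e9\u00e9\u00e9";
        String prefix = analyzedPrefix(text, 4);
        System.out.println(prefix.length());                                // prints 4
        System.out.println(prefix.getBytes(StandardCharsets.UTF_8).length); // prints 8
    }
}
```

A limit of 4 keeps 4 chars, which is 8 bytes in UTF-8 here, so a limit compared against offsets behaves as a character count even if its name says bytes.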

- Mark




mike.klaas at gmail

Jan 23, 2008, 11:37 AM

Re: SimpleFragmenter docs

Indeed--this is why the associated parameter is called
maxAnalyzedChars in Solr.

-Mike



