Mailing List Archive: Lucene: Java-Dev

code improvement / easier optimization

 

 


rengels at ix

Nov 2, 2007, 11:40 AM


The Lucene 2.2 code for managing buffers is somewhat "ugly": the
buffer size parameter has to be passed around.

I changed this in my branch to use the BufferSizes class below.

I changed the BufferedIndexInput and BufferedIndexOutput classes like
this:

class BufferedIndexInput {
    private int bufferSize = BufferSizes.getReadBufferSize();
    ....
}
class BufferedIndexOutput {
    private int bufferSize = BufferSizes.getWriteBufferSize();
    ....
}

then in IndexWriter I surround the code that creates the
SegmentReaders with:

try {
    BufferSizes.useMergeBuffers();
    ... create segment readers ...
} finally {
    BufferSizes.useNormalBuffers();
}
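
A minimal, self-contained sketch of why the try/finally scope matters: the size is sampled once, when the input object is constructed, so only objects created inside the block get the merge size. MergeScopeSketch, SketchInput, and openForMerge are stand-ins invented for illustration, not Lucene classes, and the sketch uses generics, which the code in this post does not.

```java
public class MergeScopeSketch {
    // Stand-in for the per-thread flag held by BufferSizes (hypothetical copy).
    static final ThreadLocal<Boolean> MERGE = new ThreadLocal<>();

    static int getReadBufferSize() {
        return Boolean.TRUE.equals(MERGE.get()) ? 16384 * 2 : 1024;
    }

    // Stand-in for BufferedIndexInput: reads the size once, at construction.
    static final class SketchInput {
        final int bufferSize = getReadBufferSize();
    }

    // Creates an input the way the merge path would: flag on, construct, flag off.
    static SketchInput openForMerge() {
        MERGE.set(Boolean.TRUE);
        try {
            return new SketchInput();
        } finally {
            MERGE.set(Boolean.FALSE);
        }
    }

    public static void main(String[] args) {
        SketchInput merge = openForMerge();
        SketchInput normal = new SketchInput(); // created outside the merge scope
        System.out.println(merge.bufferSize + " " + normal.bufferSize); // prints "32768 1024"
    }
}
```

The input keeps its size after the flag is reset, so the scope only has to cover construction, not the later reads.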

I think this is much cleaner. It also allows for other optimizations
like:

- the query engine detects a phrase query, so it increases the buffers
  prior to reading the terms
- a query result has a lot of matches, so it increases the buffer size
  when reading the documents

Seems a lot easier to manage. It also allows playing with various
buffer sizes very easily.
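
One way the query-time ideas above could be generalized is a scoped, per-thread size override instead of a single merge flag. This is a hypothetical extension, not the code from my branch: ScopedBufferSizeSketch, withReadBufferSize, and the 8192 value are all assumptions made up for the sketch.

```java
import java.util.function.Supplier;

public class ScopedBufferSizeSketch {
    static final int DEFAULT_READ_SIZE = 1024;

    // Per-thread override; null means "use the default size".
    private static final ThreadLocal<Integer> OVERRIDE = new ThreadLocal<>();

    static int getReadBufferSize() {
        Integer size = OVERRIDE.get();
        return size != null ? size : DEFAULT_READ_SIZE;
    }

    // Runs body with the given read buffer size, then restores the previous
    // setting even if body throws, so nested scopes behave as expected.
    static <T> T withReadBufferSize(int size, Supplier<T> body) {
        Integer previous = OVERRIDE.get();
        OVERRIDE.set(size);
        try {
            return body.get();
        } finally {
            if (previous == null) {
                OVERRIDE.remove();
            } else {
                OVERRIDE.set(previous);
            }
        }
    }

    public static void main(String[] args) {
        // 8192 is an arbitrary illustrative size, not a tuned value.
        Integer duringQuery = withReadBufferSize(8192, () -> getReadBufferSize());
        System.out.println(duringQuery + " " + getReadBufferSize()); // prints "8192 1024"
    }
}
```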

I have been able to get the optimize time down from 3.5 minutes to
1.5 minutes on the exact same index (using all of the recent
enhancements - much of the improvement is related to the larger
buffer sizes used in Lucene 2.2).


package org.apache.lucene.store;

public class BufferSizes {
    // Per-thread flag: TRUE while the current thread is merging segments.
    private static final ThreadLocal useMergeBuffers = new ThreadLocal();

    public static int getReadBufferSize() {
        return Boolean.TRUE.equals(useMergeBuffers.get()) ? 16384*2 : 1024;
    }

    public static int getWriteBufferSize() {
        return 16384*2;
    }

    /**
     * Cause the current thread to use buffers sized for segment
     * merging. Always use try/finally to reset the value.
     */
    public static void useMergeBuffers() {
        useMergeBuffers.set(Boolean.TRUE);
    }

    /**
     * Cause the current thread to use buffers sized for normal
     * index operations.
     */
    public static void useNormalBuffers() {
        useMergeBuffers.set(Boolean.FALSE);
    }
}
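
Because the flag lives in a ThreadLocal, it affects only the thread that set it. The sketch below shows that with a stand-in copy of the class: ThreadLocalFlagSketch and sizeSeenByOtherThread are illustrative names, not part of Lucene, and the sketch uses generics and a lambda for brevity.

```java
public class ThreadLocalFlagSketch {
    // Stand-in for the per-thread flag in BufferSizes (hypothetical copy).
    private static final ThreadLocal<Boolean> MERGE = new ThreadLocal<>();

    static int getReadBufferSize() {
        return Boolean.TRUE.equals(MERGE.get()) ? 16384 * 2 : 1024;
    }

    static void useMergeBuffers() { MERGE.set(Boolean.TRUE); }

    static void useNormalBuffers() { MERGE.set(Boolean.FALSE); }

    // Asks a freshly started thread which read buffer size it would use.
    static int sizeSeenByOtherThread() {
        final int[] seen = new int[1];
        Thread t = new Thread(() -> seen[0] = getReadBufferSize());
        t.start();
        try {
            t.join(); // join makes the write to seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        useMergeBuffers();
        try {
            // The calling thread sees the merge size, the other thread does not.
            System.out.println(getReadBufferSize() + " " + sizeSeenByOtherThread()); // prints "32768 1024"
        } finally {
            useNormalBuffers();
        }
    }
}
```

So any code that opens inputs on a different thread from the one that called useMergeBuffers() gets the normal size, unless that thread sets the flag itself.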


rengels at ix

Nov 2, 2007, 11:51 AM

Re: code improvement / easier optimization

To clarify, I changed the code in IndexWriter that creates the
SegmentReaders DURING MERGING to use...

On Nov 2, 2007, at 1:40 PM, robert engels wrote:

> The Lucene 2.2 code for managing buffers is somewhat "ugly" - the
> passing of the buffer size parameter around.

