
Mailing List Archive: Lucene: Java-User

Index optimization ...

 

 



dragon-fly999 at hotmail

Jul 28, 2008, 11:00 AM

Post #1 of 14 (351 views)
Permalink
Index optimization ...

I'd like to shorten the time it takes to optimize my index and am willing to sacrifice search and indexing performance. Which parameters (e.g. merge factor) should I change? Thank you.



jgriffin at thebluezone

Jul 28, 2008, 7:26 PM

Post #2 of 14 (332 views)
Permalink
RE: Index optimization ... [In reply to]

Use IndexWriter.setRAMBufferSizeMB(double mb) and you won't have to
sacrifice anything. It defaults to 16.0 MB so depending on the size of your
index you may want to make it larger. Do some testing at various values to
see where the sweet spot is.

John G.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe[at]lucene.apache.org
For additional commands, e-mail: java-user-help[at]lucene.apache.org


asbjorn at fellinghaug

Jul 28, 2008, 11:32 PM

Post #3 of 14 (327 views)
Permalink
Re: Index optimization ... [In reply to]

John Griffin:
> Use IndexWriter.setRAMBufferSizeMB(double mb) and you won't have to
> sacrifice anything. It defaults to 16.0 MB so depending on the size of your
> index you may want to make it larger. Do some testing at various values to
> see where the sweet spot is.
>

Also, have a look at
http://wiki.apache.org/lucene-java/ImproveIndexingSpeed, which offers
a range of advice on improving indexing speed.

--
Asbjørn A. Fellinghaug
asbjorn[at]fellinghaug.com



shalinmangar at gmail

Jul 29, 2008, 2:04 AM

Post #4 of 14 (328 views)
Permalink
Re: Index optimization ... [In reply to]

Try IndexWriter.optimize(int maxNumSegments)
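
To illustrate why capping the segment count can be cheaper than a full optimize, here is a minimal Java sketch. It is a toy cost model, not Lucene code: the class PartialOptimizeSketch, the method rewriteCost, and the segment sizes are all hypothetical, and it assumes the smallest segments are the ones merged, whereas Lucene's merge policy makes its own choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy cost model for a partial optimize; hypothetical, not Lucene's implementation. */
final class PartialOptimizeSketch {

    /**
     * Units of data rewritten when the smallest segments are merged into one
     * until at most maxNumSegments segments remain (assumed strategy).
     */
    static long rewriteCost(List<Long> segmentSizes, int maxNumSegments) {
        if (maxNumSegments < 1) {
            throw new IllegalArgumentException("maxNumSegments must be >= 1");
        }
        List<Long> sorted = new ArrayList<>(segmentSizes);
        Collections.sort(sorted); // ascending, smallest first
        int excess = sorted.size() - maxNumSegments;
        if (excess <= 0) {
            return 0; // already at or below the target, nothing to rewrite
        }
        // Merging the (excess + 1) smallest segments into one leaves exactly maxNumSegments.
        long cost = 0;
        for (int i = 0; i <= excess; i++) {
            cost += sorted.get(i);
        }
        return cost;
    }

    public static void main(String[] args) {
        // Made-up segment sizes in arbitrary units.
        List<Long> sizes = Arrays.asList(1000L, 100L, 100L, 10L, 10L, 1L, 1L);
        System.out.println("down to 3 segments: " + rewriteCost(sizes, 3));
        System.out.println("down to 1 segment:  " + rewriteCost(sizes, 1));
    }
}
```

With these made-up sizes, merging down to three segments rewrites 122 units in this model, while merging down to a single segment rewrites all 1222. Whether a real index behaves similarly would need to be measured.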

--
Regards,
Shalin Shekhar Mangar.


dragon-fly999 at hotmail

Jul 30, 2008, 6:46 AM

Post #5 of 14 (314 views)
Permalink
RE: Index optimization ... [In reply to]

Perhaps I didn't explain myself clearly so please let me try it again. I'm happy with the search/indexing performance. However, my index gets fully optimized every 4 hours and the time it takes to fully optimize the index is longer than I like. Is there anything that I can do to speed up the optimization? I don't fully understand the different parameters (e.g. merge factor). If I decrease the merge factor, would it make the indexing slower (which I'm OK with) but the optimization faster? Thank you.



ian.lea at gmail

Jul 30, 2008, 6:54 AM

Post #6 of 14 (314 views)
Permalink
Re: Index optimization ... [In reply to]

Why do you run an optimize every 4 hours?


--
Ian.




dragon-fly999 at hotmail

Jul 30, 2008, 7:00 AM

Post #7 of 14 (313 views)
Permalink
RE: Index optimization ... [In reply to]

I have two copies (active/inactive) of the index. Searches are executed against the "active" index and new documents get added to the "inactive" copy. The two indexes get swapped every 4 hours (so that new documents are visible to the end user). Optimization is done before the inactive copy is made active.
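
The active/inactive arrangement described above might be sketched roughly as follows in Java. This is only an illustration under assumptions: the class IndexSwapSketch, its method names, and the use of directory names as stand-ins for open indexes are hypothetical, and the sketch does not show the optimize step or how the newly inactive copy catches up on documents it missed.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder for two index locations; not part of Lucene. */
final class IndexSwapSketch {
    // Location that searches read from; replaced atomically so readers
    // always see one complete location, never a half-finished switch.
    private final AtomicReference<String> active;
    // Location that new documents are written to; guarded by "this".
    private String inactive;

    IndexSwapSketch(String activeDir, String inactiveDir) {
        this.active = new AtomicReference<>(activeDir);
        this.inactive = inactiveDir;
    }

    /** Where searches should currently go. */
    String searchTarget() {
        return active.get();
    }

    /** Where new documents should currently be added. */
    synchronized String writeTarget() {
        return inactive;
    }

    /** Exchange the two roles, e.g. after the inactive copy has been optimized. */
    synchronized void swap() {
        // getAndSet publishes the old inactive copy and returns the old active one.
        inactive = active.getAndSet(inactive);
    }

    public static void main(String[] args) {
        IndexSwapSketch indexes = new IndexSwapSketch("indexA", "indexB");
        indexes.swap();
        System.out.println("searching " + indexes.searchTarget()
                + ", writing " + indexes.writeTarget());
    }
}
```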



ian.lea at gmail

Jul 30, 2008, 7:03 AM

Post #8 of 14 (313 views)
Permalink
Re: Index optimization ... [In reply to]

OK, but why do you need to optimize before every swap? Have you tried
with less frequent optimizes?

--
Ian.




dragon-fly999 at hotmail

Jul 30, 2008, 7:15 AM

Post #9 of 14 (313 views)
Permalink
RE: Index optimization ... [In reply to]

My understanding is that an optimized index gives the best search performance. I can change my configuration to optimize the index every 24 hours. However, I still would like to know if there is a way to speed up optimization by tweaking parameters like the merge factor.



helloanand at gmail

Jul 30, 2008, 7:23 AM

Post #10 of 14 (313 views)
Permalink
Re: Index optimization ... [In reply to]

As an aside, I would like to understand how you get away without adding documents to the active index. As far as I understand, you only add docs to the inactive index and then swap it with the active index (so the active one becomes inactive and vice versa). So do you bring the "new" inactive index up to speed to compensate for the documents it missed while the "old" inactive index got updated?

Just curious,

Anand


ian.lea at gmail

Jul 30, 2008, 7:33 AM

Post #11 of 14 (313 views)
Permalink
Re: Index optimization ... [In reply to]

If I was you I'd certainly try cutting the optimize frequency. An
optimized index should indeed give the best search performance, but in
my experience it's generally plenty fast enough anyway, and I think
you said earlier that you were prepared to sacrifice a bit of search
or indexing speed.

Sorry, can't help with the questions on speeding it up except to
advise that you play with it. Different indexes on different hardware
with different combinations of CPU/memory/disk can give different
results.

--
Ian.




gsingers at apache

Jul 30, 2008, 8:12 AM

Post #12 of 14 (314 views)
Permalink
Re: Index optimization ... [In reply to]

What version of Lucene are you using? What is your current
mergeFactor? Lowering this (minimum is 2) will result in an index
that is closer to "optimal" since an optimized index is just one that
has all the segments merged into a single segment and a mergeFactor of
2 just means there are only ever 2 segments (the docs could make this
clearer). The tradeoff is that you may need to do merges more often.
If you are using Lucene 2.3.x these merges can now take place in the
background, so this may not be as big a penalty as it once was.
Still, the fact is, optimize has to go through, in the end, and merge
your segments into one big segment. This is a lengthy undertaking on
a large index.
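
The trade-off described here can be explored with a small simulation. The Java sketch below is a simplified model, not Lucene's merge policy: every flush creates a segment of size 1, any run of mergeFactor equally sized segments at the tail is merged at once, and the final optimize is counted as a single rewrite of everything. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of segment merging; hypothetical, not Lucene's implementation. */
final class MergeFactorSketch {

    /** Outcome of a simulated indexing run. */
    static final class Result {
        final int segments;        // segments left before the final optimize
        final long indexingCopies; // units rewritten by merges during indexing
        final long optimizeCopies; // units rewritten by the final full optimize

        Result(int segments, long indexingCopies, long optimizeCopies) {
            this.segments = segments;
            this.indexingCopies = indexingCopies;
            this.optimizeCopies = optimizeCopies;
        }
    }

    static Result simulate(int flushes, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        List<Long> segments = new ArrayList<>(); // segment sizes, oldest first
        long indexingCopies = 0;
        for (int i = 0; i < flushes; i++) {
            segments.add(1L); // each flush adds one unit-sized segment
            // Cascade: keep merging while the newest mergeFactor segments match in size.
            while (segments.size() >= mergeFactor && tailIsUniform(segments, mergeFactor)) {
                long merged = 0;
                for (int k = 0; k < mergeFactor; k++) {
                    merged += segments.remove(segments.size() - 1);
                }
                segments.add(merged);
                indexingCopies += merged; // a merge rewrites everything it combines
            }
        }
        long optimizeCopies = 0;
        if (segments.size() > 1) { // a single segment is already optimized
            for (long size : segments) {
                optimizeCopies += size;
            }
        }
        return new Result(segments.size(), indexingCopies, optimizeCopies);
    }

    /** True if the last n segments all have the same size. */
    private static boolean tailIsUniform(List<Long> sizes, int n) {
        long last = sizes.get(sizes.size() - 1);
        for (int i = sizes.size() - n; i < sizes.size(); i++) {
            if (sizes.get(i) != last) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (int mergeFactor : new int[] {2, 10}) {
            Result r = simulate(99, mergeFactor);
            System.out.println("mergeFactor=" + mergeFactor
                    + " segments=" + r.segments
                    + " indexingCopies=" + r.indexingCopies
                    + " optimizeCopies=" + r.optimizeCopies);
        }
    }
}
```

For 99 flushes, this model leaves 18 segments with a merge factor of 10 and 4 with a merge factor of 2, but the lower setting rewrites 546 units during indexing instead of 90, and the final optimize rewrites all 99 units either way. The segment counts are specific to this toy model; real numbers depend on the Lucene version and the index.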

I'm not sure, however, if any of this will reduce your overall time.
I suppose it depends somewhat on your update rate. It is possible
that the slower indexing could offset any gains had by a faster
optimize. Another option may be to keep the mergeFactor higher, but
then every so often do partial optimizes to keep your index closer to
optimal such that a final optimize may be sped up.

Another question is have you measured your query performance on an
unoptimized index? Is it acceptable? Are you only adding new
documents or are you also deleting docs in that 4 hour time period?

Bottom line, though, is you need to test out the various knobs
(mergeFactor, RAMBufferSizeMB, etc.) and see. You may find the
contrib/benchmark program helpful for running experiments, although it
isn't a substitute for your actual data.

-Grant



--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ










dragon-fly999 at hotmail

Jul 30, 2008, 2:45 PM

Post #13 of 14 (293 views)
Permalink
RE: Index optimization ... [In reply to]

I'll run some tests. Thank you.



hossman_lucene at fucit

Jul 30, 2008, 3:44 PM

Post #14 of 14 (294 views)
Permalink
RE: Index optimization ... [In reply to]

: My understanding is that an optimized index gives the best search

there is an inherent inconsistency in your question -- you say you
optimize your index before using it because you heard that makes searches
faster, but in your original question you said...

> I'd like to shorten the time it takes to optimize my index and am
> willing to sacrifice search and indexing performance. Which parameters

...if you are willing to sacrifice search performance, why don't you just
stop optimizing altogether?

there's very little point to optimizing unless you don't plan on
updating the index for a while afterwards, and are trying to squeeze
every little bit of speed out of each search. if the time it takes to do
the optimize is prohibitive, and you don't care if searches get a *little*
slower (and i really do suspect the difference will be fairly small, some
really great improvements were made to searching with multiple segments a
few point revs back) then you don't need it.


-Hoss

