
Mailing List Archive: Lucene: Java-User

Adding segments to an optimized index

 

 



marc.sturlese at gmail

Oct 28, 2009, 4:36 AM

Adding segments to an optimized index

I am running some tests with optimize and adding segments, and I am wondering
whether what I am doing can lead to document inconsistency.
I have two folders with one index each. One holds a non-optimized index,
index1, with 1 million docs and mergeFactor=10. The other, index2, holds the
same index optimized into a compound file. I add and delete some documents in
the non-optimized index1, and a few segments disappear and some new ones are
created. I then copy the newly created files into the optimized index2 and
optimize it again.
I get no errors doing that, but will the documents be the same in index1 and
index2? I am asking because when I added some docs and deleted others in
index1, some segments disappeared, and index2 is supposed to still contain
those segments, merged with the others by the optimize. Or doesn't it work
this way?

What I am trying to explain is:

index1:
seg1,seg2,seg3,seg4,seg5
index2: (index1 optimized with compound file)
seg8

Adding and deleting docs in index1 gives:
seg1,seg2,seg3,seg6 (seg4 and seg5 have disappeared and seg6 has been
created)
Now in index2 I do:
seg8+seg6+optimize=seg9 (but seg8 is supposed to still contain seg4 and seg5)

The question is: will index1 (seg1,seg2,seg3,seg6) and index2 (seg9) contain
the same docs?

Thanks in advance, and please let me know if my explanation wasn't clear.
--
View this message in context: http://www.nabble.com/Adding-segments-to-an-optimized-index-tp26093125p26093125.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe [at] lucene
For additional commands, e-mail: java-user-help [at] lucene


torindan at gmail

Oct 28, 2009, 5:50 AM

Re: Adding segments to an optimized index

Lucene has no notion of a "unique" document.

Documents may be unique from your application's point of view (for
example, they carry an ID that is unique), but from Lucene's point of
view it is perfectly fine to have duplicate documents.

So the documents you "deleted" still show up in the combined index: they
come from your second index, where they were never deleted.

Even more: if you search the combined index you will see duplicate
documents, namely the ones that came from the first index and were not
deleted.
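
To make this concrete, here is a toy model in plain Java (not the Lucene
API). It assumes each segment is just a list of application-level IDs and
that a merge copies the live documents of its inputs into the output. The
IDs a to f and the class name SegmentMergeToy are made up for illustration:
e stands for a document deleted from index1 and f for one added to it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: a "segment" here is a list of application-level IDs,
// not a real Lucene segment, and merge() is not a Lucene call.
final class SegmentMergeToy {

    // Hypothetical contents of the original segments, one doc each.
    static final List<String> SEG1 = Arrays.asList("a");
    static final List<String> SEG2 = Arrays.asList("b");
    static final List<String> SEG3 = Arrays.asList("c");
    static final List<String> SEG4 = Arrays.asList("d");
    static final List<String> SEG5 = Arrays.asList("e");

    // seg6 replaces seg4 and seg5 in index1 after "e" was deleted
    // and "f" was added.
    static final List<String> SEG6 = Arrays.asList("d", "f");

    // A merge copies every live document of every input segment into
    // the output. Nothing is compared or de-duplicated.
    static List<String> merge(List<List<String>> segments) {
        List<String> merged = new ArrayList<>();
        for (List<String> segment : segments) {
            merged.addAll(segment);
        }
        return merged;
    }

    // index1 after the changes: seg1, seg2, seg3, seg6.
    static List<String> index1Docs() {
        return merge(Arrays.asList(SEG1, SEG2, SEG3, SEG6));
    }

    // index2: seg8 (index1 optimized before the changes) plus seg6,
    // optimized again into seg9.
    static List<String> index2Docs() {
        List<String> seg8 = merge(Arrays.asList(SEG1, SEG2, SEG3, SEG4, SEG5));
        return merge(Arrays.asList(seg8, SEG6));
    }

    public static void main(String[] args) {
        System.out.println("index1: " + index1Docs());
        System.out.println("index2: " + index2Docs());
    }
}
```

Under these assumptions index1 holds [a, b, c, d, f] while index2 holds
[a, b, c, d, e, d, f]: e is back because seg8 still contains it, and d
appears twice.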

That's because Lucene simply adds to the combined index every document
that isn't marked as deleted.
Remember that a document is (kind of) opaque to Lucene, which has no
logic for handling such situations and doesn't need any; they should be
handled by your application.
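
One way an application could handle it, sketched in plain Java rather than
with Lucene itself: key every document by its own unique ID, apply the
deletes to the target first, and replace any older copy when a document
with the same ID arrives. In Lucene terms this corresponds roughly to
IndexWriter.deleteDocuments(Term) and IndexWriter.updateDocument(Term,
Document) on an ID field. The class IdKeyedCombine, its method names and
the sample IDs are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of application-level uniqueness: documents are keyed by an ID
// that the application, not Lucene, treats as unique.
final class IdKeyedCombine {

    // Returns a new map holding target with the deletes applied and the
    // incoming documents upserted (older copy removed, then new one added).
    static Map<String, String> combine(Map<String, String> target,
                                       List<String> deletedIds,
                                       Map<String, String> incoming) {
        Map<String, String> result = new LinkedHashMap<>(target);
        for (String id : deletedIds) {
            result.remove(id);                  // propagate the delete
        }
        for (Map.Entry<String, String> doc : incoming.entrySet()) {
            result.remove(doc.getKey());        // drop any older copy
            result.put(doc.getKey(), doc.getValue());
        }
        return result;
    }

    // Hypothetical data: index2 holds a..e, "e" was deleted in index1,
    // and the new segment from index1 carries "d" and "f".
    static Map<String, String> demo() {
        Map<String, String> index2 = new LinkedHashMap<>();
        for (String id : Arrays.asList("a", "b", "c", "d", "e")) {
            index2.put(id, "body of " + id);
        }
        Map<String, String> fromIndex1 = new LinkedHashMap<>();
        fromIndex1.put("d", "body of d");
        fromIndex1.put("f", "body of f");
        return combine(index2, Arrays.asList("e"), fromIndex1);
    }

    public static void main(String[] args) {
        System.out.println(demo().keySet());
    }
}
```

With this sample data the result holds the IDs [a, b, c, d, f], with no
duplicate of d and no leftover e.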

