lucene at mikemccandless
Mar 9, 2012, 3:06 AM
Post #5 of 7
Re: Re: BlockGroupingCollector, not always getting first document
On Thu, Mar 8, 2012 at 7:22 AM, Grzegorz Tańczyk
<grzegorz.tanczyk [at] polskastrefa> wrote:
> Thanks for reply, I can find first document from group using non grouping
OK, so the index seems ok.
> To be sure about this I deleted index and indexed only first 100 groups
> which gives around 2300 documents and I see the problem on at least half of
> groups. No problem in finding first documents normally.
> I noticed this problem first when I had indexed few thousands groups.
> When I index everything(15k groups, which means around 200k documents,
> commit every 500 groups) the problem is no more or at least I can't find any
> group with non first document in scoreDocs. I'm reindexing it since
> morning, I will reindex it once again to be sure about this one.
Weird that the full index doesn't show the issue but the partial index does.
> I'm not Lucene internals expert, but maybe this problem is somehow connected
> to segment merging?
Well, a simple way to test this is to set NoMergePolicy on the
> Some additional info:
> I'm using Lucene 3.5.0.
> public final static Sort SORT_ID = new Sort(new SortField("id_n",
> Adding field to document:
> doc.add(new NumericField("id_n", Store.NO, true).setIntValue(rs.getInt(1)));
> (I checked how it works with Store.YES, it didn't change anything.)
> I also call searcher.setDefaultFieldSortScoring(true, true) before grouping
If you don't call this, is the issue still there?
> Calling optimize() also didn't help(but anyway I wouldn't use this method
> even if it was the solution for this problem ;-) )
OK. Did calling optimize() change which docs were missing...?
> Index writer config has default settings.
Are you doing any deleteDocuments or updateDocument calls?
> For now I'm using workaround, but I'm looking forward to finding solution of
> this problem.
Wait, what's the workaround?
I noticed you pass maxDocsPerGroup=1; if you increase that (e.g. to 10),
does the behavior change...?
Is it possible to boil this down to a small test case?
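
For such a test case, the verification step could be sketched in plain Java, independent of Lucene. This is only a rough illustration under stated assumptions: that the "first" document of a group is the one with the smallest id_n (ascending sort), and that the test records the id_n of the top hit returned for each group. The names FirstDocCheck, Doc, and groupsWithWrongTopDoc are hypothetical and not part of any Lucene API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for a small test case; not part of Lucene.
final class FirstDocCheck {

    /** One indexed document: its group key and its id_n sort value. */
    static final class Doc {
        final String group;
        final int idN;

        Doc(String group, int idN) {
            this.group = group;
            this.idN = idN;
        }
    }

    /**
     * Returns the sorted list of groups whose reported top document is
     * missing or is not the document with the smallest id_n in that group.
     * Assumption: ascending id_n defines the "first" document.
     */
    static List<String> groupsWithWrongTopDoc(List<Doc> indexed,
                                              Map<String, Integer> reportedTopIdN) {
        // Compute the expected first document (minimum id_n) per group.
        Map<String, Integer> expected = new HashMap<String, Integer>();
        for (Doc d : indexed) {
            Integer current = expected.get(d.group);
            if (current == null || d.idN < current) {
                expected.put(d.group, d.idN);
            }
        }
        // Compare against what the grouped search actually returned.
        List<String> bad = new ArrayList<String>();
        for (Map.Entry<String, Integer> e : expected.entrySet()) {
            Integer got = reportedTopIdN.get(e.getKey());
            if (got == null || !got.equals(e.getValue())) {
                bad.add(e.getKey());
            }
        }
        Collections.sort(bad);
        return bad;
    }
}
```

In a real test, reportedTopIdN would be filled from the grouped search results (for example from a stored copy of id_n), and an empty return value would mean every group came back with its first document.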