marvin at rectangular
Sep 2, 2008, 9:28 PM
On Sep 2, 2008, at 1:06 PM, Mike Barborak wrote:
> if I re-index a document (done in a different process and without
> spec_field being called),
Is there a reason not to call spec_field() when you re-index? Calling
it would likely solve the problem.
Calling spec_field() multiple times is fine (and recommended) so long
as the field definition is always the same.
> What I _think_ is happening is that when I create my index, the
> filter field is correctly not being analyzed but that when I do the
> re-index, it is being analyzed and this then is causing an issue.
Yes, it looks like that's right. The 'analyzed' flag is not stored
with the index, so it defaults to a true value when the field
metadata is read back in (FieldInfos->read_infos). This wouldn't cause
significant problems for most people because once the data is in the
index, it doesn't get re-analyzed. (I can see an esoteric bug with
Searcher->_prepare_simple_search, but it wouldn't be easy to tickle.)
The workaround should be to call spec_field() with the same field
definition in the re-indexing process. A fix for maint would involve
storing the 'analyzed' flag, which would be a little tricky for
back-compat reasons.
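
To make the failure mode concrete, here is a toy Python model of it.
This is only a sketch, not KinoSearch code: FieldInfo, save_infos,
read_infos and the JSON layout are hypothetical stand-ins, and
spec_field here only imitates the idea of re-declaring a field. It
assumes nothing beyond what is described above: the flag is not
written out, and it comes back as true on load.

```python
import json

# Toy model only: these names are hypothetical stand-ins, not
# KinoSearch's actual classes or on-disk format.


class FieldInfo:
    def __init__(self, name, indexed=True, analyzed=True):
        self.name = name
        self.indexed = indexed
        self.analyzed = analyzed


def save_infos(fields):
    # The 'analyzed' flag is deliberately left out, mirroring the bug.
    return json.dumps([{"name": f.name, "indexed": f.indexed} for f in fields])


def read_infos(serialized):
    # Anything not stored falls back to the constructor default,
    # so 'analyzed' comes back as True.
    return {
        d["name"]: FieldInfo(d["name"], indexed=d["indexed"])
        for d in json.loads(serialized)
    }


def spec_field(fields, name, **flags):
    # Re-declaring the field overrides whatever was loaded from disk.
    fields[name] = FieldInfo(name, **flags)


# First process: create the index with an unanalyzed filter field.
on_disk = save_infos([FieldInfo("filter", analyzed=False)])

# Second process: re-index without re-declaring the field.
reloaded = read_infos(on_disk)
flag_without_spec = reloaded["filter"].analyzed

# Workaround: re-declare the field with the same definition.
spec_field(reloaded, "filter", analyzed=False)
flag_with_spec = reloaded["filter"].analyzed

print(flag_without_spec, flag_with_spec)  # True False
```

In this model the second process sees the filter field as analyzed
until it repeats the original declaration, which is the same reason
the workaround needs the definition to match every time.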
I know the devel branch is not an option for you, but for the record
and anyone who might be concerned, this problem would not affect
devel. There, field definitions are determined by the FieldSpec class
assigned to the given field name in the Schema, so the conflict
between loading from disk and calling spec_field() wouldn't happen.
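
As a rough illustration of why a schema-driven design sidesteps the
problem, the sketch below keeps each field's definition in a class
that every process builds from the same code. It is a toy Python
model under that single assumption; Schema, FieldSpec, UnanalyzedField
and build_schema are hypothetical names, not the devel branch's
actual API.

```python
# Toy model only: hypothetical names, not the devel branch's real API.


class FieldSpec:
    # Default definition: the field is analyzed.
    analyzed = True


class UnanalyzedField(FieldSpec):
    # A field type whose definition says "do not analyze".
    analyzed = False


class Schema:
    def __init__(self):
        self.fields = {}

    def spec(self, name, spec_class):
        # Assign a FieldSpec class to a field name.
        self.fields[name] = spec_class

    def is_analyzed(self, name):
        # The answer comes from the class, never from flags read off disk.
        return self.fields[name].analyzed


def build_schema():
    # Every process, indexing or re-indexing, builds the same schema.
    schema = Schema()
    schema.spec("content", FieldSpec)
    schema.spec("filter", UnanalyzedField)
    return schema


create_run = build_schema().is_analyzed("filter")
reindex_run = build_schema().is_analyzed("filter")

print(create_run, reindex_run)  # False False
```

Because nothing about the field's behavior is reconstructed from
stored metadata in this model, there is no default for it to fall
back to.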
> So I'm hoping this rings a bell with anyone in terms of something
> I'm doing wrong or what the issue might be. If not then I'll work on
> developing a concise test case to hopefully reproduce what I'm seeing.
Good detective work.
KinoSearch mailing list
KinoSearch [at] rectangular