renaud.delbru at deri
Oct 13, 2008, 8:38 AM
Post #3 of 5
Re: Modification of positional information encoding
Michael McCandless wrote:
> This looks right, though you would also need to modify SegmentMerger
> to read & write your new format when merging segments.
> Another thing you could do is grep for "omitTf" which should touch
> exactly the same places you need to touch.
Ok, thanks for the pointers. I will examine this part of the Lucene code.
> It'd be awesome to get to the point where this read & write logic is
> captured in a single "codec" that's cleanly shared in all these places
> ("flexible indexing") but we are not quite there yet...
Yes, that would be really handy for experimenting with alternative
inverted index structures. At the moment, modifying the index structure
requires quite some work and reverse engineering.
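
To illustrate the kind of read & write logic such a codec would have to keep in one place, here is a minimal, self-contained sketch of a position encoding: the positions of a term within a document are stored as gaps (deltas) in a variable-length integer format, which as far as I understand is roughly what the default positions file does. The class and method names (PositionCodecSketch, encode, decode) are hypothetical and are not part of the Lucene API, and payload handling is left out entirely.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch, not Lucene code: delta + variable-length-int
// encoding of the positions of one term in one document.
final class PositionCodecSketch {

    // Write an int using 7 data bits per byte; the high bit marks "more bytes follow".
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Encode ascending positions as the gaps between consecutive positions.
    static byte[] encode(int[] positions) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int position : positions) {
            writeVInt(out, position - previous);
            previous = position;
        }
        return out.toByteArray();
    }

    // Decode 'count' positions by summing the gaps back up.
    static int[] decode(byte[] data, int count) {
        int[] positions = new int[count];
        int offset = 0;
        int previous = 0;
        for (int i = 0; i < count; i++) {
            int delta = 0;
            int shift = 0;
            byte b;
            do {
                b = data[offset++];
                delta |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += delta;
            positions[i] = previous;
        }
        return positions;
    }
}
```

An alternative positional encoding would only need to swap out encode and decode here, whereas today the equivalent logic is spread over the indexing, merging and reading code.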
>> Another question: since the Lucene core classes are fairly closed,
>> what is the best way to implement these modifications? Make a branch
>> of Lucene and add my new classes to the Lucene package
>> org.apache.lucene.index? Or is a more elegant solution possible?
> For starters (to try things out) I would just make local modifications
> with a lucene source checkout (via svn).
> Also, this issue was just opened:
> which would make it possible for classes in the same package
> (oal.index) to use their own indexing chain. With that fix, if you
> make your own classes in oal.index package, and perhaps subclass the
> above classes, you could then create your own indexing chain for
> indexing? If you take that approach, please report back so we can
> learn how to improve Lucene for these very advanced customizations!
Ok, thanks for the reference. I will try this solution and report
back any problems I encounter.