cdoronc at gmail
Dec 28, 2007, 5:20 AM
Post #3 of 9
Re: SinkTokenizer: next(Token) vs. next()
"Safer" was not the best wording, sorry about that. I meant
performance-wise; there is no correctness issue.
The "contract" of the two next methods, as I understand it, is that
a TokenStream must implement at least one of them. I see no harm in
implementing both (but doing so is likely to just duplicate
TokenStream's code).
SinkTokenizer's implementation has no reuse logic (it ignores the
Token passed in), so it really should implement just next(). Then, if
any consumer of SinkTokenizer calls next(Token), the default
implementation of that method in TokenStream would call
SinkTokenizer's next().
Do you agree with this?
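
A minimal sketch of the delegation described above, using simplified
stand-in classes rather than Lucene's own. SimpleToken,
SimpleTokenStream, ListSink and SinkDemo are hypothetical names, and the
"created" counter exists only to make allocations visible. The sketch
assumes, as described in this thread, that each default next variant
falls back to the other.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token; counts how many instances were created.
class SimpleToken {
    static int created = 0;
    String text;

    SimpleToken() { this(""); }

    SimpleToken(String text) {
        this.text = text;
        created++;
    }
}

// Hypothetical stand-in for the base stream. Each default delegates to the
// other, so a subclass must override at least one of the two methods.
abstract class SimpleTokenStream {
    // Default non-reusing variant: allocates a token for the reusing variant.
    SimpleToken next() {
        return next(new SimpleToken());
    }

    // Default reusing variant: ignores the argument and calls next().
    SimpleToken next(SimpleToken reusable) {
        return next();
    }
}

// A sink that replays cached tokens. It has no reuse logic, so it overrides
// only next(); calls to next(SimpleToken) reach it through the base class.
class ListSink extends SimpleTokenStream {
    private final Iterator<SimpleToken> cachedTokens;

    ListSink(List<SimpleToken> cached) {
        this.cachedTokens = new ArrayList<>(cached).iterator();
    }

    @Override
    SimpleToken next() {
        return cachedTokens.hasNext() ? cachedTokens.next() : null;
    }
}

class SinkDemo {
    public static void main(String[] args) {
        List<SimpleToken> cached = new ArrayList<>();
        cached.add(new SimpleToken("a"));
        cached.add(new SimpleToken("b"));
        ListSink sink = new ListSink(cached);

        SimpleToken scratch = new SimpleToken();
        int createdBefore = SimpleToken.created;

        // Both call styles end up in ListSink.next().
        System.out.println(sink.next(scratch).text);
        System.out.println(sink.next().text);
        System.out.println("extra tokens created: "
                + (SimpleToken.created - createdBefore));
    }
}
```

If ListSink overrode next(SimpleToken) instead, every call to next()
would go through the base-class default and allocate a token that the
sink then ignores, which is the wasted creation raised in the quoted
message below.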
On Dec 27, 2007 4:20 PM, Grant Ingersoll <gsingers [at] apache> wrote:
> On Dec 26, 2007, at 6:20 PM, Doron Cohen wrote:
> > Working on Lucene-1101 I checked if SinkTokenizer.next(Token) should
> > also call Token.clear(). (It shouldn't, because it ignores the input
> > token.)
> > However I think that calls to next() would end up creating Tokens for
> > nothing (by TokenStream.next()).
> > May currently be an empty case (if all current uses call next(Token)),
> > but still - is it safer for SinkTokenizer to implement next() rather
> > than next(Token)?
> I'm still a bit fuzzy on the interplay of these myself, but what makes
> the call of SinkTokenizer.next(Token) unsafe or is it just the
> potential of Tokens being created? I guess SinkTokenizer could just
> override both methods.