Mailing List Archive: Lucene: Java-User
Field value vs TokenStream
 

schnober at ids-mannheim

Apr 18, 2012, 8:00 AM



Dear list,
I'm studying the Lucene index file formats and I wonder: after having
initialized a field with Field(String name, String value, Field.Store
store, Field.Index index), where is the value String stored?

I understand that the chosen analyzer does its processing on that value,
including tokenization, and returns a TokenStream from which the Indexer
retrieves the attributes that it stores in the index.
When I use a binary editor to inspect the term infos (tis) file in the
index directory, I can see every single token (term).
As an experiment, I implemented an analyzer that transforms the value
passed to the field, and noticed the following: the TokenStream
still correctly generates the terms that end up being stored in the tis
file, but the original input value is still displayed as the field value
when I retrieve a document from the index and print it with
Document.toString(). I also tried to inspect the Field's TokenStream, but
tokenStreamValue() returns null; is that normal when retrieving a
document from an existing index?
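
To make the observation above concrete, here is a minimal sketch in plain Java. It is a toy model and deliberately uses no Lucene classes: ToyIndex, ToyIndexDemo, and upperCaseAnalyzer are hypothetical names invented for illustration, and the upper-casing analyzer merely stands in for "an analyzer that transforms the value". The sketch assumes only the behaviour described above, namely that the original value and the analyzed terms are written along two separate paths.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;

/** Toy model only, NOT Lucene code: it mimics two separate write paths. */
class ToyIndex {
    // Analogue of stored fields: the original value, kept verbatim per document.
    private final List<String> storedValues = new ArrayList<>();
    // Analogue of a term dictionary with postings: term -> ids of matching documents.
    private final Map<String, List<Integer>> postings = new TreeMap<>();
    // Hypothetical analyzer: turns a field value into a list of terms.
    private final Function<String, List<String>> analyzer;

    ToyIndex(Function<String, List<String>> analyzer) {
        this.analyzer = analyzer;
    }

    int addDocument(String value) {
        int docId = storedValues.size();
        storedValues.add(value); // path 1: stored untouched, never analyzed
        for (String term : analyzer.apply(value)) { // path 2: analyzed terms only
            List<Integer> docs = postings.computeIfAbsent(term, t -> new ArrayList<>());
            // Record each document at most once per term.
            if (docs.isEmpty() || docs.get(docs.size() - 1) != docId) {
                docs.add(docId);
            }
        }
        return docId;
    }

    /** What a retrieved document shows: the original input, not the terms. */
    String storedValue(int docId) {
        return storedValues.get(docId);
    }

    /** Lookup works on analyzed terms only. */
    List<Integer> search(String term) {
        return postings.getOrDefault(term, Collections.<Integer>emptyList());
    }

    /** All indexed terms in sorted order. */
    Set<String> terms() {
        return postings.keySet();
    }
}

class ToyIndexDemo {
    // Stand-in for a custom analyzer: split on whitespace, upper-case each token.
    static List<String> upperCaseAnalyzer(String value) {
        List<String> terms = new ArrayList<>();
        for (String token : value.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(token.toUpperCase(Locale.ROOT));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex(ToyIndexDemo::upperCaseAnalyzer);
        int id = index.addDocument("Field value vs TokenStream");
        System.out.println(index.terms());         // [FIELD, TOKENSTREAM, VALUE, VS]
        System.out.println(index.storedValue(id)); // Field value vs TokenStream
        System.out.println(index.search("VALUE")); // [0]
        System.out.println(index.search("value")); // []
    }
}
```

In this model the analyzer output never overwrites the stored value. The retrieved document has no token stream attached either, because the terms live only in the lookup structure, which would be consistent with tokenStreamValue() returning null on a document read back from an index.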

Can someone let me know what happens to a Field's value string and at
which point in the pipeline it is replaced by the (term) attributes
generated by the TokenStream?

Thank you very much!
Best,
Carsten


--
Carsten Schnober
Institut für Deutsche Sprache | http://www.ids-mannheim.de
Projekt KorAP -- Korpusanalyseplattform der nächsten Generation
http://korap.ids-mannheim.de/ | Tel.: +49-(0)621-1581-238

