
Mailing List Archive: Lucene: Java-User

Field.Text(String, Reader) vs. Field.Text(String, String)

 

 



nelson at monkey

Sep 28, 2001, 2:09 PM


One strange "gotcha" in Lucene:
Field.Text(String, String) creates a field that is stored, but
Field.Text(String, Reader) creates a field that is *not* stored.

I'd naively assumed that the two Field.Text() methods were just for
convenience; I hadn't expected the semantics to change depending on
how I got the data into the Field. That seems like a misfeature to me.


For my own purposes I made a little chart of all 8 possible field
types and which factory methods create which. The docs are all
consistent with the code; seeing it laid out this way just made it
clearer to me.

Stored  Indexed  Tokenized  Factory methods
yes     yes      yes        Field.Text(String, String)
yes     yes      no         Field.Keyword(String, String)
yes     no       yes
yes     no       no         Field.UnIndexed(String, String)
no      yes      yes        Field.Field(String, String), Field.UnStored(String, String), Field.Text(String, Reader)
no      yes      no
no      no       yes
no      no       no

One thing that pops out is that it never makes sense to tokenize but
not index. Similarly, it never makes sense to neither store nor index.
This all seems obvious in retrospect, but it's helped me understand
how tokenizing, indexing, and storing all fit together.
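
To make the two observations above concrete, here is a minimal sketch that enumerates all 8 flag combinations and checks them against those two rules. It does not use Lucene at all: the FieldFlags class, the FACTORIES map, and the meaningful() rule are hypothetical names made up for illustration, and the flag values are simply copied from a subset of the chart's rows.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldFlags {
    // Flag triples {stored, indexed, tokenized}, copied from the chart.
    // This is a plain lookup table for illustration, not Lucene code.
    static final Map<String, boolean[]> FACTORIES = new LinkedHashMap<>();
    static {
        FACTORIES.put("Field.Text(String, String)", new boolean[] {true, true, true});
        FACTORIES.put("Field.Keyword(String, String)", new boolean[] {true, true, false});
        FACTORIES.put("Field.UnIndexed(String, String)", new boolean[] {true, false, false});
        FACTORIES.put("Field.UnStored(String, String)", new boolean[] {false, true, true});
        FACTORIES.put("Field.Text(String, Reader)", new boolean[] {false, true, true});
    }

    // Assumed rule, taken from the two observations in the post:
    // tokenizing without indexing is pointless, and so is a field
    // that is neither stored nor indexed.
    static boolean meaningful(boolean stored, boolean indexed, boolean tokenized) {
        if (tokenized && !indexed) {
            return false;
        }
        return stored || indexed;
    }

    // Walk all 8 combinations and count the ones that pass the rule.
    static int countMeaningful() {
        int count = 0;
        for (int bits = 0; bits < 8; bits++) {
            boolean stored = (bits & 4) != 0;
            boolean indexed = (bits & 2) != 0;
            boolean tokenized = (bits & 1) != 0;
            if (meaningful(stored, indexed, tokenized)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("meaningful combinations: " + countMeaningful());
        for (Map.Entry<String, boolean[]> e : FACTORIES.entrySet()) {
            boolean[] f = e.getValue();
            System.out.println(e.getKey() + " -> stored=" + f[0]
                    + " indexed=" + f[1] + " tokenized=" + f[2]);
        }
    }
}
```

Under these rules 5 of the 8 combinations survive. The chart gives a factory method for four of them; the remaining one (indexed, but neither stored nor tokenized) has no factory method listed.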

nelson [at] monkey
. . . . . . . . http://www.media.mit.edu/~nelson/
