
madhusasidhar at gmail
Jul 18, 2005, 1:56 PM
Post #2 of 2
Rajesh,

I am not sure what your eventual goal is, but it looks like you are using Lucene in some sort of natural language processing environment. I am doing something similar with dotLucene. SpanQuery may be what you want: it lets you specify the span, and hence 1-gram, 2-gram, etc. Email me if you want samples (C#).

Madhu

On 7/18/05, Rajesh Munavalli <rajeshm [at] dessci> wrote:
>
> At what point do I add n-grams? Does the order in which I add n-grams
> affect exact phrase queries later? My questions are
>
> (1) Should I add all the 1-grams followed by 2-grams followed by
> 3-grams..etc sentence by sentence OR
> (2) Add all the 1 grams of entire document first before starting 2-grams
> for the entire document?
>
> What is the general accepted notion of adding n-grams of a document?
>
> thanks,
>
> Rajesh
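To make the two orderings in the question concrete, here is a minimal sketch in plain Python. It does not use Lucene or dotLucene, and it makes no claim about how either stores terms. The names `ngram_spans`, `sentence_order`, and `document_order` are made up for illustration. It treats each n-gram as a span of n consecutive token positions, assumes n-grams do not cross sentence boundaries, and shows that options (1) and (2) yield the same spans in a different order.

```python
from typing import List, Tuple

# A span is (start position, end position exclusive, tokens covered).
Span = Tuple[int, int, Tuple[str, ...]]


def ngram_spans(tokens: List[str], n: int, offset: int = 0) -> List[Span]:
    """Return every run of n consecutive tokens with its document positions."""
    return [
        (offset + i, offset + i + n, tuple(tokens[i:i + n]))
        for i in range(len(tokens) - n + 1)
    ]


def sentence_order(sentences: List[List[str]], max_n: int) -> List[Span]:
    """Option (1): for each sentence, emit its 1-grams, then 2-grams, and so on."""
    out: List[Span] = []
    offset = 0
    for sent in sentences:
        for n in range(1, max_n + 1):
            out.extend(ngram_spans(sent, n, offset))
        offset += len(sent)  # positions keep counting across sentences
    return out


def document_order(sentences: List[List[str]], max_n: int) -> List[Span]:
    """Option (2): emit all 1-grams of the document, then all 2-grams, and so on."""
    out: List[Span] = []
    for n in range(1, max_n + 1):
        offset = 0
        for sent in sentences:
            out.extend(ngram_spans(sent, n, offset))
            offset += len(sent)
    return out


# Hypothetical two-sentence document, already tokenized.
sentences = [["the", "quick", "fox"], ["it", "ran"]]
by_sentence = sentence_order(sentences, 2)
by_document = document_order(sentences, 2)

# Same spans either way; only the emission order differs.
print(sorted(by_sentence) == sorted(by_document))
```

Because every span carries its own start and end position, the order of emission does not change which spans exist in this sketch. Whether the same holds for a particular index depends on how that index assigns positions as terms are added, which is worth checking against the library's documentation.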