
marvin at rectangular
Jul 11, 2006, 9:19 PM
Post #4 of 5
On Jul 11, 2006, at 6:24 AM, henka [at] cityweb wrote:

> Hello,
>
> How can one influence the ranking of results? Let's say you have a
> special field with an integer value which is determined not by the
> indexer, but by some other algorithm, and when this index page is hit
> because of a standard search query, you would like this special
> field to influence the ranking.
>
> Can this be achieved?

At present, only with hacks, and not for large datasets.

You've described the Sort functionality we've been discussing in other threads. It's implemented in Lucene using a FieldCache, which is what we've been faking using Perl arrays.

FieldCache in Lucene is an array of field values, just like here, but instead of being retrieved from stored documents as Gavin's doing, the values are loaded from the term dictionary and are parsed as either integers, floats, or strings, and an array of the low-level data type is built up. This technique is still memory-intensive for large document collections, but considerably less so than using a Perl array.

Inverted indexes excel at relevance scoring. Sorting on secondary fields while reading from disk is not their forte, because that information is not normally housed in the data structures used for scoring and heavily optimized for speed. However, if you have the memory and the time to pre-load, the FieldCache technique is quite efficient. It ought to be faster, for instance, than naively sorting document numbers obtained during a search against rows in a flat file database -- because if you already have the sort field values, all you need are document numbers, and those are quicker to extract from a Lucene index than from a data file with fixed width rows containing other information.

The documentation for Lucene's Sort class explains some of the caveats around selecting a field which you would load into a FieldCache.
http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Sort.html

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
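
To make the FieldCache idea described above more concrete, here is a rough sketch in plain Java. It is not Lucene's code and calls no Lucene API: the class FieldCacheSketch, the methods buildIntCache and sortHits, and the Map standing in for the term dictionary are all hypothetical names invented for illustration. It assumes each document has at most one integer value in the sort field, and it only shows the shape of the technique: parse each term once, fill a primitive array indexed by document number, then order the matching document numbers by looking values up in that array.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from Lucene.
class FieldCacheSketch {

    // Build one int slot per document from a toy "term dictionary" that maps
    // each term's text to the document numbers containing that term.
    // Documents with no term in the field keep the default value 0.
    static int[] buildIntCache(Map<String, int[]> termToDocs, int maxDoc) {
        int[] cache = new int[maxDoc];
        for (Map.Entry<String, int[]> entry : termToDocs.entrySet()) {
            int value = Integer.parseInt(entry.getKey()); // parse each term once
            for (int doc : entry.getValue()) {
                cache[doc] = value;
            }
        }
        return cache;
    }

    // Order matching document numbers by cached value, highest first,
    // breaking ties by ascending document number.
    static int[] sortHits(int[] hits, int[] cache) {
        Integer[] boxed = new Integer[hits.length];
        for (int i = 0; i < hits.length; i++) {
            boxed[i] = hits[i];
        }
        Arrays.sort(boxed, (a, b) -> {
            int byValue = Integer.compare(cache[b], cache[a]);
            return byValue != 0 ? byValue : Integer.compare(a, b);
        });
        int[] sorted = new int[boxed.length];
        for (int i = 0; i < boxed.length; i++) {
            sorted[i] = boxed[i];
        }
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, int[]> terms = new TreeMap<>();
        terms.put("5", new int[] {0, 3});
        terms.put("42", new int[] {1});
        terms.put("7", new int[] {2});

        int[] cache = buildIntCache(terms, 4); // pre-load step, done once
        int[] hits = {3, 0, 2};                // document numbers from a search
        System.out.println(Arrays.toString(sortHits(hits, cache)));
    }
}
```

The pre-load cost is paid once, in buildIntCache; after that, sorting a result set touches only the int array and the document numbers, which is the property that makes the approach attractive when the memory is available.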