
marvin at rectangular
Jun 12, 2008, 12:53 PM
Post #4 of 4
Re: Feature request: Search facet counts in Kinosearch?
On Jun 12, 2008, at 11:16 AM, Nathan Kurz wrote:

> Is the faceted approach layered on top of the search as
> a post-processing filter, or are the facets being handled directly by
> the search engine?

The main trick for obtaining the facet counts is massive server-side caching. You cache doc sets for each facet. A BitVector works well for facets which match lots of documents; for sparser sets, a SortedVIntList, which encodes a set of integers using a compressed format, may use less memory.

When you search, you use a dual-purpose HitCollector which wraps both a TopDocCollector and a BitCollector. The TopDocCollector gets you your standard search results ranked by score. The BitCollector gets you the set of all the doc numbers that matched. For each facet that you want a result for, you count the number of docs in the intersection of the main result set with the facet's cached result set.

The other problem is how to decide which facets to evaluate each query against. I think most people use a sort of drill-down, where top-level queries are compared against general categories, and once you select one of those categories (e.g. by clicking on "DVDs" or "Books"), the facet set changes. However, I don't believe that Solr constrains you with regard to how you select facets.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
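To make the counting step concrete, here is a rough sketch of the idea in Python. It is not KinoSearch or Lucene code: `DualCollector`, `bits_from_docs`, and `facet_counts` are hypothetical names made up for illustration, the cached facet doc sets are plain Python integers used as bit vectors, and the doc numbers and scores are invented sample data.

```python
import heapq


class DualCollector:
    """Hypothetical collector: keeps the top-N hits by score and also
    records every matching doc number in a bit set."""

    def __init__(self, top_n):
        self.top_n = top_n
        self._heap = []      # min-heap of (score, doc_num), at most top_n entries
        self.match_bits = 0  # bit i is set when doc i matched the query

    def collect(self, doc_num, score):
        # Bit-collector half: remember that this doc matched.
        self.match_bits |= 1 << doc_num
        # Top-docs half: keep only the best top_n hits.
        if len(self._heap) < self.top_n:
            heapq.heappush(self._heap, (score, doc_num))
        elif (score, doc_num) > self._heap[0]:
            heapq.heapreplace(self._heap, (score, doc_num))

    def top_docs(self):
        """Return (score, doc_num) pairs, best score first."""
        return sorted(self._heap, reverse=True)


def bits_from_docs(doc_nums):
    """Build a bit set (a Python int) from an iterable of doc numbers."""
    bits = 0
    for n in doc_nums:
        bits |= 1 << n
    return bits


def facet_counts(match_bits, facet_cache):
    """Count, per facet, the docs present in both the result set and the
    facet's cached doc set (bitwise AND, then a population count)."""
    return {name: bin(match_bits & bits).count("1")
            for name, bits in facet_cache.items()}


# Cached once on the server; the facet membership here is invented.
facet_cache = {
    "DVDs": bits_from_docs([0, 2, 3, 7]),
    "Books": bits_from_docs([1, 4, 5, 6, 8]),
}

# Pretend the query matched docs 2, 4, 7 and 8 with these scores.
collector = DualCollector(top_n=2)
for doc_num, score in [(2, 0.9), (4, 0.4), (7, 0.7), (8, 0.1)]:
    collector.collect(doc_num, score)

counts = facet_counts(collector.match_bits, facet_cache)
```

With this sample data, `counts` holds 2 for "DVDs" (docs 2 and 7) and 2 for "Books" (docs 4 and 8), while `collector.top_docs()` returns only the two best-scoring hits. A real implementation would probably switch to a compressed representation for sparse facets, as described above, rather than using one big integer per facet.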