marvin at rectangular
Jun 12, 2008, 12:53 PM
Post #4 of 4
Re: Feature request: Search facet counts in Kinosearch?
On Jun 12, 2008, at 11:16 AM, Nathan Kurz wrote:
> Is the faceted approach layered on top of the search as
> a post-processing filter, or are the facets being handled directly by
> the search engine?
The main trick for obtaining the facet counts is massive server-side
caching. You cache doc sets for each facet. A BitVector works well for
facets which match lots of documents; for more sparse sets, a
SortedVIntList, which encodes a set of integers using a compressed
format, may use less memory.
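
To make the memory trade-off concrete, here is a rough Python sketch of a
compressed sorted integer list. It assumes delta encoding plus
variable-length bytes, which is in the spirit of a SortedVIntList but is
not the actual Lucene or Solr code; the names encode_vints, decode_vints
and bit_vector_bytes are made up for illustration.

```python
def encode_vints(doc_nums):
    """Delta-encode a strictly increasing list of doc numbers as varints."""
    out = bytearray()
    prev = 0
    for doc in doc_nums:
        delta = doc - prev
        prev = doc
        # 7 data bits per byte; a set high bit means "more bytes follow".
        while delta >= 0x80:
            out.append((delta & 0x7F) | 0x80)
            delta >>= 7
        out.append(delta)
    return bytes(out)


def decode_vints(data):
    """Inverse of encode_vints: rebuild the sorted doc numbers."""
    docs = []
    prev = 0
    delta = 0
    shift = 0
    for byte in data:
        delta |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            prev += delta
            docs.append(prev)
            delta = 0
            shift = 0
    return docs


def bit_vector_bytes(max_doc):
    """Size of a plain bit vector with one bit per document."""
    return (max_doc + 7) // 8
```

For a facet matching only a handful of docs in a large index, the encoded
list takes a few bytes per doc, while the bit vector always costs one bit
for every doc in the index whether it matches or not.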
When you search, you use a dual-purpose HitCollector which wraps both
a TopDocCollector and a BitCollector. The TopDocCollector gets you
your standard search results ranked by score.
The BitCollector gets you a list of all the doc numbers that matched.
For each facet that you want a result for, you count the number of
docs in the intersection of the main result set with the facet's
cached result set.
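
The collecting and counting steps above could be sketched roughly like
this in Python. This is only an illustration of the idea, not the Lucene
or Solr API: the classes borrow the names TopDocCollector and
BitCollector, but their methods, the FacetingCollector wrapper, and the
helpers bits_from and facet_counts are assumptions, and a Python int
stands in for the bit vector.

```python
import heapq


class TopDocCollector:
    """Keeps the `size` highest-scoring hits (illustrative, not Lucene's)."""

    def __init__(self, size):
        self.size = size
        self._heap = []  # min-heap of (score, doc_num)

    def collect(self, doc_num, score):
        if len(self._heap) < self.size:
            heapq.heappush(self._heap, (score, doc_num))
        elif score > self._heap[0][0]:
            heapq.heapreplace(self._heap, (score, doc_num))

    def top_docs(self):
        # Best score first.
        return [(doc, score) for score, doc in sorted(self._heap, reverse=True)]


class BitCollector:
    """Records every matching doc number as a set bit."""

    def __init__(self):
        self.bits = 0  # Python int used as an arbitrary-length bit vector

    def collect(self, doc_num, score):
        self.bits |= 1 << doc_num


class FacetingCollector:
    """Dual-purpose collector: feeds each hit to both wrapped collectors."""

    def __init__(self, size):
        self.top = TopDocCollector(size)
        self.matched = BitCollector()

    def collect(self, doc_num, score):
        self.top.collect(doc_num, score)
        self.matched.collect(doc_num, score)


def bits_from(doc_nums):
    """Build a cached facet doc set from a list of doc numbers."""
    bits = 0
    for doc in doc_nums:
        bits |= 1 << doc
    return bits


def facet_counts(matched_bits, facet_cache):
    """Count docs in the intersection of the result set with each facet."""
    return {name: bin(matched_bits & facet_bits).count("1")
            for name, facet_bits in facet_cache.items()}
```

The search loop calls collect() once per hit; afterwards top_docs() gives
the ranked results and facet_counts(collector.matched.bits, cache) gives
one count per cached facet, each costing a single AND plus a bit count.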
The other problem is how to decide which facets to evaluate each query
against. I think most people use a sort of drill-down, where top-level
queries are compared against general categories, and once you select
one of those categories (e.g. by clicking on "DVDs", or "Books"), the
facet set changes. However, I don't believe that Solr constrains you
with regard to how you select facets.
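
One simple way to picture the drill-down is a lookup from the currently
selected category to the facets worth counting next. The Python sketch
below is purely hypothetical: the FACET_TREE contents and the
facets_to_evaluate function are invented for illustration and do not
reflect how Solr or any particular site does it.

```python
# Hypothetical facet hierarchy; None stands for "nothing selected yet".
FACET_TREE = {
    None: ["DVDs", "Books"],
    "DVDs": ["DVDs/Comedy", "DVDs/Drama"],
    "Books": ["Books/Fiction", "Books/Nonfiction"],
}


def facets_to_evaluate(selected=None):
    """Return the facets to count for the current drill-down position."""
    # Leaf categories have no children, so there is nothing further to count.
    return FACET_TREE.get(selected, [])
```

A top-level query would be counted against facets_to_evaluate(), and
after the user clicks "DVDs" the next query would be counted against
facets_to_evaluate("DVDs") instead.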