
marvin at rectangular
Jun 21, 2007, 4:30 PM
Post #2 of 3
On Jun 21, 2007, at 3:25 PM, Chris Nandor wrote:

> Can't locate object method "get_size" via package
> "KinoSearch::Index::MultiLexicon" at
> /usr/local/lib/perl5/site_perl/5.8.4/darwin-2level/KinoSearch/
> Search/RangeFilter.pm
> line 159, <GEN0> line 1.

Mmf. OK, no big deal. This is much easier to solve than the last one you threw my way. :)

A Lexicon's "size" is the number of terms it holds. We can't know the size of a MultiLexicon until we've iterated over the entire thing once. We can know the number of terms each SegLexicon in the MultiLexicon holds, but we don't know how many terms overlap. The iterator uses a PriorityQueue which checks for duplicates, though, so if we start at the top and count how many times Lex_Next(multi_lexicon) returns true, we have the size.

Fortunately, by this point, we'll have already performed that iteration -- during the call to build_sort_cache.

What we need to do is add a self->size member var to the MultiLexicon struct, then set it to self->term_num as soon as the iteration finishes in MultiLex_build_sort_cache. The actual accessor will look like this:

    i32_t
    MultiLex_get_size(MultiLexicon *self)
    {
        if (self->lex_cache == NULL)
            CONFESS("Can't call MultiLex_Size unless cache filled");
        return self->size;
    }

We should add a Lex_Get_Size abstract accessor to Lexicon.c/h, along with an XS hook in Lexicon.pm which both SegLexicon and MultiLexicon will inherit. We should zap the current XS hook in SegLexicon.pm and replace it with an implementation of Lex_Get_Size in SegLexicon.c/h.

I have a deadline tomorrow, so I don't think I'll get to adding this code and the accompanying tests before the weekend.

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
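
For readers following the thread, here is a rough, self-contained sketch of the idea described above: count the unique terms once while iterating over all segments, store the count, and have the accessor refuse to answer until that pass has run. It is not KinoSearch code. Every name starting with Sketch is hypothetical, the `cache_filled` flag stands in for the `lex_cache != NULL` check, `assert` stands in for CONFESS, and a plain linear scan replaces the PriorityQueue the real iterator uses. It also assumes each segment's terms are sorted and unique within that segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a SegLexicon: a sorted array of unique terms. */
typedef struct {
    const char **terms;
    size_t       num_terms;
    size_t       pos;          /* cursor used during the merge */
} SketchSegLex;

/* Hypothetical stand-in for a MultiLexicon. */
typedef struct {
    SketchSegLex *segs;
    size_t        num_segs;
    int           cache_filled; /* plays the role of lex_cache != NULL */
    int32_t       size;         /* unique term count, valid once cache_filled */
} SketchMultiLex;

/* Walk all segments in sorted order, counting each distinct term once. */
void
SketchMultiLex_build_sort_cache(SketchMultiLex *self)
{
    int32_t term_num = 0;
    size_t  i;

    for (i = 0; i < self->num_segs; i++)
        self->segs[i].pos = 0;

    for (;;) {
        const char *least = NULL;

        /* Find the smallest term sitting at any segment's cursor. */
        for (i = 0; i < self->num_segs; i++) {
            SketchSegLex *seg = &self->segs[i];
            if (seg->pos < seg->num_terms) {
                const char *cand = seg->terms[seg->pos];
                if (least == NULL || strcmp(cand, least) < 0)
                    least = cand;
            }
        }
        if (least == NULL)
            break; /* every segment is exhausted */

        /* Advance every segment positioned on that term, so a term
         * shared by several segments is counted only once. */
        for (i = 0; i < self->num_segs; i++) {
            SketchSegLex *seg = &self->segs[i];
            if (seg->pos < seg->num_terms
                && strcmp(seg->terms[seg->pos], least) == 0)
                seg->pos++;
        }
        term_num++;
    }

    /* Record the size as soon as the iteration finishes. */
    self->size         = term_num;
    self->cache_filled = 1;
}

/* Accessor: only meaningful after the cache-building pass has run. */
int32_t
SketchMultiLex_get_size(SketchMultiLex *self)
{
    assert(self->cache_filled);
    return self->size;
}
```

With two segments holding {apple, banana, cherry} and {banana, cherry, date}, the per-segment counts add up to 6, but the merged pass visits only 4 distinct terms, which is why the size cannot be derived from the segment sizes alone.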