marvin at rectangular
Mar 9, 2005, 3:50 PM
On Mar 9, 2005, at 1:18 PM, Chris Nandor wrote:
> Hi Marvin, how's Kinosearch progressing? Just checking in. I've been
> busy with other projects, but might be getting back to search stuff soon.
Another work project forced its way onto the front burner, displacing
Kinosearch for much of the last month. That project, which peaked last
night, will cause me to miss my March 11 deadline for a CPAN release.
Nevertheless, a significant amount of progress has been made on
Kinosearch since we last communicated.
The query parser and the search-time scoring routine have been
rewritten and the groundwork laid for further optimization -- there's
going to have to be XS code eventually in order to max out search-time
speed. Boolean queries work. Searches should be faster, as there are
fewer sorts and hash lookups. Search-time memory requirements should
be diminished. Indexing should be faster, too, since I've gone with
Storable instead of the custom serializer + zlib scheme that was there
before. aux_index is gone, and in its place is another experiment,
bitwise_index, which may meet your needs for multiple categorization.
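
The message does not say how bitwise_index works internally. As a
minimal sketch of the general idea behind bitwise categorization,
assuming one bit per category and one integer mask per document, a
filter for "documents in all of these categories" could look like the
following. The names CATEGORY_BITS, make_mask, and filter_docs are
hypothetical and are not part of Kinosearch; Python is used only for
illustration, since Kinosearch itself is Perl.

```python
# Hypothetical sketch: each category gets one bit, each document one integer mask.
CATEGORY_BITS = {
    "news": 1 << 0,
    "sports": 1 << 1,
    "perl": 1 << 2,
}


def make_mask(categories):
    """Combine category names into a single integer bitmask."""
    mask = 0
    for name in categories:
        mask |= CATEGORY_BITS[name]
    return mask


def filter_docs(doc_masks, required):
    """Return sorted ids of documents whose mask has every bit in `required` set."""
    return sorted(
        doc_id
        for doc_id, mask in doc_masks.items()
        if (mask & required) == required
    )


if __name__ == "__main__":
    docs = {
        1: make_mask(["news"]),
        2: make_mask(["news", "perl"]),
        3: make_mask(["sports", "perl"]),
    }
    print(filter_docs(docs, make_mask(["perl"])))
```

A document can belong to any number of categories at no extra storage
cost, and a category filter is one AND plus one comparison per
document, which is presumably why such a scheme is attractive for
multiple categorization.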
I've also rethought some architectural constraints that guided early
development:
-- At this point, I don't think I will worry about making Kinosearch
infinitely scalable. Lucene can plow that territory. If a 32-bit
limitation on the number of documents or the number of terms matters to
you, you probably can afford several Java programmers.
-- I'd originally set up Kinosearch to use an index format which was
(conceptually) portable to other languages. I've decided to abandon
that idea. Kinosearch will be Perl. It will be as compact, efficient,
and flexible as possible, within Perl.
-- Kinosearch will need a single-file index format that will be its
default, but that format probably will not be based on the current code
for Search::Kinosearch::Backend. I think that code is going away.
A kinosearch [at] rectangular mailing list has been set up, to which
I've cc'd this message. Nobody's subscribed yet, and I've just now
posted a link. If you object to this reply showing up in the archives,
let me know and I'll expunge it.