Mailing List Archive: kinosearch: discuss

Re: Kinosearch

marvin at rectangular

Mar 9, 2005, 3:50 PM

On Mar 9, 2005, at 1:18 PM, Chris Nandor wrote:

> Hi Marvin, how's Kinosearch progressing? Just checking in. I've been
> busy with other projects, but might be getting back to search stuff
> soon.

Another work project forced its way onto the front burner, displacing
Kinosearch for much of the last month. That project, which peaked last
night, will cause me to miss my deadline of CPAN release by March 11.
Nevertheless, a significant amount of progress has been made on
Kinosearch since we last communicated.

The query parser and the search-time scoring routine have been
rewritten and the groundwork laid for further optimization -- there's
going to have to be XS code eventually in order to max out search-time
speed. Boolean queries work. Searches should be faster, as there are
fewer sorts and hash lookups. Search-time memory requirements should
be diminished. Indexing should be faster, too, since I've gone with
Storable instead of the custom serializer + zlib scheme that was there
before. aux_index is gone, and in its place is another experiment,
bitwise_index, which may meet your needs for multiple categorization.
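
To illustrate the general idea of bitwise categorization, here is a minimal sketch. It is not Kinosearch code and makes no claim about how bitwise_index is actually implemented; the category names, the `CATEGORY_BITS` table, and the helper functions are all hypothetical. The assumption is simply that each category is assigned one bit, so that a document belonging to several categories can be described by a single integer and filtered with one AND operation.

```python
# Hypothetical illustration only: not Kinosearch code or its bitwise_index API.
# Each category is assigned one bit position.
CATEGORY_BITS = {"news": 1 << 0, "sports": 1 << 1, "tech": 1 << 2}


def encode(categories):
    """Pack a list of category names into a single integer bitmask."""
    mask = 0
    for name in categories:
        mask |= CATEGORY_BITS[name]
    return mask


def matches_all(doc_mask, wanted_mask):
    """True if the document carries every category in wanted_mask."""
    return (doc_mask & wanted_mask) == wanted_mask


def matches_any(doc_mask, wanted_mask):
    """True if the document carries at least one category in wanted_mask."""
    return (doc_mask & wanted_mask) != 0


# Three made-up documents, each tagged with one or more categories.
docs = {
    1: encode(["news", "tech"]),
    2: encode(["sports"]),
    3: encode(["news", "sports", "tech"]),
}

# Find documents that are in both "news" and "tech".
wanted = encode(["news", "tech"])
hits = [doc_id for doc_id, mask in sorted(docs.items())
        if matches_all(mask, wanted)]
```

With this layout, a document in many categories costs no more storage than a document in one, and a 32-bit integer would cover up to 32 distinct categories.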

I've also rethought some architectural constraints that guided early
development.

-- At this point, I don't think I will worry about making Kinosearch
infinitely scalable. Lucene can plow that territory. If a 32-bit
limitation on the number of documents or the number of terms matters to
you, you probably can afford several Java programmers.

-- I'd originally set up Kinosearch to use an index format which was
(conceptually) portable to other languages. I've decided to abandon
that idea. Kinosearch will be Perl. It will be as compact, efficient,
and flexible as possible, within Perl.

-- Kinosearch will need a single-file index format that will be its
default, but that format probably will not be based on the current code
for Search::Kinosearch::Backend. I think that code is going away.

A kinosearch [at] rectangular mailing list has been set up, to which
I've cc'd this message. Nobody's subscribed yet, and I've just now
posted a link. If you object to this reply showing up in the archives,
let me know and I'll expunge it.

Best,

--
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
