
marvin at rectangular
Sep 14, 2008, 4:36 PM
On Sep 13, 2008, at 1:56 PM, Dan wrote:

> So now I have made claims... :)
> I'll try to give more details.

In my book, benchmarking claims presented without code, corpus, stats, raw data, and detailed methodological descriptions qualify as "anecdotal evidence". If you have a scientific background, you know what that means: not to be ignored, but requiring a high degree of skepticism and not particularly useful.

> So as you can see this whole "test" is pretty simple with many
> possible holes to try and get this Apples Vs Oranges test running.

KinoSearch is a low-level engine analogous to Lucene; Solr is a higher-level library built on top of Lucene that does a lot of extra stuff, including copious caching. A comparison of Lucene to KinoSearch would be more germane from a development standpoint. By using Solr rather than Lucene, you've polluted the experiment with an extra layer of variables.

I actually think that testing with all of Solr's default caching mechanisms *on* would be more interesting in a sense than what we've gotten from you so far. It wouldn't be helpful for development in terms of identifying optimization opportunities within KS, but it might be more interesting for decision makers.

> Is there anything I can do to make these searches perform better?

There are a couple of known issues on the todo list that affect search speed. One is a bugfix (SegPList_Skip_To had to be temporarily disabled due to corrupt .skip files), and the other is a design flaw, described in <http://www.mail-archive.com/java-dev [at] lucene/msg15825.html>. Additionally, implementing the PForDelta compression algorithm for postings should speed up searching, but I'd planned to put that off.

However, measuring progress on those issues using a closed source benchmark with "many possible holes" would be foolish. If we're going to do benchmarking at all, we're going to do it right: <http://www.rectangular.com/kinosearch/benchmarks.html>.
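For illustration only, here is a minimal Python sketch of the kind of reporting that separates a benchmark from an anecdote: it keeps the raw timings and prints summary stats alongside them. This is not the KinoSearch benchmark suite; `run_benchmark`, `summarize`, and `toy_search` are hypothetical names, and the toy corpus stands in for a real, published one.

```python
import statistics
import time


def run_benchmark(search_fn, queries, iterations=5):
    """Run every query through search_fn, repeated `iterations` times.

    Returns the raw wall-clock time in seconds for each iteration, so the
    raw data can be published next to any summary.
    """
    raw = []
    for _ in range(iterations):
        start = time.perf_counter()
        for query in queries:
            search_fn(query)
        raw.append(time.perf_counter() - start)
    return raw


def summarize(raw):
    """Compute summary statistics over a list of raw timings."""
    return {
        "iterations": len(raw),
        "min": min(raw),
        "median": statistics.median(raw),
        "mean": statistics.mean(raw),
        # stdev needs at least two samples
        "stdev": statistics.stdev(raw) if len(raw) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical stand-in for a real engine call (assumption, not KS or Lucene).
    corpus = ["apples and oranges", "oranges only", "nothing here"]

    def toy_search(query):
        return [doc for doc in corpus if query in doc]

    raw = run_benchmark(toy_search, ["apples", "oranges"], iterations=5)
    print("raw timings (s):", raw)
    print("summary:", summarize(raw))
```

Publishing the script, the corpus, the query list, and the raw timings together lets someone else rerun the test and check the numbers.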
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/

_______________________________________________
KinoSearch mailing list
KinoSearch [at] rectangular
http://www.rectangular.com/mailman/listinfo/kinosearch