
michael.monnerie at is
Jan 19, 2009, 6:51 AM
Post #6 of 12
(1939 views)
Permalink
|
|
Re: Improving headervalues table, was: Who starts this search?
[In reply to]
|
|
On Montag 19 Januar 2009 Paul J Stevens wrote: > That's because currently for search each search term is search for > separately, after which the search results are merged. In this > particular case the search should be optimized to join the message > idnr as well (UID xxx) which would drastically reduce the table scan. Yes, especially if there are several thousand messages in a box. > I've done a stable at improving search speeds once, but getting the > possibly recursive search terms translated into a correct query is > not a trivial thing. What can be recursive? And wouldn't it be better to throw all search terms at the DB and it should say what it finds? That's what we have SQL for. > You may want to: > delete from dbmail_headervalue where headervalue = ''; Yes, as a workaround. But doesn't "dbmail-util -ay" recreate those headers? > Compressing the headername/headervalue tables would be nice. But I > don't have the resources to do such a thing at this time. Maybe we could find help for this problem? I'll try to do the SQL part: We currently have physmessage -> headervalue -> headername If we split it up, we can do physmessage -> headernamevalue -> headername+headervalue CREATE TABLE headernamevalue ( id INT8 NOT NULL REFERENCES dbmail_physmessage(id) ON UPDATE CASCADE ON DELETE CASCADE, nameid INT8 NOT NULL REFERENCES dbmail_headername(nameid) ON UPDATE CASCADE ON DELETE RESTRICT, value id INT8 NOT NULL REFERENCES dbmail_headervalue(valueid) ON UPDATE CASCADE ON DELETE RESTRICT, PRIMARY KEY (id,name,value) ); CREATE SEQUENCE dbmail_headernameid_seq; CREATE TABLE dbmail_headername ( nameid INT8 NOT NULL DEFAULT nextval('dbmail_headernameid_seq'), name VARCHAR(100) NOT NULL DEFAULT 'BROKEN_HEADER', PRIMARY KEY (nameid) ); CREATE UNIQUE INDEX dbmail_headername_1 on dbmail_headername(name); CREATE SEQUENCE dbmail_headervalueid_seq; CREATE TABLE dbmail_headervalue ( valueid INT8 NOT NULL DEFAULT nextval('dbmail_headervalueid_seq'), value TEXT NOT NULL DEFAULT '', PRIMARY KEY (valueid) ); CREATE UNIQUE INDEX dbmail_headervalue_1 ON dbmail_headervalue(value); With that construct, a value or name only needs to be inserted once, and only it's reference is stored into headernamevalue. Shouldn't be a lot of work, but would save a lot of storage, and improve search speed tremendously. Currently there is substring(dbmail_headervalue(value),0,255) on the dbmail_headername_1 index, but I believe this could be dropped and all accesses with "substring" removed. I think it was only done because of a bug somewhere, or and index that could not be created? mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0660 / 415 65 31 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4 _______________________________________________ DBmail mailing list DBmail [at] dbmail https://mailman.fastxs.nl/mailman/listinfo/dbmail
|