Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: DBMail: users

Feature request: more caching

 

 

DBMail users RSS feed   Index | Next | Previous | View Threaded


casper at bcx

Apr 24, 2008, 4:52 AM

Post #1 of 4 (124 views)
Permalink
Feature request: more caching

Hi Paul and others!

I've been very busy making a webmail interface directly into the dbmail
database backend, and that has given me some time to gather some
thoughts on the table layout.

For a typical email client there are to things that need to be
optimized, which currently are quite slow:

One is the messagecount (read messages and total) that are displayed for
a dbmail_mailbox. Counting them everytime is quite a strain on the
database.

Secondly, the data retrieved for creating a message row with from
address. subject, date. message size etc is scattered out along a few
tables. The query needed for this pretty slow. as it contains a fair
number of join's. What I'd like to have is a simple table per
physmessage (or even _in_ dbmail_physmessage) containing the relevant
information.

Could using stored procedures be a solution to this problem?

Grtz, Casper


paul at nfg

Apr 24, 2008, 5:59 AM

Post #2 of 4 (117 views)
Permalink
Re: Feature request: more caching [In reply to]

Casper Langemeijer wrote:
> Hi Paul and others!
>
> I've been very busy making a webmail interface directly into the dbmail
> database backend, and that has given me some time to gather some
> thoughts on the table layout.
>
> For a typical email client there are to things that need to be
> optimized, which currently are quite slow:
>
> One is the messagecount (read messages and total) that are displayed for
> a dbmail_mailbox. Counting them everytime is quite a strain on the database.

That's why in 2.3+ the dbmail_mailbox rows will have a datetime field (_mtime),
so you can reliably cache results. This timestamp field ma be changed into a
sequence that is updated after each modification of (messages in) that mailbox
to accomodate a RFC for highly concurrent access to shared mailboxes. But this
will be solvable.

>
> Secondly, the data retrieved for creating a message row with from
> address. subject, date. message size etc is scattered out along a few
> tables. The query needed for this pretty slow. as it contains a fair
> number of join's. What I'd like to have is a simple table per
> physmessage (or even _in_ dbmail_physmessage) containing the relevant
> information.
>
> Could using stored procedures be a solution to this problem?

That join should not be *that* slow. It essentially only involves the
dbmail_headervalues and physmessage tables.

Also, duplication of cached data is *evil*. Consider using the dbmail_envelope
table which contains all the senders, the recipients, date, subject and
message-id information. Of course, it's in a IMAP compatible format, so you'll
need to parse it. One tip: don't use regexp to do that. Write a tokenizer.

Or write a view.


--
________________________________________________________________
Paul Stevens paul at nfg.nl
NET FACILITIES GROUP GPG/PGP: 1024D/11F8CD31
The Netherlands________________________________http://www.nfg.nl
_______________________________________________
DBmail mailing list
DBmail[at]dbmail.org
https://mailman.fastxs.nl/mailman/listinfo/dbmail


casper at bcx

Apr 24, 2008, 1:59 PM

Post #3 of 4 (115 views)
Permalink
Re: Feature request: more caching [In reply to]

Hi Paul,

Thanks for your quick reply.

On Thu, 2008-04-24 at 14:59 +0200, Paul J Stevens wrote:

> > One is the messagecount (read messages and total) that are displayed for
> > a dbmail_mailbox. Counting them everytime is quite a strain on the database.
>
> That's why in 2.3+ the dbmail_mailbox rows will have a datetime field (_mtime),
> so you can reliably cache results. This timestamp field ma be changed into a
> sequence that is updated after each modification of (messages in) that mailbox
> to accomodate a RFC for highly concurrent access to shared mailboxes. But this
> will be solvable.

Sounds neat!


> > Secondly, the data retrieved for creating a message row with from
> > address. subject, date. message size etc is scattered out along a few
> > tables. The query needed for this pretty slow. as it contains a fair
> > number of join's. What I'd like to have is a simple table per
> > physmessage (or even _in_ dbmail_physmessage) containing the relevant
> > information.
> >
> > Could using stored procedures be a solution to this problem?
>
> That join should not be *that* slow. It essentially only involves the
> dbmail_headervalues and physmessage tables.
>
> Also, duplication of cached data is *evil*. Consider using the dbmail_envelope
> table which contains all the senders, the recipients, date, subject and
> message-id information. Of course, it's in a IMAP compatible format, so you'll
> need to parse it. One tip: don't use regexp to do that. Write a tokenizer.

That's exactly what I wanted! I've completely overlooked that one, I'm
very sorry.

Thanks, Casper


paul at nfg

Apr 25, 2008, 12:13 AM

Post #4 of 4 (111 views)
Permalink
Re: Feature request: more caching [In reply to]

Casper Langemeijer wrote:

>> Also, duplication of cached data is *evil*. Consider using the dbmail_envelope
>> table which contains all the senders, the recipients, date, subject and
>> message-id information. Of course, it's in a IMAP compatible format, so you'll
>> need to parse it. One tip: don't use regexp to do that. Write a tokenizer.

> That's exactly what I wanted! I've completely overlooked that one, I'm
> very sorry.

I thought about it a little more, and came to the conclusion that you probably
don't want to use the _envelope table after all for two main reasons:

- the content is quite difficult to parse. Like I said, you'll need to write a
parser/tokenizer which is doable, but not trivial.

- the content is semi-raw: though 8bit data in the fields will be utf8 encoded
in the database, utf7 encoded subjects and addresses will be in their utf7
state. The headervalue table on the other hand is fully decoded and recoded as
utf8: ready for presentation in a webapp.

--
________________________________________________________________
Paul Stevens paul at nfg.nl
NET FACILITIES GROUP GPG/PGP: 1024D/11F8CD31
The Netherlands________________________________http://www.nfg.nl
_______________________________________________
DBmail mailing list
DBmail[at]dbmail.org
https://mailman.fastxs.nl/mailman/listinfo/dbmail

DBMail users RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact lists@gossamer-threads.com
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.