Mailing List Archive: DBMail: dev

Re: [Dbmail] New idea for 2.3x, "shared quotas" [moved to dev list]


dbmail-dev at tech

Jan 24, 2008, 3:29 AM

Post #1 of 9 (535 views)
Re: [Dbmail] New idea for 2.3x, "shared quotas" [moved to dev list]

Paul J Stevens wrote:
> The single-instance storage is pretty much done now. The next big milestone will
> be some form of database connection pooling so we can scale out the number of
> concurrent connected clients without draining the database backend.
>
Paul,

Do you have any plans for the db connection pooling as yet?

Thought of having a separate dbmail-dbpoold (or something similar?) and
handling the communication from dbmail-smtp, dbmail-imapd, dbmail-pop3d
etc.. via an IPC socket to the dbmail-dbpoold daemon? Could be a good
place to add caching as well, before it hits the db server?

Just a thought.

B.Regards,

SG
_______________________________________________
Dbmail-dev mailing list
Dbmail-dev[at]dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev


paul at nfg

Jan 24, 2008, 5:40 AM

Post #2 of 9 (512 views)
Re: multifoo architecture [In reply to]

Simon Gray wrote:
> Do you have any plans for the db connection pooling as yet?

Sure do. The plan is as follows:

I'm well underway with making dbmail fully event-driven using libevent. This
will enable dbmail daemons to handle a lot of clients simultaneously on each
preforked child. When that is finished (targeted at 2.3.3) forked children will
be idle unless busy processing data or handling IO on the database channel.

Where we now have a 1-1-1 relation between client <-> dbmail process <->
database connection we will move to a N-1-1 situation.

When this process is finished, testing will have to determine how the services
behave under load. I suspect contention over the database connector will then
become the main bottleneck, but we'll be able to handle a lot more client
concurrency, especially when combining AIO with a preforking setup. We won't hit
C10K just yet though.

Taking this one step further, the database code will have to become either
event-driven or threaded or both. Since mysql does not support callbacks on
queries, using threads is the only option for now. So then each dbmail process
will maintain a pool of N database connections and that will give us N-1-N in a
non-preforked situation. Again, the characteristics of such a design will have
to be tested and profiled (preforked and non-preforked) to see where we will go
from there, if anywhere.
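
To make the three layouts above concrete, here is a small sketch of how many database connections the backend has to serve under each of them. It is only an illustration of the arithmetic, not dbmail code: the enum names, `db_connections()` and its parameters are all hypothetical.

```c
#include <assert.h>

/* Hypothetical labels for the layouts discussed above:
 * client <-> dbmail process <-> database connection. */
enum layout {
    LAYOUT_1_1_1,  /* one client per process, one connection per process   */
    LAYOUT_N_1_1,  /* many clients per process, one connection per process */
    LAYOUT_N_1_N   /* many clients per process, a pool per process         */
};

/* Number of database connections the backend has to serve. */
static unsigned db_connections(enum layout l, unsigned clients,
                               unsigned processes, unsigned pool_size)
{
    assert(processes > 0);
    switch (l) {
    case LAYOUT_1_1_1: return clients;               /* grows with every client  */
    case LAYOUT_N_1_1: return processes;             /* independent of clients   */
    case LAYOUT_N_1_N: return processes * pool_size; /* bounded by configuration */
    }
    return 0;
}
```

With 100 clients, the 1-1-1 layout needs 100 connections, a single event-driven process needs 1, and two processes with a pool of 8 each need 16, whatever the number of clients.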

Also, on a sidetrack, using prepared statements in the database code will speed
things up a little as well. In fact, with the single-instance-storage stuff the
sqlite driver requires prepared statements or sqlite will fail to store messages
with attachments larger than 1MB or so.

But for now, I'm still working on the libevent refactoring of the server. Very
cool stuff.

--
________________________________________________________________
Paul Stevens paul at nfg.nl
NET FACILITIES GROUP GPG/PGP: 1024D/11F8CD31
The Netherlands________________________________http://www.nfg.nl


dbmail-dev at tech

Jan 25, 2008, 1:17 PM

Post #3 of 9 (508 views)
Re: multifoo architecture [In reply to]

Paul J Stevens wrote:
> Simon Gray wrote:
>
>> Do you have any plans for the db connection pooling as yet?
>>
>
> Sure do. The plan is as follows:
>
> I'm well underway with making dbmail fully event-driven using libevent. This
> will enable dbmail daemons to handle a lot of clients simultaneously on each
> preforked child. When that is finished (targeted at 2.3.3) forked children will
> be idle unless busy processing data or handling IO on the database channel.
>
> where we now have a 1-1-1 relation between client <-> dbmail process <->
> database connection we will move to a N-1-1 situation.
Ah fantastic, good to know.

You might also be pleased to know mysql 6.0 uses libevent as well -
http://krow.livejournal.com/572937.html

SG


paul at nfg

Feb 3, 2008, 8:25 AM

Post #4 of 9 (489 views)
Re: multifoo architecture [In reply to]

Paul J Stevens wrote:
> Simon Gray wrote:
>> Do you have any plans for the db connection pooling as yet?
>
> Sure do. The plan is as follows:
>
> I'm well underway with making dbmail fully event-driven using libevent. This
> will enable dbmail daemons to handle a lot of clients simultaneously on each
> preforked child. When that is finished (targeted at 2.3.3) forked children will
> be idle unless busy processing data or handling IO on the database channel.
>
> where we now have a 1-1-1 relation between client <-> dbmail process <->
> database connection we will move to a N-1-1 situation.

Just to give you all an update. I've just reached a milestone by having
Timo's imaptest tool run with 20 concurrent imap connections against a
*single* dbmail-imapd process without triggering any errors. I haven't
even tried with higher concurrencies :-) but above c15 the query code
appears to cause stalls in the client connections (mostly because the
server is too busy which leads to contention over the database
connector). This is mostly because imaptest is keeping the server
*really* busy.

So we now have a N-1-1 situation supporting around 20 or more clients
per dbmail process.

Next I need to bring the other daemons up to speed as well.

I'll do some more tests and will keep you all posted.

--
________________________________________________________________
Paul Stevens paul at nfg.nl
NET FACILITIES GROUP GPG/PGP: 1024D/11F8CD31
The Netherlands________________________________http://www.nfg.nl


paul at nfg

Feb 4, 2008, 8:15 AM

Post #5 of 9 (487 views)
Re: multifoo architecture [In reply to]

Paul J Stevens wrote:

> I'll do some more tests and will keep you all posted.

After more testing, I've concluded that preforking is a dead-end.

It doesn't solve throughput issues wrt the database backend.
It introduces a lot of problems with synchronizing the forked children which
pollutes the imap states.

Handling N clients on a single process with a single database connector is
*very* easy to optimize. Caching /cacheable/ query results then suddenly becomes
trivial, and synchronizing imap states of mailboxes, messages, flags, etc
suddenly becomes doable, rather than nigh impossible.
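
A minimal sketch of why caching gets easy in this model: with one process and one thread, a plain in-memory table needs no locking and no cross-process invalidation. Everything below is hypothetical (the key format, the sizes, and the `cache_*` names); it is not the cache infrastructure in dbmail, just the general idea.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 64
#define KEY_MAX 128
#define VAL_MAX 128

struct cache_entry {
    int used;
    char key[KEY_MAX];   /* e.g. a query string or a logical key */
    char val[VAL_MAX];   /* the cached result */
};

/* One table per process; no mutex needed while there is one thread. */
static struct cache_entry cache[CACHE_SLOTS];

static unsigned slot_for(const char *key)
{
    unsigned h = 5381;                       /* simple string hash */
    while (*key)
        h = h * 33u + (unsigned char)*key++;
    return h % CACHE_SLOTS;
}

/* Return the cached result, or NULL on a miss. */
static const char *cache_get(const char *key)
{
    struct cache_entry *e = &cache[slot_for(key)];
    return (e->used && strcmp(e->key, key) == 0) ? e->val : NULL;
}

/* Store a result; a colliding entry in the same slot is overwritten. */
static void cache_put(const char *key, const char *val)
{
    struct cache_entry *e = &cache[slot_for(key)];
    assert(strlen(key) < KEY_MAX && strlen(val) < VAL_MAX);
    snprintf(e->key, sizeof e->key, "%s", key);
    snprintf(e->val, sizeof e->val, "%s", val);
    e->used = 1;
}

/* Drop an entry after a write that changes the underlying data. */
static void cache_invalidate(const char *key)
{
    struct cache_entry *e = &cache[slot_for(key)];
    if (e->used && strcmp(e->key, key) == 0)
        e->used = 0;
}
```

The same process that answers a STATUS from the cache is the one that handles the APPEND or STORE that invalidates it, which is what makes the bookkeeping tractable here. With preforked children, every child would need to hear about the invalidation.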

So, I'm going to bite the bullet and nuke the preforking code.

I'm currently handling 100 concurrent clients with imaptest with low to moderate
command load without problems.

Using:

./imaptest clients=100 - logout=0 status=50 noop=50 delay=100

the postgres server (an old dual-PIII, u160, 1GB, raid1), using a single
postmaster process, never exceeds a load of 0.7, whereas the dbmail server
process running on my workstation is hitting 10% CPU. Duh.

And I haven't even begun optimizing the query stream, because I got sidetracked
trying to fix the preforking code. No more!

later,

--
________________________________________________________________
Paul Stevens paul at nfg.nl
NET FACILITIES GROUP GPG/PGP: 1024D/11F8CD31
The Netherlands________________________________http://www.nfg.nl


paul at nfg

Feb 15, 2008, 1:21 AM

Post #6 of 9 (459 views)
Re: multifoo architecture [In reply to]

Hi all,

Just sending another little update, since I merged the events branch into HEAD
late last night and it is on track for 2.3.3.

Apart from being event-driven, I've also refactored the imap code to be much
better at imap compliance. The imaptest tool (imapwiki.org) now passes all tests
except checkpointing.

This means that the whole server code has undergone some quite massively
invasive changes. As I explained earlier, the HEAD code now uses a single
process and a single database connector per daemon to handle /all/ incoming
clients. Finally, I've begun putting in some infrastructure for a simple global
(query) cache that may one day start talking to a memcache backend like we've
been discussing for ages.

Though performance is quite nice (and already way beyond 2.2) in most lab-based
scenarios, this approach has also introduced some very obvious scalability
limitations by design:

- a single process doesn't scale on multi-core systems
- a single database connector doesn't scale in case of longer running queries
like complex searches.

Especially the second limitation will need to be addressed before 2.4.

Still, despite these limitations I'm ecstatic about the state of things. The
clutter in the servercode is gone leaving a thin and easily maintainable layer.
And libevent is truly a rock solid layer for driving network IO; I love it!


later,

--
________________________________________________________________
Paul Stevens paul at nfg.nl
NET FACILITIES GROUP GPG/PGP: 1024D/11F8CD31
The Netherlands________________________________http://www.nfg.nl


aaron at serendipity

Feb 15, 2008, 2:42 AM

Post #7 of 9 (457 views)
Re: multifoo architecture [In reply to]

Just spent some time reading code. Looks awesome! Well done!

What I'm gathering is that at the moment we're still single threaded,
but because of libevent, we never have to worry about managing
non-blocking -- calls to eventbuffer_write never block, and
dbmail_imap_session_set_callbacks sets up the imap parsing as a callback
when data is available, with an imap session associated as an opaque
handle on each socket. Am I reading right? This looks really good so
far.

If I can get my hands on some time, it'd be fun to tack the monitoring
back on not as a thread, but as another file handle in the event loop. I
wonder if libevent knows how to mix and match unix domain sockets with
tcp sockets. It'd be neat to turn /var/run/dbmail-<service>.state into a
socket that outputs the state of all imap sessions when it is read.

Aaron




paul at nfg

Feb 15, 2008, 4:06 AM

Post #8 of 9 (459 views)
Re: multifoo architecture [In reply to]

Aaron Stone wrote:
> Just spent some time reading code. Looks awesome! Well done!

Glad you like it.

> What I'm gathering is that at the moment we're still single threaded,

yep.

> but because of libevent, we never have to worry about managing
> non-blocking -- calls to eventbuffer_write never block, and
> dbmail_imap_session_set_callbacks sets up the imap parsing as a callback
> when data is available, with an imap session associated as an opaque
> handle on each socket. Am I reading right? This looks really good so
> far.


dbmail_imap_session_set_callback is used only at the beginning of a
clientsession. The _only_ exception is IDLE. Since IDLE requires /very/ specific
client interactions, and does not allow interleaving with other commands, the
'normal' client callbacks as set up in imap4.c are overridden, and then reset
after the client terminates IDLE to resume normal command interaction. In
general, you don't want to mess with callbacks.

This has no equivalent in the other daemons; they all use fixed callbacks for
read/write/error which never change during the clientsession's lifetime. The
read callbacks use state-machines to keep track of what they are doing and
launch command handlers once the command parser is satisfied it has received a
full command (+data). You may want to check out how lmtp handles the DATA
command, or how imap and timsieve handle string literals to wait for
additional data.

Basically, what a read callback does:

while (readline_from_client(client_session, buffer)) {   // got some
    if (tokenize(client_session, buffer)) {              // done for now?
        do_command(client_session);
        reset_parser(client_session);
    }
}

pop3 is slightly different (simpler) in that in POP3 every line read from a
client constitutes a complete command: no multiline data or continuation like in
string literals or lmtp-DATA.
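
For readers who want to see the loop above run, here is a self-contained sketch of the same shape. It is not the dbmail code: `struct session`, `on_read()` and the two-state parser are made up for the example, the command handler is reduced to a counter, and the DATA handling is a heavily simplified stand-in for what lmtp does (no responses, no dot-unstuffing).

```c
#include <assert.h>
#include <string.h>

#define BUF_MAX 512

enum parse_state { ST_COMMAND, ST_DATA };

struct session {
    char inbuf[BUF_MAX];    /* bytes received but not yet consumed */
    size_t inlen;
    enum parse_state state; /* survives between read callbacks */
    int commands_run;       /* completed commands so far */
    int data_lines;         /* body lines seen in the last DATA */
};

/* Pop one LF- or CRLF-terminated line; return 0 if none is complete yet. */
static int readline_from_client(struct session *s, char *line, size_t cap)
{
    char *nl = memchr(s->inbuf, '\n', s->inlen);
    if (!nl)
        return 0;
    size_t n = (size_t)(nl - s->inbuf);   /* line length without '\n' */
    size_t used = n + 1;                  /* bytes to drop from inbuf */
    if (n > 0 && s->inbuf[n - 1] == '\r')
        n--;
    if (n >= cap)
        n = cap - 1;
    memcpy(line, s->inbuf, n);
    line[n] = '\0';
    memmove(s->inbuf, s->inbuf + used, s->inlen - used);
    s->inlen -= used;
    return 1;
}

/* Return 1 once a full command (+data) has been collected. */
static int tokenize(struct session *s, const char *line)
{
    if (s->state == ST_DATA) {
        if (strcmp(line, ".") == 0) {     /* end of data */
            s->state = ST_COMMAND;
            return 1;
        }
        s->data_lines++;
        return 0;
    }
    if (strcmp(line, "DATA") == 0) {      /* need more input first */
        s->state = ST_DATA;
        s->data_lines = 0;
        return 0;
    }
    return 1;                             /* single-line command */
}

/* The read callback: called with whatever bytes happened to arrive. */
static void on_read(struct session *s, const char *bytes, size_t len)
{
    char line[BUF_MAX];
    assert(s->inlen + len <= BUF_MAX);
    memcpy(s->inbuf + s->inlen, bytes, len);
    s->inlen += len;
    while (readline_from_client(s, line, sizeof line)) {
        if (tokenize(s, line))
            s->commands_run++;  /* stands in for do_command() + reset_parser() */
    }
}
```

The point of keeping the parser state in the session is that the callback can return at any byte boundary and pick up where it left off on the next read event.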

>
> If I can get my hands on some time, it'd be fun to tack the monitoring
> back on not as a thread, but as another file handle in the event loop. I
> wonder if libevent knows how to mix and match unix domain sockets with
> tcp sockets. It'd be neat to turn /var/run/dbmail-<service>.state into a
> socket that outputs the state of all imap sessions when it is read.

No problem at all. The only filehandles you can't use with the bufferevent api
are pipes(2). With libevent you can still use pipes if you want, but you'll need
to use the underlying event_set api. And you can easily allow run-time
configuration of the logging/monitoring socket. I'm a big fan of UDP streams for
aggregating such data, but any filehandle will do.


--
________________________________________________________________
Paul Stevens paul at nfg.nl
NET FACILITIES GROUP GPG/PGP: 1024D/11F8CD31
The Netherlands________________________________http://www.nfg.nl


paul at nfg

May 4, 2008, 11:06 AM

Post #9 of 9 (331 views)
Re: multifoo architecture [In reply to]

Hi all,

Thought I'd bring you up to speed with regard to the state of things to
come. Even though it's been a while since I released 2.3.2, I haven't
exactly been sitting around idly.

- libzdb

Dbmail has now fully switched to using libzdb for driving the
database layer. It's a beautiful little library that provides a simple
and elegant API for thread-safe connection pools, prepared statements
and exception handling.

During this transition I pushed libzdb's envelope a little, mainly
because our usage pattern differs significantly from what the libzdb
community normally sees. What this boils down to is that I was able to
expose a couple of important bugs in the libzdb code, each of which was
promptly fixed by the upstream developers. In many ways the release of
libzdb-2.2.1 today cleared the road for dbmail-2.3.3.

- threading

This is where most of the work before 2.4.0 still needs to be done.
Rather than use a thread-per-client (crude, simple, and very difficult
to do bug-free without doing massive amounts of mutex locking), I
decided to go for a more fine-grained pattern.

All network IO will be done by the main thread, but blocking tasks
(mainly database related) will be deferred to threads as much as
possible. Synchronizing with the main thread is being done with the
GAsyncQueue api from glib which provides a thread-safe mechanism for
inter-thread communication. A worker thread is spawned, does its thing,
builds up a result dataset, pushes this data onto the async queue, and
notifies the main event-loop in the main thread using a self-pipe. The
main event loop pops an element from the queue, and runs the callbacks
that are part of the data element to process the result. This is an
often-used pattern that is sometimes called thread-enter/thread-leave.
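
The worker/queue/self-pipe handoff described above can be sketched without any external libraries. Note the substitutions: a mutex-protected linked list stands in for glib's GAsyncQueue, plain poll() stands in for the libevent loop, and the "query" is just an integer handed to the worker. All names (`struct result`, `run_jobs()`, and so on) are hypothetical, and error handling is reduced to asserts.

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_JOBS 16

struct result {
    int value;                          /* the "result dataset" */
    void (*callback)(struct result *);  /* run by the main thread only */
    struct result *next;
};

static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static struct result *q_head, *q_tail;
static int wake_pipe[2];   /* [0] watched by the main loop, [1] written by workers */
static int handled_total;  /* touched only from the main thread */

static void queue_push(struct result *r)
{
    pthread_mutex_lock(&q_lock);
    r->next = NULL;
    if (q_tail) q_tail->next = r; else q_head = r;
    q_tail = r;
    pthread_mutex_unlock(&q_lock);
}

static struct result *queue_pop(void)
{
    pthread_mutex_lock(&q_lock);
    struct result *r = q_head;
    if (r) { q_head = r->next; if (!q_head) q_tail = NULL; }
    pthread_mutex_unlock(&q_lock);
    return r;
}

static void on_result(struct result *r) { handled_total += r->value; }

/* Worker: stands in for a blocking database query. */
static void *worker(void *arg)
{
    struct result *r = malloc(sizeof *r);
    assert(r != NULL);
    r->value = *(int *)arg;
    r->callback = on_result;
    queue_push(r);                             /* push first ...          */
    char byte = 1;
    ssize_t w = write(wake_pipe[1], &byte, 1); /* ... then wake main loop */
    assert(w == 1);
    (void)w;
    return NULL;
}

/* Main thread: spawn n workers, then process results as wakeups arrive. */
static int run_jobs(int *inputs, int n)
{
    pthread_t tids[MAX_JOBS];
    int done = 0, rc;
    assert(n > 0 && n <= MAX_JOBS);
    handled_total = 0;
    rc = pipe(wake_pipe);
    assert(rc == 0);
    for (int i = 0; i < n; i++) {
        rc = pthread_create(&tids[i], NULL, worker, &inputs[i]);
        assert(rc == 0);
    }
    (void)rc;
    struct pollfd pfd = { .fd = wake_pipe[0], .events = POLLIN };
    while (done < n) {
        char byte;
        if (poll(&pfd, 1, -1) <= 0) continue;            /* interrupted: retry */
        if (read(wake_pipe[0], &byte, 1) != 1) continue;
        struct result *r = queue_pop();  /* one queued element per wakeup byte */
        assert(r != NULL);
        r->callback(r);                  /* callback runs in the main thread */
        free(r);
        done++;
    }
    for (int i = 0; i < n; i++)
        pthread_join(tids[i], NULL);
    close(wake_pipe[0]);
    close(wake_pipe[1]);
    return handled_total;
}
```

Because each worker pushes its result before writing the wakeup byte, the main loop can always pop one element per byte it reads, and since callbacks only ever run in the main thread, the session state they touch needs no locking.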

Anyway, even though only a very few IMAP commands have gone threaded,
throughput under high concurrencies is quite good at the moment. And
with the stabilization of the database layer with the new release of
libzdb, I think the time for releasing 2.3.3 is upon us.

stay tuned.


--
________________________________________________________________
Paul Stevens paul at nfg.nl
NET FACILITIES GROUP GPG/PGP: 1024D/11F8CD31
The Netherlands________________________________http://www.nfg.nl