
Mailing List Archive: DBMail: users

Very slow imap4 performance seen when importing mail. Possible config problem?

 

 



dbmail-list at r

Aug 13, 2011, 4:10 AM

Post #1 of 19 (830 views)
Very slow imap4 performance seen when importing mail. Possible config problem?

Hello there!

I'm just looking to sanity check something, as I've gone over my MySQL
configuration, recreated (and moved to faster storage) my databases +
whatnot, to no avail.

Basics of the hardware: Q6600 @ 3.1GHz, two cores allocated to a VMware
Workstation 7.5.4 guest on a Windows 7 x64 host. There are two 7200 RPM
SATA-2 spindles on the machine, and the virtual hard disks within the
guest environment are placed on different spindles to help ensure
optimal performance.

The guest OS is Slackware 13.37 all patched up, 3GB RAM with HighMem
kernel using a 2G/2G split, 32-bit environment. Kernel is 2.6.35.7.

I had to build with the git version because rc2 doesn't compile at all
on this platform.

Dependency versions:

gmime-2.4.25
glib2-2.28.8
libevent-2.0.12
libsieve-2.2.7
libzdb-2.8.1
mysql-5.1.46-i486-2

--- my.cnf ---
[mysqld]
port = 3306
socket = /var/run/mysql/mysql.sock
skip-locking
key_buffer_size = 256M
max_allowed_packet = 20M
table_open_cache = 256
sort_buffer_size = 1M
read_buffer_size = 1M
read_rnd_buffer_size = 4M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size= 16M
thread_concurrency = 4
tmpdir = /tmp/
#log-update = /path-to-dedicated-directory/hostname

innodb_data_home_dir = /mail/db
innodb_data_file_path = ibdata1:40M:autoextend
innodb_log_group_home_dir = /mail/db
# You can set .._buffer_pool_size up to 50 - 80 %
# of RAM but beware of setting memory usage too high
innodb_buffer_pool_size = 256M
innodb_additional_mem_pool_size = 20M
# Set .._log_file_size to 25 % of buffer pool size
innodb_log_file_size = 64M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50

---------

The mysqld process has consumed nearly an hour of CPU time during this
process. dbmail is configured to use local sockets rather than network I/O.

I'm using the Perl MailTools http://search.cpan.org/dist/MailTools/
to import about 10 folders' worth of email, totaling about 560MB in raw
size, constituting about 23,000 emails. The script basically creates
the folders, and does an APPEND for each email. It's bog simple.
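
For readers who want to reproduce this kind of test, the loop described above can be sketched roughly as follows. This is not the original Perl script (which is not shown in the thread); it is a Python sketch using the standard imaplib and mailbox modules, and the host, credentials, and the `folders` mapping are all placeholders.

```python
import imaplib
import mailbox


def iter_messages(mbox_path):
    """Yield each message in an mbox file as raw bytes."""
    for msg in mailbox.mbox(mbox_path):
        yield msg.as_bytes()


def import_folder(conn, folder, raw_messages):
    """Create `folder`, APPEND every message to it, and return the count."""
    conn.create(folder)  # a NO response (folder already exists) is ignored here
    count = 0
    for raw in raw_messages:
        # flags=None and date_time=None let the server pick its defaults
        typ, data = conn.append(folder, None, None, raw)
        if typ != "OK":
            raise RuntimeError(f"APPEND to {folder} failed: {data!r}")
        count += 1
    return count


def run_import(host, user, password, folders):
    """`folders` maps an IMAP folder name to an mbox path (both hypothetical)."""
    conn = imaplib.IMAP4(host)
    conn.login(user, password)
    try:
        return sum(import_folder(conn, name, iter_messages(path))
                   for name, path in folders.items())
    finally:
        conn.logout()
```

Wrapping `run_import(...)` in the shell's `time` gives real/user/sys figures comparable to the ones quoted later in the thread.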

I DROP the database, recreate it, add the one user, verify that DBMail
accepts authentication for the newly created mailbox, and then do the
import. The MySQL files live on a freshly formatted ext4 filesystem.

The import takes Dovecot (MailDir or mdbox format), or Panda IMAP (mix)
about six minutes to complete.

DBMail 3 took 4h 23m. Casual inspection of the system showed modestly
high CPU usage in mysqld and dbmail-imapd (as well as the import perl
command on occasion), but the Load Average didn't get too close to 1.0,
let alone 2.0, which concerns me that I might have hit some kind of
"busy wait" pathology.

Have I picked the "wrong day" to check out a git version? Have I
flubbed the MySQL configuration? The documentation says 250 inbound
emails/second is possible - is this with a vast cluster comprised of
very fast CPU/RAM/SSD hard disks?
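
To put the two import times on the same scale, the figures quoted above can be turned into messages per second. This is only a back-of-the-envelope sketch: the message count is the approximate "about 23,000" from the post, and the helper name `rate` is made up for illustration.

```python
def rate(messages, hours=0, minutes=0, seconds=0.0):
    """Return messages per second for a run of the given duration."""
    elapsed = hours * 3600 + minutes * 60 + seconds
    return messages / elapsed


MESSAGES = 23_000  # "about 23,000 emails" from the post

dbmail = rate(MESSAGES, hours=4, minutes=23)  # DBMail 3 on MySQL 5.1
dovecot = rate(MESSAGES, minutes=6)           # Dovecot / Panda, "about six minutes"

print(f"DBMail : {dbmail:.2f} msg/s")
print(f"Dovecot: {dovecot:.1f} msg/s")
print(f"ratio  : {dovecot / dbmail:.0f}x")
```

On these numbers DBMail manages roughly 1.5 messages per second against roughly 64 for the file-based servers.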

I am doing some feature/performance analysis to narrow down a mail
backend selection, and DBMail sounds truly appealing. I'd like to find
out the problem to give this a fair evaluation.

Cheers!

=R=
_______________________________________________
DBmail mailing list
DBmail [at] dbmail
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail


h.reindl at thelounge

Aug 13, 2011, 9:01 AM

Post #2 of 19 (823 views)
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

if it is an option for you to upgrade mysql to 5.5
i would strongly recommend it; our dbmail machines
got a huge performance boost compared with 5.1.x

these are my innodb settings for the small server, using
compressed tables for some hundred physical addresses
and some thousand forwarders

some of these settings will boost 5.1 too, some of them
are only supported in recent 5.5

innodb_buffer_pool_size = 1024M
innodb_buffer_pool_instances = 2
innodb_purge_threads = 1
innodb_max_purge_lag = 200000
innodb_max_dirty_pages_pct = 60
innodb_additional_mem_pool_size = 80M
innodb_log_file_size = 512M
innodb_log_buffer_size = 64M
innodb_thread_concurrency = 24
innodb_thread_sleep_delay = 10
innodb_flush_log_at_trx_commit = 2
innodb_support_xa = 1
innodb_lock_wait_timeout = 50
innodb_table_locks = 0
innodb_checksums = 1
innodb_file_format = barracuda
innodb_file_per_table = 1
innodb_open_files = 600
innodb_io_capacity = 300
innodb_read_io_threads = 6
innodb_write_io_threads = 6
transaction-isolation = READ-COMMITTED


--

Best regards, Reindl Harald
the lounge interactive design GmbH
A-1060 Vienna, Hofmühlgasse 17
CTO / software-development / cms-solutions
p: +43 (1) 595 3999 33, m: +43 (676) 40 221 40
icq: 154546673, http://www.thelounge.net/

http://www.thelounge.net/signature.asc.what.htm


paul at nfg

Aug 13, 2011, 10:57 AM

Post #3 of 19 (817 views)
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

On 08/13/2011 01:10 PM, Robin Horforth wrote:

> The import takes Dovecot (MailDir or mdbox format), or Panda IMAP (mix)
> about six minutes to complete.

I'll take your word for it.

> DBMail 3 took 4h 23m. Casual inspection of the system showed modestly
> high CPU usage in mysqld and dbmail-imapd (as well as the import perl
> command on occasion), but the Load Average didn't get too close to 1.0,
> let alone 2.0, which concerns me that I might have hit some kind of
> "busy wait" pathology.

How many concurrent IMAP connections are you using? A load below 1
indeed sounds suspicious.

>
> Have I picked the "wrong day" to check out a git version? Have I
> flubbed the MySQL configuration? The documentation says 250 inbound
> emails/second is possible - is this with a vast cluster comprised of
> very fast CPU/RAM/SSD hard disks?

That number is about retrieval. And then only on POP3. And a very old
number (dbmail-1.0?).

I'd be curious about more up-to-date numbers.

Also, I'm not at all familiar with IO performance on VMWare, but on Xen
IO really lags behind bare-iron databases. And that's no joke. On Xen
hosts running mysql-5.5 I need to keep thread-concurrency at 1 (read:
one) to keep throughput at acceptable levels with IMAP concurrencies
around 50-100 connections, POP3 around 10-20, with an added 5 LMTP
connections for insertions.

That said: dbmail will never be able to achieve the kind of numbers you
mention for dovecot or panda. The single-instance storage used for both
mime-blobs and header values, and fully indexed headers prevent it.
DBMail insertion is not about creating simple files in directories.


--
________________________________________________________________
Paul J Stevens pjstevns @ gmail, twitter, skype, linkedin

* Premium Hosting Services and Web Application Consultancy *

www.nfg.nl/info [at] nfg/+31.85.877.99.97
________________________________________________________________


h.reindl at thelounge

Aug 13, 2011, 11:06 AM

Post #4 of 19 (820 views)
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

On 13.08.2011 19:57, Paul J Stevens wrote:

> Also, I'm not at all familiar with IO performance on VMWare, but on Xen IO really lags
> behind bare-iron databases. And that's no joke. On Xen hosts running mysql-5.5 I need
> to keep thread-concurrency at 1 (read: one) to keep throughput at acceptable levels
> with IMAP concurrencies around 50-100 connections, POP3 around 10-20, with an added
> 5 LMTP connections for insertions

we have our whole business running on VMware ESXi
there is no difference between a VMware guest and bare-iron

one of our dbmail machines is currently even running on VMware Workstation
on a CentOS 5.5 host, working wonderfully. i only had a real performance problem
with a CentOS guest on this machine while Fedora is running smoothly; maybe
there is some problem depending on the host-guest combination which even
VMware support cannot explain. running the same image on a Fedora host gives no
problems

conclusion: normally there is no problem with VMware, but for business
i would strongly recommend ESXi instead of Workstation, on certified
hardware (HP ProLiant in our environment). we have been running the whole company
on this since 2008 without any issue, and i am speaking about webservers,
dbmail, fileservers with samba and netatalk, barracuda spamfirewall as
virtual appliance... most of the guests are running smoother than on
bare metal, especially boot times


dbmail-list at r

Aug 13, 2011, 7:50 PM

Post #5 of 19 (819 views)
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

On 8/13/2011 10:57 AM, Paul J Stevens wrote:

> I'll take your word for it.

== Dovecot 2.0.13 ==

mdbox (no SiS):

real 6m14.485s
user 2m26.425s
sys 0m20.389s
578548 /mail/home

mdbox w/attachment SiS (sis posix mode, SHA1):

real 6m34.305s
user 2m28.025s
sys 0m16.030s
363724 /mail/attachments
157316 /mail/home/usertest
521040 total

Maildir (ext4 + dir_index, no SiS attachment scheme):

real 6m46.217s
user 2m26.521s
sys 0m16.109s
612992 /mail/home

> How many concurrent IMAP connections are you using?

1, single-threaded. I start testing very "simple", and work up to multi-threaded/multi-user slams once I establish a functional baseline.

I "preload" all of the imported mailboxes into RAM via cat * > /dev/null to reduce the "read cost" to near zero.

> A load below 1 indeed sounds suspicious.

I should have obtained mysql process status, but I only saw one mysqld worker process during the operation.

Should I configure innodb differently?

Reindl Harald wrote:

> if it is an option for you to upgrade mysql to 5.5
> i would strongly recommend it; our dbmail machines
> got a huge performance boost compared with 5.1.x

I'll try this on the bare metal system if I don't see much improvement, thanks Reindl. I'll give your my.cnf settings a try as well, much appreciated.

> That number is about retrieval. And then only on POP3. And a very old
> number (dbmail-1.0?).
>
> I'd be curious about more up-to-date numbers.

I'd be more than happy to provide some once I've gotten things working properly.

Do you have a performance testing framework or set of scripts? I have the ImapTest suite, but that's more for validating compliance with the imap4r1 spec than performance. I know about MSTONE http://mstone.sourceforge.net/ but a tool that targets areas that developers suspect/worry are weak would seem sensible.

> Also, I'm not at all familiar with IO performance on VMWare, but on Xen
> IO really lags behind bare-iron databases. And that's no joke.

I have the pvscsi, vmxnet3, and other "driver helpers", and support for a paravirtualised kernel/timers, etc running and actually see very good I/O performance.

I know hdparm -t -T isn't a real disk I/O performance tool, but:

/dev/sda: <-- /tmp lives here
Timing cached reads: 7902 MB in 1.97 seconds = 4005.51 MB/sec
Timing buffered disk reads: 182 MB in 3.03 seconds = 60.02 MB/sec

/dev/sdb: <-- this is the MySQL spindle
Timing cached reads: 7702 MB in 1.97 seconds = 3901.96 MB/sec
Timing buffered disk reads: 236 MB in 3.00 seconds = 78.63 MB/sec

These are 3-4 year old Samsung 500GB drives, and this performance is about right when compared to what the host experiences natively. They're all formatted with ext4 (implying has_journal, extent, huge_file, flex_bg, uninit_bg, dir_nlink, extra_isize, sparse_super, filetype, resize_inode, dir_index, ext_attr)

> I need to keep thread-concurrency at 1 (read:
> one) to keep throughput at acceptable levels with IMAP concurrencies
> around 50-100 connections, POP3 around 10-20, with an added 5 LMTP
> connections for insertions.

Yuck, are you serious? Maybe splitting the table across multiple spindles would help.

I can try a "bare iron" test machine today. What is your "blessed" or preferred Linux Distro where you do your development and regression testing? I have no distro religious issues, though being a control freak, I will usually run "home" to the minimalistic comfort of Slackware.

> That said: dbmail will never be able to achieve the kind of numbers you
> mention for dovecot or panda. The single-instance storage used for both
> mime-blobs and header values, and fully indexed headers prevent it.
> DBMail insertion is not about creating simple files in directories.

I appreciate this difference. I realise that with Dovecot's scheme, they defer indexing until a user does an IMAP SEARCH TEXT/BODY, making the user wait a bit for that initial search before updating the index. I suppose to be fair, I should fire off a SEARCH for a known-to-be-non-existent string before closing the mailbox. It'd be trivial to update the script to do this since I am worried I'm not comparing "apples to apples" here.

I appreciate all of the replies I've received so far, especially from one of the developers.

=R=



dbmail-list at r

Aug 13, 2011, 8:14 PM

Post #6 of 19 (820 views)
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

On 8/13/2011 10:57 AM, Paul J Stevens wrote:
> On 08/13/2011 01:10 PM, Robin Horforth wrote:
>
>> The import takes Dovecot (MailDir or mdbox format), or Panda IMAP (mix)
>> about six minutes to complete.
>
> I'll take your word for it.

As a follow-up: I added imap->search() after EACH append() on the same VMware setup, which gives me the following import times on Dovecot 2.0.13 (SiS posix attachment storage, mdbox format):

real 45m42.943s
user 2m39.620s
sys 0m21.671s

222616 /mail/home/usertest
363724 /mail/attachments/
586340 total

This reflects a full-text search indexing configuration of "fts_squat = partial=4 full=10", which provides IMAP4rev1-compliant substring searching in both headers and body.

The gap's closed considerably of course, and I feel this is closer to an equal playing field in the way of comparing append/insertion performance. But the difference in performance is still very large. Sure, I expect to "pay" something for using SQL and a relatively higher latency backing store - but I didn't realise it'd be this much.

As a sanity check to ensure the searches are being properly accelerated by Dovecot's FTS indexing, I performed a quick set of IMAP SEARCH TEXT commands against some of the larger mailboxes which look at "not found", "all found", and substring-stressor "moderate" result sets for comparison purposes:

Searching INBOX2010 #msgs = 4729 (Cold Indexing Time=14.784021s)

[NOFIND] Time=0.02358, matches=0
[date] Time=0.009816, matches=4729
[here] Time=0.006297, matches=2644

Searching INBOX2006 #msgs = 4149 (Cold Indexing Time=7.055049s)
[NOFIND] Time=0.020683, matches=0
[date] Time=0.008533, matches=4149
[here] Time=0.004641, matches=1734

With all FTS functionality turned off:

Searching INBOX2010 #msgs = 4729
[NOFIND] Time=6.815283, matches=0
[date] Time=0.494466, matches=4729
[here] Time=5.035462, matches=2643

Searching INBOX2006 #msgs = 4149
[NOFIND] Time=2.721383, matches=0
[date] Time=0.374643, matches=4149
[here] Time=2.156622, matches=1734

IMAP-compliant SEARCHes are indeed benefiting from the fts_squat option.
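
A timing harness of the kind that produced the figures above might look roughly like this. It is a sketch in Python with imaplib, not the script actually used; the function names are invented, and the search terms are placeholders that mirror the labels in the output (the real "not found" string is not given in the thread).

```python
import time


def timed_search(conn, term):
    """Run IMAP SEARCH TEXT for `term`; return (seconds, match_count)."""
    start = time.perf_counter()
    typ, data = conn.search(None, "TEXT", f'"{term}"')
    elapsed = time.perf_counter() - start
    if typ != "OK":
        raise RuntimeError(f"SEARCH failed: {data!r}")
    # data[0] is a space-separated list of message numbers (empty if no hits)
    ids = data[0].split() if data and data[0] else []
    return elapsed, len(ids)


def report(conn, folder, terms=("NOFIND", "date", "here")):
    """Select `folder` read-only and print one timing line per search term."""
    typ, data = conn.select(folder, readonly=True)
    print(f"Searching {folder} #msgs = {int(data[0])}")
    for term in terms:
        elapsed, matches = timed_search(conn, term)
        print(f"[{term}] Time={elapsed:.6f}, matches={matches}")
```

Here `conn` is assumed to be an already logged-in `imaplib.IMAP4` connection. Running `report` twice against a freshly imported mailbox separates the cold (index-building) search from the warm ones.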

=R=


paul at nfg

Aug 15, 2011, 12:21 AM

Post #7 of 19 (805 views)
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

On 08/14/2011 04:50 AM, Robin Horforth wrote:
> I "preload" all of the imported mailboxes into RAM via cat * >
> /dev/null to reduce the "read cost" to near zero.

I don't think reading the raw mbox files will pose much of a bottleneck.

>> A load below 1 indeed sounds suspicious.
>
> I should have obtained mysql process status, but I only saw one
> mysqld worker process during the operation.

Mysqld will never show more than one worker process. It's
multi-threaded, not forking. mysqladmin processlist should be more
informative.

> Should I configure innodb differently?

Doubtful. Again, the low system load seems to indicate low concurrency.
Message insertion is quite intensive on the database. Lots of inserts
going on on multiple tables.


> Do you have a performance testing framework or set of scripts? I
> have the ImapTest suite, but that's more for validating compliance
> with the imap4r1 spec than performance. I know about MSTONE
> http://mstone.sourceforge.net/ but a tool that targets areas that
> developers suspect/worry are weak would seen sensible.

There's all the stuff in test-scripts/. But most of that is for
regressions, not load. Only real load-tester is testlmtp.py. The torture
script was for finding leaks in 2.2. It would need serious updating.

I've played with mstone, but never got around to driving development
based on its results.

imaptest is another story. That one has helped a *lot* making dbmail a
lot more compliant. All Hail Timo!

>> I need to keep thread-concurrency at 1 (read: one) to keep
>> throughput at acceptable levels with IMAP concurrencies around
>> 50-100 connections, POP3 around 10-20, with an added 5 LMTP
>> connections for insertions.
>
> Yuck, are you serious? Maybe splitting the table across multiple
> spindles would help.

Maybe. But above situation is just my emergency backup when my main
bare-iron server goes down (flaky sata backplane). I also suspect that
mysql/innodb is not quite mvcc since I've noticed a lot of read-locks
blocking on write locks, leading to stampeding herd problems on slow
disks: xen host using local sata storage.

> I can try a "bare iron" test machine today. What is your "blessed"
> or preferred Linux Distro where you do your development and
> regression testing? I have no distro religious issues, though being
> a control freak, I will usually run "home" to the minimalistic
> comfort of Slackware.

I do all development on ubuntu desktops. Main servers run debian. But I
don't think it should matter.

>> That said: dbmail will never be able to achieve the kind of numbers
>> you mention for dovecot or panda. The single-instance storage used
>> for both mime-blobs and header values, and fully indexed headers
>> prevent it. DBMail insertion is not about creating simple files in
>> directories.
>
> I appreciate this difference. I realise that with Dovecot's scheme,
> they defer indexing until a user does an IMAP SEARCH TEXT/BODY,
> making the user wait a bit for that initial search before updating
> the index.

Maybe dbmail should have done the same when I first did the header
caching tables.... Deferred indexing would speed things up a lot.


--
________________________________________________________________
Paul J Stevens pjstevns @ gmail, twitter, skype, linkedin

* Premium Hosting Services and Web Application Consultancy *

www.nfg.nl/info [at] nfg/+31.85.877.99.97
________________________________________________________________


dbmail-list at r

Aug 15, 2011, 3:16 AM

Post #8 of 19 (803 views)
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

> I don't think reading the raw mbox files will pose much of a bottleneck.

I know, I just wanted to reduce that to near zero impact from run to run.

> Mysqld will never show more than one worker process. It's
> multi-threaded, not forking. mysqladmin processlist should be more
> informative.

Gotcha.

I'll try PostgreSQL 9.0.x next. My preliminary tests show it to be much faster on these sorts of operations.

> Doubtful. Again, the low system load seems to indicate low concurrency.
> Message insertion is quite intensive on the database. Lots of inserts
> going on on multiple tables.

Understood. Would I be correct in assuming that having one or more simultaneous APPEND processes running at the same time would show more effective messages/second mailbox insertion performance? I'm happy to extend this to 2, 4, 8, 100 APPEND processes, if it would expose different performance limitations (in ANY of my tested products).

> I do all development on ubuntu desktops. Main servers run debian. But I
> don't think it should matter.

I'll try the bare iron tests on Debian then.

Incidentally, the IMAP SEARCH TEXT results using dbmail+mysql are a bit odd.

Searching INBOX #msgs = 24714
[NOFIND] Time=2.072423, matches=24714 <--- this should be zero *BUG*
[date] Time=2.07519, matches=24714 <--- this is correct
[here] Time=2.072075, matches=24714 <--- this should be about 30% of total # of msgs *BUG*

Does dbmail break IMAP SEARCH TEXT (i.e., search both body + headers)? Is this a result of relying on MySQL's search algorithms in text-like fields? I'm still puzzled, because I can't believe that 'here' appears in EVERY email. It looks like dbmail's returning EVERY email on a SEARCH TEXT. This is not correct operation.

When I alter the search to use "FROM" as the key instead of "TEXT", the results are more discriminating and meet expectations.

Searching INBOX #msgs = 24714
[NOFIND] Time=2.161049, matches=0
[james] Time=2.273255, matches=1049
[here] Time=2.165406, matches=2

Not that it matters, but it's much slower than Dovecot's fts_squat for substring searches.

Dovecot's fts_squat IMAP SEARCH TEXT results are:

Searching INBOX #msgs = 55731
[Updating Index] Time=78.184637 (66% of the mailbox unindexed at start)
[NOFIND] Time=0.045654, matches=0
[date] Time=0.13364, matches=55731
[here] Time=0.069091, matches=24663

The matched sets are correct, incidentally.

OK, so MySQL isn't the fastest writer in the toolbox, but I expected a bit more from it on searches.

=R=


paul at nfg

Aug 15, 2011, 5:03 AM

Post #9 of 19 (798 views)
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

On 08/15/2011 12:16 PM, Robin Horforth wrote:
> I'll try PostgreSQL 9.0.x next. My preliminary tests show it to be
> much faster on these sorts of operations.

Very interesting! Keep us posted.

>> Doubtful. Again, the low system load seems to indicate low
>> concurrency. Message insertion is quite intensive on the database.
>> Lots of inserts going on on multiple tables.
>
> Understood. Would I be correct in assuming that having one or more
> simultaneous APPEND processes running at the same time would show
> more effective messages/second mailbox insertion performance? I'm
> happy to extend this to 2, 4, 8, 100 APPEND processes, if it would
> expose different performance limitations (in ANY of my tested
> products).

That would make sense. APPEND is fully threaded, and should quickly
saturate the connection pool. Check and tune max_db_connections in
dbmail.conf to find the sweet spot for the backend server used. But I
have no experience with how this would scale for this particular scenario.
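
A multi-connection variant of the import could be sketched as below, to see whether throughput rises with concurrency. This is an assumption-laden Python sketch: `parallel_append` and the `connect` factory are hypothetical names, threads stand in for the separate APPEND processes discussed above (the server sees one connection per worker either way), and nothing here is taken from DBMail's own test scripts.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def parallel_append(connect, folder, raw_messages, workers=4):
    """APPEND `raw_messages` over `workers` connections.

    `connect` is a caller-supplied factory returning a logged-in
    imaplib.IMAP4-like object (hypothetical; one connection per worker).
    Returns (messages_appended, elapsed_seconds).
    """
    # Deal the messages out round-robin, one chunk per worker.
    chunks = [raw_messages[i::workers] for i in range(workers)]

    def worker(chunk):
        conn = connect()
        try:
            for raw in chunk:
                typ, data = conn.append(folder, None, None, raw)
                if typ != "OK":
                    raise RuntimeError(f"APPEND failed: {data!r}")
        finally:
            conn.logout()
        return len(chunk)

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        done = sum(pool.map(worker, chunks))
    return done, time.perf_counter() - start
```

Repeating the run with `workers` set to 1, 2, 4, 8 and so on, while adjusting max_db_connections on the server side, would show where messages per second stops improving.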

>
>> I do all development on ubuntu desktops. Main servers run debian.
>> But I don't think it should matter.
>
> I'll try the bare iron tests on Debian then.
>
> Incidentally, the IMAP SEARCH TEXT results using dbmail+mysql are a
> bit odd.
>
> Searching INBOX #msgs = 24714
> [NOFIND] Time=2.072423, matches=24714 <--- this should be zero *BUG*
> [date] Time=2.07519, matches=24714 <--- this is correct
> [here] Time=2.072075, matches=24714 <--- this should be about 30% of total # of msgs *BUG*
>
> Does dbmail break IMAP SEARCH TEXT (i.e., search both body +
> headers)? Is this a result of relying on MySQL's search algorithms
> in text-like fields? I'm still puzzled, because I can't believe that
> 'here' appears in EVERY email. It looks like dbmail's returning
> EVERY email on a SEARCH TEXT. This is not correct operation.

Indeed. Looks like search is broken. I think there's already a bug report
for that on the tracker.

That said: SEARCH TEXT is done very poorly at the moment: no FTI, full
table scans using HAVING LIKE queries. FTI is sorely needed here.
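
Why a substring LIKE ends up scanning everything can be illustrated with a toy example. The sketch below uses SQLite (via Python's sqlite3 module) rather than MySQL, a made-up single table `parts` rather than DBMail's real schema, and a WHERE clause rather than HAVING, so it only illustrates the general point: an ordinary B-tree index cannot narrow a pattern that starts with a wildcard.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's EXPLAIN QUERY PLAN detail strings for `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the detail text is the last column


conn = sqlite3.connect(":memory:")
# Hypothetical one-table stand-in; DBMail's real schema is different.
conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("CREATE INDEX idx_body ON parts (body)")
conn.executemany("INSERT INTO parts (body) VALUES (?)",
                 [(f"message {i} says hello",) for i in range(1000)])

substring = query_plan(conn, "SELECT id FROM parts WHERE body LIKE ?",
                       ("%here%",))
exact = query_plan(conn, "SELECT id FROM parts WHERE body = ?",
                   ("message 7 says hello",))

print("substring:", substring)  # a SCAN: every row or index entry is examined
print("exact    :", exact)      # a SEARCH: the index narrows the lookup
```

A full-text index is the usual way around this, which is what FTI refers to above.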

> When I alter the search to use "FROM" as the key instead of "TEXT",
> the results are more discriminating and meet expectations.

Ok.

>
> OK, so MySQL isn't the fastest writer in the toolbox, but I expected
> a bit more from it on searches.

Blame me.


--
________________________________________________________________
Paul J Stevens pjstevns @ gmail, twitter, skype, linkedin

* Premium Hosting Services and Web Application Consultancy *

www.nfg.nl/info [at] nfg/+31.85.877.99.97
________________________________________________________________


tokul at users

Aug 15, 2011, 7:29 AM

Post #10 of 19 (806 views)
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

2011.08.14 05:50 Robin Horforth wrote:
> Do you have a performance testing framework or set of scripts? I have
> the ImapTest suite, but that's more for validating compliance with the
> imap4r1 spec than performance.

Dovecot's imaptest utility does not validate compliance with imap4rev1.
Some tests performed by the utility depend on optional imap4rev1 features and
assume that those features are required. For example, the utility assumes that
custom flags are required and tries to use them even when the server does not
declare support for custom flags. An IMAP server's failure to comply with the
untagged OK response format does not generate any non-compliance report;
it generates non-descriptive errors in the utility.

The utility tests imap4rev1 and some IMAP extensions. It is more of a shared
IMAP mailbox and stress testing tool. Test results might indicate
non-compliance, but any non-compliance report must be manually verified.

--
Tomas






dbmail-list at r

Aug 15, 2011, 8:52 PM

Post #11 of 19 (799 views)
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

On 8/15/2011 5:03 AM, Paul J Stevens wrote:
> On 08/15/2011 12:16 PM, Robin Horforth wrote:
>> I'll try PostgreSQL 9.0.x next. My preliminary tests show it to be
>> much faster on these sorts of operations.
>
> Very interesting! Keep us posted.

Wow.

Some disclaimers.

1) I'm brain dead with PostgreSQL.
2) I've set it up and deployed only about 4 systems with it.
3) I compiled mine from 9.0.4 "stock" sources, and installed it stone
cold stock. I don't know anything about PostgreSQL (see #1), so I didn't
touch *ANY* tuning values in the configuration.

Same VM as before that broke MySQL's back.

18577 msgs (560MB) imported in:

real 65m7.098s
user 2m27.978s
sys 0m16.982s

$ ~/searchtest
Searching INBOX #msgs = 18577
[Updating Index] Time=1.418382 (COLD search)
[NOFIND] Time=0.489393, matches=18577
[date] Time=0.315028, matches=18577
[here] Time=0.323558, matches=18577

The import was about 3 *TIMES* faster.

OK, I'm always suspicious when I see differences as large as these.
Maybe I need to upgrade my MySQL or recompile it with targeted tunings?

I'll try MySQL 5.5 next.

This new result changes things a bit.

=R=


dbmail-list at r

Aug 16, 2011, 2:39 AM

Post #12 of 19 (800 views)
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

On 8/15/2011 5:03 AM, Paul J Stevens wrote:
> On 08/15/2011 12:16 PM, Robin Horforth wrote:
>> I'll try PostgreSQL 9.0.x next. My preliminary tests show it to be
>> much faster on these sorts of operations.
>
> Very interesting! Keep us posted.

Interesting.

When I run a duplicate import pass (adding the same emails to the INBOX,
hence, doubling the size of the mailbox), I see a worrying performance
reduction.

(Same VM setup, PostgreSQL 9.0.4 with stock configuration):

18577 msgs imported to INBOX:

real 131m17.473s
user 2m15.263s
sys 0m30.752s

(This is about twice as long as the first 18577 messages)

$ ~/searchtest

Searching INBOX #msgs = 37154
[Updating Index] Time=1.734378 (cold search)
[NOFIND] Time=0.811997, matches=37154
[date] Time=0.677009, matches=37154
[here] Time=0.683351, matches=37154

This still makes MySQL 5.1 look very poor indeed.
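
The two PostgreSQL import timings reported in this thread can be turned into throughput figures to make the "about twice as long" comparison concrete. This is a minimal sketch, not part of the original test scripts: the function names are hypothetical, and the only inputs are the message count and the `real` lines quoted in the posts.

```python
import re


def parse_real_time(line):
    """Convert a `time` line such as 'real 65m7.098s' to seconds."""
    match = re.fullmatch(r"real\s+(\d+)m([\d.]+)s", line.strip())
    if match is None:
        raise ValueError(f"unrecognized time line: {line!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def msgs_per_second(messages, real_line):
    """Average import throughput for one run."""
    return messages / parse_real_time(real_line)


if __name__ == "__main__":
    # Figures copied from the two PostgreSQL 9.0.4 import passes.
    first = msgs_per_second(18577, "real 65m7.098s")
    second = msgs_per_second(18577, "real 131m17.473s")
    print(f"first pass:  {first:.2f} msgs/s")
    print(f"second pass: {second:.2f} msgs/s")
    print(f"slowdown:    {first / second:.2f}x")
```

Averages like these hide how the rate changes within a run, so they only show that the second pass was slower overall, not where the time went.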

=R=


dbmail-list at r

Aug 23, 2011, 1:37 PM

Post #13 of 19 (753 views)
Permalink
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

On 8/13/2011 9:01 AM, Reindl Harald wrote:
> if it is an option for you to upgrade mysql to 5.5
> i would strongly recommend it, our dbmail machines
> got a huge performance boost compared to 5.1.x

OK, I finally got around to trying MySQL 5.5 as a final test.

I'm still using the MySQL 5.1 client libraries, incidentally.

Importing 18577 messages (560MB) into a single folder:

real 320m32.330s
user 2m25.413s
sys 0m21.914s

MySQL used 217 minutes' worth of CPU during this process.

Searching Tests #msgs = 18577
[Updating Index] Time=2.29237, matches=18577
[NOFIND] Time=2.102401, matches=18577
[date] Time=1.958493, matches=18577
[here] Time=2.016579, matches=18577

Server settings (same 3GB HighMem + 2G/2G split virtual machine):

skip-external-locking
key_buffer_size = 256M
max_allowed_packet = 16M
table_open_cache = 256
sort_buffer_size = 32M
read_buffer_size = 32M
read_rnd_buffer_size = 4M
myisam_sort_buffer_size = 128M
thread_cache_size = 8
query_cache_size= 128M
thread_concurrency = 4
innodb_data_home_dir = /mail/db
innodb_data_file_path = ibdata1:500M:autoextend
innodb_log_group_home_dir = /mail/db
innodb_buffer_pool_size = 512M
innodb_additional_mem_pool_size = 40M
innodb_log_file_size = 128M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50

Yikes.

=R=


h.reindl at thelounge

Aug 23, 2011, 4:32 PM

Post #14 of 19 (781 views)
Permalink
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

you do know that most buffer sizes are per connection, so with
100 connections your machine will consume 3200 MB of memory
for each 32M per-connection buffer?
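
The arithmetic behind that warning can be sketched as follows. This is a rough illustration only: it assumes the per-connection buffers in question are `sort_buffer_size`, `read_buffer_size` and `read_rnd_buffer_size` with the values from the posted config, and that every connection allocates all of them in full at the same time. MySQL allocates these buffers on demand, so the result is a worst-case upper bound, and the function name and the figure of 100 connections are illustrative.

```python
# Worst-case estimate of per-connection buffer memory in MySQL.
# Assumption: every connection allocates each buffer in full at once.
# MySQL allocates them on demand, so this is an upper bound only.

# Per-connection buffers from the posted config (sizes in MiB).
PER_CONNECTION_BUFFERS_MIB = {
    "sort_buffer_size": 32,
    "read_buffer_size": 32,
    "read_rnd_buffer_size": 4,
}


def worst_case_mib(buffers_mib, connections):
    """Return {buffer name: MiB used if all connections allocate it}."""
    return {name: size * connections for name, size in buffers_mib.items()}


if __name__ == "__main__":
    usage = worst_case_mib(PER_CONNECTION_BUFFERS_MIB, connections=100)
    for name, mib in usage.items():
        print(f"{name}: {mib} MiB")
    print(f"total: {sum(usage.values())} MiB")
```

Each 32M buffer accounts for the 3200 MB mentioned above; on a 3 GB machine even one of them could not be fully allocated by 100 connections.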

try the settings below and i bet your machine will be much faster.
your "innodb_log_buffer_size" is way too small, and instead of wasting
memory on per-connection buffers, use it for "innodb_buffer_pool_size",
which in a perfect world should be as big as the whole database

some of these settings made the difference between creeping and flying
with a 15 GB dbmail database on a virtual machine with 10 GB RAM; in
summary, MySQL 5.5 with all these settings is really, really fast
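
How close a given setup is to the "buffer pool as big as the whole database" ideal can be estimated with a small helper. This is a sketch under stated assumptions: `parse_size` and `pool_coverage` are hypothetical names, sizes use the K/M/G suffixes accepted in my.cnf, and the example numbers are the 15 GB database and the 5120M buffer pool mentioned in this message.

```python
# Binary multipliers for the size suffixes used in my.cnf.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value):
    """Convert a my.cnf size such as '512M' or '320K' to bytes."""
    value = value.strip().upper()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def pool_coverage(buffer_pool, database_size):
    """Fraction of the database that fits in the buffer pool (max 1.0)."""
    return min(1.0, parse_size(buffer_pool) / parse_size(database_size))


if __name__ == "__main__":
    # 5120M buffer pool against a 15 GB database.
    print(f"coverage: {pool_coverage('5120M', '15G'):.0%}")
```

A coverage well below 100% is not necessarily a problem, since only the frequently used pages need to stay cached, but it gives a quick sense of how much of the data can be served from memory.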

mysql 5.5 optimizations, if there is enough memory:
innodb_buffer_pool_size = 5120M
innodb_buffer_pool_instances = 5
innodb_purge_threads = 1
innodb_max_purge_lag = 200000
innodb_thread_concurrency = 32
innodb_thread_sleep_delay = 10
innodb_read_io_threads = 8
innodb_write_io_threads = 8
innodb_io_capacity = 600
transaction-isolation = READ-COMMITTED

"innodb_io_capacity" with 600 assumes really fast disks!
_______________

however, these are the recommended changes for 5.1

i hope i have stripped all my mysql 5.5 optimizations; if not, and it
fails to start, look in the error log for unknown variables

key_buffer_size = 256M
sort_buffer_size = 320K
read_rnd_buffer_size = 256K
join_buffer_size = 320K
read_buffer_size = 128K
preload_buffer_size = 128K
myisam_sort_buffer_size = 128M
innodb_buffer_pool_size = 512M
innodb_max_dirty_pages_pct = 60
innodb_additional_mem_pool_size = 64M
innodb_log_file_size = 128M
innodb_log_buffer_size = 256M
innodb_thread_concurrency = 8
innodb_thread_sleep_delay = 10
innodb_flush_log_at_trx_commit = 2
innodb_support_xa = 1
innodb_lock_wait_timeout = 50
innodb_table_locks = 0


On 23.08.2011 22:37, Robin Horforth wrote:
> skip-external-locking
> key_buffer_size = 256M
> max_allowed_packet = 16M
> table_open_cache = 256
> sort_buffer_size = 32M
> read_buffer_size = 32M
> read_rnd_buffer_size = 4M
> myisam_sort_buffer_size = 128M
> thread_cache_size = 8
> query_cache_size= 128M
> thread_concurrency = 4
> innodb_data_home_dir = /mail/db
> innodb_data_file_path = ibdata1:500M:autoextend
> innodb_log_group_home_dir = /mail/db
> innodb_buffer_pool_size = 512M
> innodb_additional_mem_pool_size = 40M
> innodb_log_file_size = 128M
> innodb_log_buffer_size = 8M
> innodb_flush_log_at_trx_commit = 1
> innodb_lock_wait_timeout = 50


dbmail-list at r

Aug 23, 2011, 4:51 PM

Post #15 of 19 (756 views)
Permalink
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

On 8/23/2011 4:32 PM, Reindl Harald wrote:
> you do know that most buffer sizes are per connection, so with
> 100 connections your machine will consume 3200 MB of memory
> for each 32M per-connection buffer?

I was doing a single user import + search validation test initially, so
that's fine.

I've tried your 5.5 recommendations, scaled down to my VM's RAM
allocation (3GB). I appreciate the help!

Do you build your MySql 5.5 from source, by the way? Since their
changeover to CMake, I've had problems getting the sources to build
completely. I suspect I need to read more documentation about how to
pass configuration options to the CMake build tools.

> try the settings below and i bet your machine will be much faster.
> your "innodb_log_buffer_size" is way too small, and instead of wasting
> memory on per-connection buffers, use it for "innodb_buffer_pool_size",
> which in a perfect world should be as big as the whole database

Thanks for these, Reindl! I have the database files located on ext4
formatted to *NOT* use a journal, as I figured to give this the best
chance to perform well.

=R=


h.reindl at thelounge

Aug 23, 2011, 5:10 PM

Post #16 of 19 (758 views)
Permalink
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

On 24.08.2011 01:51, Robin Horforth wrote:
> On 8/23/2011 4:32 PM, Reindl Harald wrote:
>> you do know that most buffer sizes are per connection, so with
>> 100 connections your machine will consume 3200 MB of memory
>> for each 32M per-connection buffer?
>
> I was doing a single user import + search validation test initially, so that's fine.

yes, but depending on the maximum allowed connections your machine will
crash, and uselessly big buffers slow things down because the memory
must be allocated

> I've tried your 5.5 recommendations, scaled down to my VM's RAM allocation (3GB). I appreciate the help!

is it faster now?
the settings below were also important; i only tried to split them
because i was not sure which version you are using

> Do you build your MySql 5.5 from source, by the way?

yes, with a lot of optimizing for Core2 and SSE 4.1, -O6 and so on

> Since their changeover to CMake, I've had problems getting
> the sources to build completely.

i take the original fedora srpms, make my changes to params and compiler
flags and build a native RPM package - raw source builds are bad because
with rpmbuild i have a fresh buildroot every time, and on the other hand
i need the packages for 23 machines on the same VMware cluster

> I suspect I need to read more documentation about how to pass configuration
> options to the CMake build tools.

i read docs too, but i prefer to optimize the builds of my distribution because
they usually know what they are doing and often include a lot of patches, many of
them for the build process, depending on the current GCC version

> Thanks for these, Reindl! I have the database files located on ext4 formatted to *NOT* use a journal,
> as I figured to give this the best chance to perform well

boah, do not do this in production
below are my mount options, and i thought they were a little on the dark side :-)

"barrier=0" is ok here because there are two UPS systems and the SAN storage
is battery backed too, which means the buffers are safe at all times

defaults,data=writeback,commit=60,barrier=0,nobh,delalloc,async,noatime,nodiratime,noacl,nouser_xattr,noexec,inode_readahead_blks=64


dbmail-list at r

Aug 23, 2011, 9:19 PM

Post #17 of 19 (756 views)
Permalink
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

> is it faster now?
> the settings below were also important; i only tried to split them
> because i was not sure which version you are using

Oh yes, big improvement!

18577 msgs (560MB) imported in:

real 107m55.162s
user 1m57.277s
sys 0m16.965s

That's about a quarter to a third of the times in all of my earlier MySQL testing.

The key was shifting the memory cache allocations to the innodb section
of things rather than the top part of [mysqld].

My my.cnf:

skip-external-locking
key_buffer_size = 96M
max_allowed_packet = 16M
table_open_cache = 256
sort_buffer_size = 32M
read_buffer_size = 32M
read_rnd_buffer_size = 4M
myisam_sort_buffer_size = 128M
thread_cache_size = 8
query_cache_size= 64M
thread_concurrency = 4

innodb_data_file_path = ibdata1:500M:autoextend
innodb_log_group_home_dir = /mail/db
innodb_buffer_pool_size = 1024M
innodb_additional_mem_pool_size = 64M
innodb_log_file_size = 512M
innodb_log_buffer_size = 256M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50


innodb_buffer_pool_size = 1024M
innodb_buffer_pool_instances = 5
innodb_purge_threads = 1
innodb_max_purge_lag = 200000
innodb_thread_concurrency = 32
innodb_thread_sleep_delay = 10
innodb_read_io_threads = 8
innodb_write_io_threads = 8
transaction-isolation = READ-COMMITTED

My server is running a "Linux Generic x86" build: mysql-5.5.15-linux2.6-i686

> yes, with a lot of optimizing for Core2 and SSE 4.1, -O6 and so on

Sounds like what I'd prefer to do when I can... :(

At times, the Gentoo portage system is very appealing, but they lack a
MySQL 5.5 ebuild that's stable.

> i read docs too, but i prefer to optimize the builds of my distribution because
> they usually know what they are doing and often include a lot of patches, many of
> them for the build process, depending on the current GCC version

Hrm, maybe I should try to refactor an SRPM to do it for my Slackware
systems.

> boah, do not do this in production
> below are my mount options, and i thought they were a little on the dark side :-)

Yes, I know it's risky, but for "flat out" performance testing just to
see where the really hard corner edge cases are, I was getting desperate.

> "barrier=0" is ok here because there are two UPS systems and the SAN storage
> is battery backed too, which means the buffers are safe at all times

Yeah, I would agree with your reasoning here. With BBU-backed RAID,
it's probably safe to disregard write barriers.

Thank you for your help, Reindl. You've been very generous with your
time and attention, and I appreciate it.

=R=



dbmail-list at r

Aug 24, 2011, 2:47 AM

Post #18 of 19 (750 views)
Permalink
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

On 8/23/2011 9:19 PM, I wrote:
>> is it faster now?
>> the settings below were also important; i only tried to split them
>> because i was not sure which version you are using
>
> Oh yes, big improvement!
>
> 18577 msgs (560MB) imported in:
>
> real 107m55.162s
> user 1m57.277s
> sys 0m16.965s

*GACK*!

I spoke too soon. I didn't notice that the import task was killed about
56% of the way through the process. As you warned me, I'd run out of
memory. (The PERL module has an odd memory leak/bug where its memory
usage only ascends as it processes emails.)

Given that the process seems to get slower as more emails are APPENDED
to a given mailbox, I'm not sure there is all that much of an
improvement in speed. When the script finishes properly, I'll provide
an update.
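
Whether the import really slows down as the mailbox grows can be checked by timing it in fixed-size batches rather than as one long run. The following is only a sketch of that idea, not the import script used here: `time_batches` and `append_message` are hypothetical names, and the demo substitutes an in-memory list for a real IMAP connection.

```python
import time


def time_batches(append_message, messages, batch_size=1000):
    """Append messages one by one; return elapsed seconds per batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    timings = []
    pending = 0
    start = time.perf_counter()
    for message in messages:
        append_message(message)
        pending += 1
        if pending == batch_size:
            now = time.perf_counter()
            timings.append(now - start)
            start = now
            pending = 0
    if pending:
        # Record the final, partial batch as well.
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    mailbox = []  # stand-in for a real IMAP APPEND target
    fake_messages = (b"Subject: test\r\n\r\nbody\r\n" for _ in range(2500))
    for index, seconds in enumerate(time_batches(mailbox.append, fake_messages)):
        print(f"batch {index}: {seconds:.6f}s")
```

With Python's standard `imaplib`, `append_message` could wrap something like `conn.append("INBOX", None, None, raw_message)`. Per-batch times that keep rising would confirm that the cost of each APPEND grows with the size of the mailbox.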

My apologies for jumping the gun and spreading bad data to the list.

=R=


h.reindl at thelounge

Aug 24, 2011, 5:05 AM

Post #19 of 19 (749 views)
Permalink
Re: Very slow imap4 performance seen when importing mail. Possible config problem? [In reply to]

On 24.08.2011 11:47, Robin Horforth wrote:
> On 8/23/2011 9:19 PM, I wrote:
>>> is it faster now?
>>> the settings below were also important; i only tried to split them
>>> because i was not sure which version you are using
>>
>> Oh yes, big improvement!
>>
>> 18577 msgs (560MB) imported in:
>>
>> real 107m55.162s
>> user 1m57.277s
>> sys 0m16.965s
>
> *GACK*!
>
> I spoke too soon. I didn't notice that the import task was killed about 56% of the way through the process. As
> you warned me, I'd run out of memory. (The PERL module has an odd memory leak/bug where its memory usage only
> ascends as it processes emails.)
>
> Given that the process seems to get slower as more emails are APPENDED to a given mailbox, I'm not sure there is
> all that much of an improvement in speed. When the script finishes properly, I'll provide an update.
>
> My apologies for jumping the gun and spreading bad data to the list

no problem

3 GB of memory on your VM is a little small

keep in mind that each connection needs memory for imapd/pop3d and per-connection
buffers, in addition to the OS itself and postfix, and the innodb buffer is really
important. i can not imagine how i would deal with 3 GB, because this hurts if there
are many open connections or, as in your case, something else allocates memory too

limiting the maximum number of imap connections is often not possible, depending on
the number of users - this will be much better in dbmail 3.x thanks to connection
pooling, but with 2.x you have an imapd process and a db connection for every imap
connection, with all their overhead (don't forget the mysql connections from postfix/sasl)

in the 3 years i have been using dbmail i learned that as much RAM as possible is the
only way to get this running really fast
