Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: DBMail: users

OT: DBMail Administrator (DBMA) Performance Fix

 

 

DBMail users RSS feed   Index | Next | Previous | View Threaded


wally at pacific

Apr 3, 2009, 12:12 AM

Post #1 of 11 (923 views)
Permalink
OT: DBMail Administrator (DBMA) Performance Fix

I have been using DBMA to administer DBMail v2.2.10
On the home page of DBMA web interface, it displays only the top part of the page.
I discovered the issue was due to a slow query in MySQL, after checking the slow-query-log in MySQL.

The slow query is:
SELECT COUNT(*) FROM dbmail_messageblks;

Running this slow query took 138 seconds (2 min 18.09 sec)
SELECT COUNT(*) FROM dbmail_messageblks;
+----------+
| COUNT(*) |
+----------+
| 262788 |
+----------+
1 row in set (2 min 18.09 sec)

After optimizing the SQL, it took 0.27 seconds.
SELECT COUNT(*) FROM dbmail_messageblks use index(physmessage_id_index);
+----------+
| COUNT(*) |
+----------+
| 262796 |
+----------+
1 row in set (0.27 sec)

The diff below fix this performance issue for DBMA.cgi
6109c6109
< $dbh->prepare("SELECT COUNT(*) FROM $dbmail_messageblks_table use index(physmessage_id_index)");
---
> $dbh->prepare("SELECT COUNT(*) FROM $dbmail_messageblks_table");


HTH other users of DBMA.

BTW, I posted this here because I can't find the forum/mailing list for DBMA.
And what's the best admin interface for DBMail?

--
Regards,
Wallace
_______________________________________________
DBmail mailing list
DBmail[at]dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail


michael.monnerie at is

Apr 3, 2009, 3:30 AM

Post #2 of 11 (879 views)
Permalink
Re: OT: DBMail Administrator (DBMA) Performance Fix [In reply to]

On Freitag 03 April 2009 Wallace Tan wrote:
> SELECT COUNT(*) FROM dbmail_messageblks;
> 1 row in set (2 min 18.09 sec)
>
> After optimizing the SQL, it took 0.27 seconds.
> SELECT COUNT(*) FROM dbmail_messageblks use
> index(physmessage_id_index);

The question is: Why is MySQL so stupid not to use the index? That
should be done automatically by the DBMS, that's its job. I would oppose
against changing the query just because MySQL has a bug. Maybe you use a
version that's known to be instable?

mfg zmi
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4

_______________________________________________
DBmail mailing list
DBmail[at]dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail


wally at pacific

Apr 3, 2009, 4:29 AM

Post #3 of 11 (878 views)
Permalink
Re: OT: DBMail Administrator (DBMA) Performance Fix [In reply to]

Michael Monnerie wrote:
> The question is: Why is MySQL so stupid not to use the index? That
> should be done automatically by the DBMS, that's its job. I would oppose
> against changing the query just because MySQL has a bug. Maybe you use a
> version that's known to be instable?

I am using MySQL v5.0.77

MySQL IS using PRIMARY index for the slow query.

However, after reading comment 19 at:
http://www.mysqlperformanceblog.com/2006/12/01/count-for-innodb-tables/
I got the idea to use another index for the COUNT.

Any MySQL experts can explain this performance gap?


> SHOW CREATE TABLE dbmail_messageblks\G
*************************** 1. row ***************************
Table: dbmail_messageblks
Create Table: CREATE TABLE `dbmail_messageblks` (
`messageblk_idnr` bigint(21) NOT NULL auto_increment,
`physmessage_id` bigint(21) NOT NULL default '0',
`messageblk` longblob NOT NULL,
`blocksize` bigint(21) NOT NULL default '0',
`is_header` tinyint(1) NOT NULL default '0',
PRIMARY KEY (`messageblk_idnr`),
KEY `physmessage_id_index` (`physmessage_id`),
KEY `physmessage_id_is_header_index` (`physmessage_id`,`is_header`),
CONSTRAINT `dbmail_messageblks_ibfk_1` FOREIGN KEY (`physmessage_id`) REFERENCES `dbmail_physmessage` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=602519 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

> EXPLAIN SELECT COUNT(*) FROM dbmail_messageblks\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: dbmail_messageblks
type: index
possible_keys: NULL
key: PRIMARY
key_len: 8
ref: NULL
rows: 1930308
Extra: Using index
1 row in set (0.00 sec)

> EXPLAIN SELECT COUNT(*) FROM dbmail_messageblks use index(physmessage_id_index)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: dbmail_messageblks
type: index
possible_keys: NULL
key: physmessage_id_index
key_len: 8
ref: NULL
rows: 1930310
Extra: Using index
1 row in set (0.00 sec)

--
Regards,
Wallace

_______________________________________________
DBmail mailing list
DBmail[at]dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail


wally at pacific

Apr 3, 2009, 5:06 AM

Post #4 of 11 (876 views)
Permalink
Re: OT: DBMail Administrator (DBMA) Performance Fix [In reply to]

Wallace Tan wrote:
> I am using MySQL v5.0.77
>
> MySQL IS using PRIMARY index for the slow query.
>
> However, after reading comment 19 at:
> http://www.mysqlperformanceblog.com/2006/12/01/count-for-innodb-tables/
> I got the idea to use another index for the COUNT.
>
> Any MySQL experts can explain this performance gap?

From http://capttofu.livejournal.com/12570.html

"InnoDB stores data in primary key order.
If you don't specify a primary key, innodb creates one internally.
InnoDB uses a clustered index, which means every index is stored with the primary key -- so be careful when making primary keys on InnoDB tables that are long.
Clustered indexes give good performance for writes as well as selecting data by index.
They are _slow_ with count(*) because:

* InnoDB doesn't maintain # rows in the storage engine
* Clustered indexes are slow when you perform count(*) because it is a count across the primary key, that operation has to traverse each index and data node. The way to get around this is to use

select count(1) from t1;

Or

select count(<some other indexed column>) from t1;"

--
Regards,
Wallace
M:94500905
_______________________________________________
DBmail mailing list
DBmail[at]dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail


michael.monnerie at is

Apr 3, 2009, 7:25 AM

Post #5 of 11 (876 views)
Permalink
Re: OT: DBMail Administrator (DBMA) Performance Fix [In reply to]

On Freitag 03 April 2009 Wallace Tan wrote:
> select count(1) from t1;

That would have been my next question. I've spoken once to Paul, because
dbmail uses lots of count(*), but PostgreSQL optimizes this out. Now it
seems MySQL would have a performance boost using count(1).

Could you please try:
1) first, SELECT COUNT(1) FROM dbmail_messageblks;
and afterwards
2) SELECT COUNT(*) FROM dbmail_messageblks;

The order is important: After the first select(), the table will be
cached, so the 2nd query will be faster. That, BTW, is part of the
explanation why your 2nd query was much faster than the 1st.
Still, count(1) should be faster than count(*) I would expect from the
thread you posted. I do not have a MySQL db with enough data to test
around. We're using PostgreSQL because things like that happen to exist
in MySQL since years, and I don't need a DBMS where I have to think for
it. I wonder why the devs don't manage to work around those problems.
But no flames please, everybody should use what they prefer.

mfg zmi
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4

_______________________________________________
DBmail mailing list
DBmail[at]dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail


wally at pacific

Apr 3, 2009, 11:04 AM

Post #6 of 11 (877 views)
Permalink
Re: OT: DBMail Administrator (DBMA) Performance Fix [In reply to]

Michael Monnerie wrote:
> On Freitag 03 April 2009 Wallace Tan wrote:
>> select count(1) from t1;
>
> That would have been my next question. I've spoken once to Paul, because
> dbmail uses lots of count(*), but PostgreSQL optimizes this out. Now it
> seems MySQL would have a performance boost using count(1).
>
> Could you please try:
> 1) first, SELECT COUNT(1) FROM dbmail_messageblks;
> and afterwards
> 2) SELECT COUNT(*) FROM dbmail_messageblks;
>
> The order is important: After the first select(), the table will be
> cached, so the 2nd query will be faster. That, BTW, is part of the
> explanation why your 2nd query was much faster than the 1st.
> Still, count(1) should be faster than count(*) I would expect from the
> thread you posted. I do not have a MySQL db with enough data to test
> around. We're using PostgreSQL because things like that happen to exist
> in MySQL since years, and I don't need a DBMS where I have to think for
> it. I wonder why the devs don't manage to work around those problems.
> But no flames please, everybody should use what they prefer.
>
> mfg zmi

In MySQL (using InnoDB engine) there is no difference for between COUNT(*) or COUNT(1)
because it is 'optimized' to use the PRIMARY index.

The InnoDB PRIMARY key is a clustered index. See previous post.

If I understand this correctly, the PRIMARY key (clustered index) is THE problem.
So the only viable solution is to force the query to use a non-clustered index.


> SELECT COUNT(1) FROM dbmail_messageblks;
+----------+
| COUNT(1) |
+----------+
| 263339 |
+----------+
1 row in set (2 min 30.44 sec)

> SELECT COUNT(*) FROM dbmail_messageblks;
+----------+
| COUNT(*) |
+----------+
| 263357 |
+----------+
1 row in set (2 min 25.91 sec)

> EXPLAIN EXTENDED SELECT COUNT(1) FROM dbmail_messageblks\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: dbmail_messageblks
type: index
possible_keys: NULL
key: PRIMARY
key_len: 8
ref: NULL
rows: 6574840
Extra: Using index

> EXPLAIN EXTENDED SELECT COUNT(*) FROM dbmail_messageblks\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: dbmail_messageblks
type: index
possible_keys: NULL
key: PRIMARY
key_len: 8
ref: NULL
rows: 6574840
Extra: Using index




--
Regards,
Wallace
M:94500905
_______________________________________________
DBmail mailing list
DBmail[at]dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail


wally at pacific

Apr 3, 2009, 11:13 AM

Post #7 of 11 (878 views)
Permalink
Re: OT: DBMail Administrator (DBMA) Performance Fix [In reply to]

Michael Monnerie wrote:
> On Freitag 03 April 2009 Wallace Tan wrote:
>> select count(1) from t1;
>
> That would have been my next question. I've spoken once to Paul, because
> dbmail uses lots of count(*), but PostgreSQL optimizes this out. Now it
> seems MySQL would have a performance boost using count(1).
>
> Could you please try:
> 1) first, SELECT COUNT(1) FROM dbmail_messageblks;
> and afterwards
> 2) SELECT COUNT(*) FROM dbmail_messageblks;
>
> The order is important: After the first select(), the table will be
> cached, so the 2nd query will be faster. That, BTW, is part of the
> explanation why your 2nd query was much faster than the 1st.
> Still, count(1) should be faster than count(*) I would expect from the
> thread you posted. I do not have a MySQL db with enough data to test
> around. We're using PostgreSQL because things like that happen to exist
> in MySQL since years, and I don't need a DBMS where I have to think for
> it. I wonder why the devs don't manage to work around those problems.
> But no flames please, everybody should use what they prefer.
>
> mfg zmi

What the performance like for the same query using PostgreSQL?
I would consider PostgreSQL for my DBMail store.
Thanks!

--
Regards,
Wallace

_______________________________________________
DBmail mailing list
DBmail[at]dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail


josh at agliodbs

Apr 3, 2009, 11:14 AM

Post #8 of 11 (883 views)
Permalink
Re: OT: DBMail Administrator (DBMA) Performance Fix [In reply to]

> In MySQL (using InnoDB engine) there is no difference for between COUNT(*) or COUNT(1)
> because it is 'optimized' to use the PRIMARY index.

FWIW, there's no difference in PostgreSQL either.

--Josh

_______________________________________________
DBmail mailing list
DBmail[at]dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail


michael.monnerie at is

Apr 3, 2009, 1:12 PM

Post #9 of 11 (878 views)
Permalink
Re: OT: DBMail Administrator (DBMA) Performance Fix [In reply to]

On Freitag 03 April 2009 Josh Berkus wrote:
> FWIW, there's no difference in PostgreSQL either.

Yes, but the other way round: In PostgreSQL it's always *FAST*.
So, a little bit of difference ;-)

EXPLAIN ANALYZE SELECT COUNT(*) FROM dbmail_messageblks;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=58336.22..58336.23 rows=1 width=0) (actual
time=1574.620..1574.621 rows=1 loops=1)
-> Seq Scan on dbmail_messageblks (cost=0.00..57162.37 rows=469537
width=0) (actual time=0.056..1019.980 rows=469683 loops=1)
Total runtime: 1574.676 ms

And that's just my home server, a XEN vm within a normal PC.

mfg zmi
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4

_______________________________________________
DBmail mailing list
DBmail[at]dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail


ljackson at jjcons

Apr 3, 2009, 6:39 PM

Post #10 of 11 (877 views)
Permalink
Re: OT: DBMail Administrator (DBMA) Performance Fix [In reply to]

All,
Been a long time since I posted anything but I wanted to give some number
for the group as well.

Just an FYI it also maybe hardware and version dependent, that is to say
that the performance measurements are not cpu bound therefor it may be IO
that is saturated which will happen on either mysql or postgresql but as
show below on MySQL 5.0.67-community on my customer's production dbmail
cluster sql server

Dell 6850 quad 3.2Ghz cpus
2.6.18-92.1.18.el5.centos.plus #1 SMP Wed Nov 26 07:28:20 EST 2008 x86_64
x86_64 x86_64 GNU/Linux
CentOS 5.2
16G ram
Dell MD3000 sas raid5 array 200G storage over 13 drives at 15k rpm using
LVM2 and ext3
InnoDB, binary logging turned on

+----------+
| COUNT(1) |
+----------+
| 997728 |
+----------+
1 row in set (0.58 sec)

+----------+
| COUNT(1) |
+----------+
| 997728 |
+----------+
1 row in set (0.00 sec)

+----------+
| COUNT(*) |
+----------+
| 997728 |
+----------+
1 row in set (0.57 sec)

+----------+
| COUNT(*) |
+----------+
| 997728 |
+----------+
1 row in set (0.00 sec)

However it should be said that this box is very large and the customer never
fully utilized it so it is unloaded being Friday night at 9pm and he has
business customers but still it shows the time to load a similar dataset's
indexs into memory where mysql innodb has 8G of ram dedicated to it.

I would say that untill dbmail sql processes are CPU bound these kinds of
discussions on performance are not the most critical.

The slow query log on this box shows stuff like (taken with the fact that
this schema has been thru upgrades and maybe missing something, but I am
fairly certian it is up to date with indexs):

# Time: 090402 4:02:33
# User[at]Host: mail[mail] @ dbmail00 [10.1.1.204]
# Query_time: 3 Lock_time: 0 Rows_sent: 0 Rows_examined: 493594
SELECT p.id FROM dbmail_physmessage p LEFT JOIN dbmail_headervalue h
ON p.id= h.physmessage_id WHERE h.physmessage_id IS NULL;

# Time: 090403 4:02:24
# User[at]Host: mail[mail] @ dbmail00 [10.1.1.204]
# Query_time: 4 Lock_time: 0 Rows_sent: 0 Rows_examined: 994156
SELECT MIN(messageblk_idnr),MAX(is_header) FROM dbmail_messageblks GROUP BY
physmessage_id HAVING MAX(is_header)=0;

# Time: 090403 4:02:33
# User[at]Host: mail[mail] @ dbmail00 [10.1.1.204]
# Query_time: 5 Lock_time: 0 Rows_sent: 1 Rows_examined: 1022067
SELECT usr.user_idnr, sum(pm.messagesize), usr.curmail_size FROM
dbmail_users usr LEFT JOIN dbmail_mailboxes mbx ON mbx.owner_idnr =
usr.user_idnr LEFT JOIN dbmail_messages msg ON msg.mailbox_idnr =
mbx.mailbox_idnr LEFT JOIN dbmail_physmessage pm ON pm.id =
msg.physmessage_id AND msg.status < 2 GROUP BY usr.user_idnr,
usr.curmail_size HAVING ((SUM(pm.messagesize) <> usr.curmail_size) OR (NOT
(SUM(pm.messagesize) IS NOT NULL) AND usr.curmail_size <> 0));

# Time: 090403 4:02:37
# User[at]Host: mail[mail] @ dbmail00 [10.1.1.204]
# Query_time: 4 Lock_time: 0 Rows_sent: 0 Rows_examined: 1679722
SET timestamp=1238745757;
SELECT message_idnr FROM dbmail_messages m JOIN dbmail_physmessage p ON
m.physmessage_id=p.id JOIN dbmail_headervalue v ON
v.physmessage_id=p.idJOIN dbmail_headername n ON v.headername_id=
n.id WHERE m.mailbox_idnr=1816 AND n.headername IN
('resent-message-id','message-id') AND
v.headervalue='<20090403080233.C12443BF50[at]dbmail00.XXX>' AND p.internal_date
> NOW() - INTERVAL 3 DAY;

the last one seems to be the common but these three are consistant this may
be a better place to look in opimizations, I was under the impression that
any time you use a HAVING clause it will cause a table scan?

Also for interest of comparison to the explain details below:

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: dbmail_messageblks
type: index
possible_keys: NULL
key: PRIMARY
key_len: 8
ref: NULL
rows: 5428069
Extra: Using index
1 row in set, 1 warning (0.00 sec)

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: dbmail_messageblks
type: index
possible_keys: NULL
key: PRIMARY
key_len: 8
ref: NULL
rows: 5428069
Extra: Using index
1 row in set, 1 warning (0.00 sec)


On Fri, Apr 3, 2009 at 2:04 PM, Wallace Tan <wally[at]pacific.net.sg> wrote:

> Michael Monnerie wrote:
> > On Freitag 03 April 2009 Wallace Tan wrote:
> >> select count(1) from t1;
> >
> > That would have been my next question. I've spoken once to Paul, because
> > dbmail uses lots of count(*), but PostgreSQL optimizes this out. Now it
> > seems MySQL would have a performance boost using count(1).
> >
> > Could you please try:
> > 1) first, SELECT COUNT(1) FROM dbmail_messageblks;
> > and afterwards
> > 2) SELECT COUNT(*) FROM dbmail_messageblks;
> >
> > The order is important: After the first select(), the table will be
> > cached, so the 2nd query will be faster. That, BTW, is part of the
> > explanation why your 2nd query was much faster than the 1st.
> > Still, count(1) should be faster than count(*) I would expect from the
> > thread you posted. I do not have a MySQL db with enough data to test
> > around. We're using PostgreSQL because things like that happen to exist
> > in MySQL since years, and I don't need a DBMS where I have to think for
> > it. I wonder why the devs don't manage to work around those problems.
> > But no flames please, everybody should use what they prefer.
> >
> > mfg zmi
>
> In MySQL (using InnoDB engine) there is no difference for between COUNT(*)
> or COUNT(1)
> because it is 'optimized' to use the PRIMARY index.
>
> The InnoDB PRIMARY key is a clustered index. See previous post.
>
> If I understand this correctly, the PRIMARY key (clustered index) is THE
> problem.
> So the only viable solution is to force the query to use a non-clustered
> index.
>
>
> > SELECT COUNT(1) FROM dbmail_messageblks;
> +----------+
> | COUNT(1) |
> +----------+
> | 263339 |
> +----------+
> 1 row in set (2 min 30.44 sec)
>
> > SELECT COUNT(*) FROM dbmail_messageblks;
> +----------+
> | COUNT(*) |
> +----------+
> | 263357 |
> +----------+
> 1 row in set (2 min 25.91 sec)
>
> > EXPLAIN EXTENDED SELECT COUNT(1) FROM dbmail_messageblks\G
> *************************** 1. row ***************************
> id: 1
> select_type: SIMPLE
> table: dbmail_messageblks
> type: index
> possible_keys: NULL
> key: PRIMARY
> key_len: 8
> ref: NULL
> rows: 6574840
> Extra: Using index
>
> > EXPLAIN EXTENDED SELECT COUNT(*) FROM dbmail_messageblks\G
> *************************** 1. row ***************************
> id: 1
> select_type: SIMPLE
> table: dbmail_messageblks
> type: index
> possible_keys: NULL
> key: PRIMARY
> key_len: 8
> ref: NULL
> rows: 6574840
> Extra: Using index
>
>
>
>
> --
> Regards,
> Wallace
> M:94500905
> _______________________________________________
> DBmail mailing list
> DBmail[at]dbmail.org
> http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail
>


vulture at netvulture

Apr 3, 2009, 8:59 PM

Post #11 of 11 (877 views)
Permalink
Re: OT: DBMail Administrator (DBMA) Performance Fix [In reply to]

Wallace Tan wrote:
> And what's the best admin interface for DBMail?
>
>

Not to toot my own horn, but I did up a php based administration tool
for managing your mailboxes and aliases along with added functionality
to partition your users into manageable groups. See
http://www.netvulture.com/nvcapps.php to download. Extract to your
htdocs and follow INSTALL to get it up and running.

-Jon

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

_______________________________________________
DBmail mailing list
DBmail[at]dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail

DBMail users RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact lists@gossamer-threads.com
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.