
Mailing List Archive: Qmail: users

Maildir filesystem advice

 

 



oliver at net-track

Jun 6, 2008, 2:39 AM

Post #1 of 15 (1278 views)
Maildir filesystem advice

Hi all,

We have a file server that serves maildirs for about 200k users. For
historical reasons, there's only one big 400G filesystem, formatted
using ReiserFS.

The maildirs are stored in a two-level domain/user hierarchy. We have
around 5000 subdirectories on the first level -- which is very
unfortunate, but the application that places the domain directories is
out of my scope and cannot be changed.

Recently, we started seeing ugly filesystem inconsistencies, which is
why we'd like to restructure the whole thing differently. Instead of one
big filesystem, we'd like to have multiple smaller filesystems, which would
isolate corruptions to smaller areas and also allow us to do fsck's in
acceptable time.

Since I cannot change the application, I somehow need to have 5000
subdirectories on the same level. I thought of using something like
unionfs or aufs for this. I'd create, let's say, 10 smaller filesystems
and merge them together into one big hierarchy.

Can anybody think of another approach? (Other than fixing the
application to support multiple filesystems, which would of course be
the best solution.) Is anybody here already using unionfs/aufs for
something like this?

Also, what filesystem do you guys recommend, and how do you back up huge
maildir partitions? At the moment we do our backups using rsync, which
is why we were forced to use ReiserFS as the filesystem. With ext3 it was
not possible to back up all the maildirs in acceptable time; just going
through the directories took ages (I think this was with dir_index
enabled, but I'm not sure). In future, I'd like to get rid of ReiserFS
and use something more robust, but I first need to find a suitable
backup strategy that allows us to back up the data in acceptable time.
How about dump on ext3-formatted LVM snapshots?

I'm sure some people on this list have servers that are far bigger. I'd
love to hear how you store and back up your maildirs. Maybe there's some
approach that I've missed altogether.

Thanks in advance for any hints,

Oliver


qmail08 at hofmann-wi

Jun 6, 2008, 5:50 AM

Post #2 of 15 (1253 views)
Re: Maildir filesystem advice [In reply to]

Hi Oliver,

why not just symlink from other partitions? Then you can use/move how
you'd like.

Lars



dl at blackpacket

Jun 6, 2008, 7:21 AM

Post #3 of 15 (1256 views)
Re: Maildir filesystem advice [In reply to]

Lars Hofmann wrote:
> Hi Oliver,
>
> why not just symlink from other partitions? Then you can use/move how
> you'd like.
>
> Lars
>
This would also be my suggestion.

Prepare each of the mounts elsewhere, and symlink the domain directories
in. This will require a lot of maintenance. However, if the virtual
domain software you are using supports "letter" directories, e.g.
/.../domains/e/example.com, then all you need to do is symlink the
letters/numbers.
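
A minimal sketch of the symlink approach, assuming each partition is mounted somewhere of its own and holds a subset of the domain directories. The `link_domains` helper and all paths are hypothetical and only illustrate the idea; nothing here comes from the virtual domain software itself:

```python
import os
from pathlib import Path


def link_domains(partitions, central):
    """Symlink every domain directory found on `partitions` into `central`.

    Returns the list of links that were created.
    """
    central = Path(central)
    central.mkdir(parents=True, exist_ok=True)
    created = []
    for partition in partitions:
        for domain_dir in sorted(Path(partition).iterdir()):
            # Skip plain files and the lost+found directory of the filesystem.
            if not domain_dir.is_dir() or domain_dir.name == "lost+found":
                continue
            link = central / domain_dir.name
            if link.is_symlink() or link.exists():
                # Never overwrite an existing entry; leave it for manual review.
                continue
            os.symlink(domain_dir.resolve(), link, target_is_directory=True)
            created.append(link)
    return created


# Hypothetical usage:
# link_domains(["/srv/mail1", "/srv/mail2"], "/var/vmail/domains")
```

Moving a domain to another partition then comes down to copying the directory and repointing one symlink.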

Alternatively, you can buy a NAS solution and just mount that in over
NFS or iSCSI. That will alleviate I/O load, and depending on which one
you buy, will have its own data protection schemes and fast recovery
options built in.

Also, you mention moving away from reiser. As far as I'm aware, reiser
is the best filesystem for small file support (due to tail packing). That
means you may see a performance decrease when you move away from it.
Ext3 also has a limited number of inodes set when you create the
filesystem, so you will have a static maximum number of files &
directories. Unless you are using reiser4, or earlier 3.4 versions,
then there should be very few problems with reiser, unless the system is
crashing/freezing/etc (and most of that should be solved by mounting it
& replaying the journal). On the other hand, I have seen
self-corrupting filesystems before (it was with a few RedHat ext3
boxes), so you may have a bad kernel fs driver as well.
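
The inode headroom of an existing filesystem can be checked before committing to a fixed inode count. A small sketch using Python's `os.statvfs`; the `inode_usage` helper name and the example path are made up for illustration:

```python
import os


def inode_usage(path):
    """Return (total_inodes, free_inodes, percent_used) for the filesystem holding `path`."""
    st = os.statvfs(path)
    total, free = st.f_files, st.f_ffree
    # Filesystems that allocate inodes dynamically may report 0 total inodes.
    percent_used = 100.0 * (total - free) / total if total else 0.0
    return total, free, percent_used


# Hypothetical usage:
# total, free, pct = inode_usage("/var/vmail")
# print(f"{free} of {total} inodes free ({pct:.1f}% used)")
```

Since every message in a maildir is a separate file, the inode count of the current filesystem gives a rough lower bound for what a replacement needs.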

Tyler


oliver at net-track

Jun 6, 2008, 7:47 AM

Post #4 of 15 (1256 views)
Re: Maildir filesystem advice [In reply to]

On 06 Jun 2008, Tyler wrote:
> Lars Hofmann wrote:
> >why not just symlink from other partitions? Then you can use/move how
> >you'd like.
> This would also be my suggestion.

Thanks Lars and Tyler. Sometimes I can't see the wood for the trees...
Yes, symlinks are probably easier and less error-prone than yet another
complex filesystem layer in between complex stuff.

> Alternatively, you can buy a NAS solution and just mount that in over
> NFS or iSCSI. That will alleviate I/O load, and depending on which one
> you buy, will have its own data protection schemes and fast recovery
> options built in.

I see this as a long-term solution. However, I fear that for the next
couple of months I am stuck with the equipment I have at hand.

> Also, you mention moving away from reiser.. as far as I'm aware, reiser
> is the best filesystem for small file support (due to tailing). That

Thanks for your input. I'll definitely do some tests before I decide to
move towards ext3. (One advantage of using the smaller partitions is
that I'll even be able to mix filesystems and migrate from one
filesystem to another one partition at a time -- i.e. multiple shorter
downtimes rather than a single big one.)

Regards

Oliver


search-web-for-address at pyropus

Jun 6, 2008, 8:36 AM

Post #5 of 15 (1256 views)
Re: Maildir filesystem advice [In reply to]

Oliver Hitz <oliver[at]net-track.ch> wrote:
>
> > Also, you mention moving away from reiser.. as far as I'm aware, reiser
> > is the best filesystem for small file support (due to tailing). That
>
> Thanks for your input. I'll definitely do some tests before I decide to
> move towards ext3.

Note that ext3 is not necessarily a great choice either; maildir operations
require lots of fsync() calls, and ext3 sucks at fsync() -- in its default
data=ordered mode, effectively the whole filesystem is synced each time
fsync() is called, stalling other I/O operations to that filesystem until
the fsync() completes.

I think Ted Ts'o is trying to fix this in ext4, but some of the other
filesystems (XFS? JFS?) don't have this problem -- it'll be worth trying the
alternatives.
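
To make the fsync() point concrete, here is a rough sketch of a safe maildir delivery: the message is written under tmp/, fsync()ed, and only then renamed into new/. The `deliver` helper and its simplified unique-name scheme are illustrative assumptions, not the code of any particular delivery agent:

```python
import itertools
import os
import socket
import time

_counter = itertools.count()


def deliver(maildir, message):
    """Deliver `message` (bytes) to `maildir` and return the final path in new/."""
    # Simplified unique name: timestamp, PID, per-process counter, hostname.
    unique = f"{int(time.time())}.P{os.getpid()}Q{next(_counter)}.{socket.gethostname()}"
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)

    # O_EXCL makes the create fail instead of clobbering an existing file.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(message)
        f.flush()
        os.fsync(f.fileno())  # one fsync() per delivered message

    # Readers only ever see complete messages appear in new/.
    os.rename(tmp_path, new_path)
    return new_path
```

Every delivered message costs at least one fsync(), so how a filesystem handles that call shows up directly in delivery throughput.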

Charles
--
--------------------------------------------------------------------------
Charles Cazabon
GPL'ed software available at: http://pyropus.ca/software/
Read http://pyropus.ca/personal/writings/12-steps-to-qmail-list-bliss.html
--------------------------------------------------------------------------


search-web-for-address at pyropus

Jun 6, 2008, 1:01 PM

Post #6 of 15 (1237 views)
Re: Maildir filesystem advice [In reply to]

Robin Bowes <robin-lists[at]robinbowes.com> wrote:
> > Note that ext3 is not necessarily a great choice either;
[...]
> > it'll be worth trying the alternatives.
>
> I'd be interested in reading a more scientific analysis of this sort of
> thing, i.e. best FS for maildir.
>
> Has anyone done anything like that yet?

I don't recall seeing it. Bruce Guenter did some similar-minded testing from
the perspective of the qmail queue:
http://untroubled.org/benchmarking/qmail-filesystems/

> What would be required - I could possibly add it to my to-do list if
> someone can help me in making sure I do the right sort of testing!

That's pretty easy, actually. Get a real workload (you could do this by
logging from your real server) -- i.e. a number of inbound messages with their
sizes and local/virtual recipients. Generate messages of those lengths, and
start feeding them to a safe maildir delivery program in various levels of
parallelism for the correct local recipients.

So: see how long this particular real-world set of messages would take (and
what resources are consumed) to deliver to their maildir destinations on
various filesystems with delivery parallelism of 1, 2, 4, 8, etc.
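
A minimal sketch of such a benchmark harness, under several assumptions: the workload is a list of hypothetical `(recipient, size_in_bytes)` pairs taken from real logs, deliveries are simulated with threads rather than separate delivery processes, and the `run_workload` and `deliver_one` names are invented here:

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def deliver_one(root, seq, recipient, size):
    """Write one message of `size` bytes to the recipient's maildir, maildir-style."""
    maildir = os.path.join(root, recipient)
    name = f"{seq}.bench"
    tmp_path = os.path.join(maildir, "tmp", name)
    with open(tmp_path, "wb") as f:
        f.write(b"x" * size)
        f.flush()
        os.fsync(f.fileno())
    os.rename(tmp_path, os.path.join(maildir, "new", name))


def run_workload(root, workload, parallelism):
    """Deliver all (recipient, size) pairs and return the elapsed seconds."""
    # Create the maildirs up front so that setup is not part of the timing.
    for recipient in {r for r, _ in workload}:
        for sub in ("tmp", "new", "cur"):
            os.makedirs(os.path.join(root, recipient, sub), exist_ok=True)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # list() waits for completion and re-raises any delivery error.
        list(pool.map(lambda job: deliver_one(root, job[0], *job[1]),
                      enumerate(workload)))
    return time.perf_counter() - start


# Hypothetical usage, with one fresh directory per run on the filesystem under test:
# for n in (1, 2, 4, 8):
#     print(n, run_workload(f"/mnt/testfs/run{n}", workload, n))
```

Running the same workload against directories on differently formatted test partitions gives comparable numbers per filesystem and parallelism level; CPU and I/O load would have to be recorded separately.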

Charles


robin-lists at robinbowes

Jun 6, 2008, 4:36 PM

Post #7 of 15 (1238 views)
Re: Maildir filesystem advice [In reply to]

Charles Cazabon wrote:
> Oliver Hitz <oliver[at]net-track.ch> wrote:
>>> Also, you mention moving away from reiser.. as far as I'm aware, reiser
>>> is the best filesystem for small file support (due to tailing). That
>> Thanks for your input. I'll definitely do some tests before I decide to
>> move towards ext3.
>
> Note that ext3 is not necessarily a great choice either; maildir operations
> require lots of fsync() calls, and ext3 sucks at fsync() -- the whole
> filesystem is synced each time fsync() is called, stalling all I/O operations
> to that filesystem until the fsync() completes.
>
> I think Ted T'so is trying to fix this in ext4, but some of the other
> filesystems (XFS? JFFS?) don't have this problem -- it'll be worth trying the
> alternatives.

I'd be interested in reading a more scientific analysis of this sort of
thing, i.e. best FS for maildir.

Has anyone done anything like that yet?

What would be required? I could possibly add it to my to-do list if
someone can help me in making sure I do the right sort of testing!

R.


oliver at net-track

Jun 8, 2008, 11:17 PM

Post #8 of 15 (1207 views)
Re: Maildir filesystem advice [In reply to]

On 06 Jun 2008, Charles Cazabon wrote:
> Note that ext3 is not necessarily a great choice either; maildir operations
> require lots of fsync() calls, and ext3 sucks at fsync() -- the whole
> filesystem is synced each time fsync() is called, stalling all I/O operations
> to that filesystem until the fsync() completes.

Thanks Charles, I'll keep this in mind.

Up to now, however, it was just the backup performance that caused us
headaches (in addition to the filesystem inconsistencies, of course).

Regards

Oliver


lampacz at gmail

Jun 8, 2008, 11:42 PM

Post #9 of 15 (1211 views)
Re: Maildir filesystem advice [In reply to]

Hello,

I used ReiserFS for 3 years, and now I must migrate to XFS (my choice)
or JFS. Why? With a lot of files (500k+), ReiserFS crashed and took 3
hours to repair the drive. After the repair, some data was missing and
a directory with 500k+ files was unreadable. Another server was built
on XFS. IMHO XFS is slower at small-file operations (delete, move) and
takes more CPU than ReiserFS. I tried different mount options for each
partition, but ReiserFS was always faster. In my case, though, I must
migrate to something other than ReiserFS.


2008/6/9, Oliver Hitz <oliver[at]net-track.ch>:
> On 06 Jun 2008, Charles Cazabon wrote:
> > Note that ext3 is not necessarily a great choice either; maildir operations
> > require lots of fsync() calls, and ext3 sucks at fsync() -- the whole
> > filesystem is synced each time fsync() is called, stalling all I/O operations
> > to that filesystem until the fsync() completes.
>
>
> Thanks Charles, I'll keep this in mind.
>
> Up to now, however, it was just the backup performance that caused us
> headaches (in addition to the filesystem inconsistencies, of course).
>
> Regards
>
>
> Oliver


--
Lampa


jeff at doeshosting

Jun 9, 2008, 4:10 AM

Post #10 of 15 (1177 views)
Re: Maildir filesystem advice [In reply to]

On Jun 9, 2008, at 6:35 AM, Robin Bowes wrote:
>
> If I were commissioning a new mail box right now I'd investigate
> using ZFS on either *BSD or OpenSolaris.
>
> R.
>
>


ZFS in FreeBSD is not ready for prime time yet, at least not on a
32-bit system. If anyone out there is getting a different impression
on 64-bit hardware after heavily testing I/O, I would like to hear about
it, and I would second the suggestion.
For now, know that when you load the kernel module, you get this
warning:
WARNING: ZFS is considered to be an experimental feature in FreeBSD.

-krzee


search-web-for-address at pyropus

Jun 9, 2008, 6:19 AM

Post #11 of 15 (1197 views)
Re: Maildir filesystem advice [In reply to]

Robin Bowes <robin-lists[at]robinbowes.com> wrote:
> >
> > i was using reiserfs for 3 years, now i must migrate to xfs(my choice) or
> > jfs.
>
> If I were commissioning a new mail box right now I'd investigate using
> ZFS on either *BSD or OpenSolaris.

I have heard many people say good things about ZFS, but my one experience with
it was with a file server that was converted from Linux+ext3 to
OpenSolaris+ZFS -- and the Solaris file server was molasses-slow. After the
sysadmin spent a couple of weeks trying to tune it up to the speed of the
previous (unoptimized) Linux configuration, I converted it back.

Charles


robin-lists at robinbowes

Jun 9, 2008, 6:35 AM

Post #12 of 15 (1196 views)
Re: Maildir filesystem advice [In reply to]

Lampa wrote:
> Hello,
>
> I used ReiserFS for 3 years, and now I must migrate to XFS (my choice)
> or JFS. Why? With a lot of files (500k+), ReiserFS crashed and took 3
> hours to repair the drive. After the repair, some data was missing and
> a directory with 500k+ files was unreadable. Another server was built
> on XFS. IMHO XFS is slower at small-file operations (delete, move) and
> takes more CPU than ReiserFS. I tried different mount options for each
> partition, but ReiserFS was always faster. In my case, though, I must
> migrate to something other than ReiserFS.

If I were commissioning a new mail box right now I'd investigate using
ZFS on either *BSD or OpenSolaris.

R.


vadud3 at gmail

Jun 9, 2008, 6:56 AM

Post #13 of 15 (1189 views)
Re: Maildir filesystem advice [In reply to]

On Mon, Jun 9, 2008 at 9:19 AM, Charles Cazabon
<search-web-for-address[at]pyropus.ca> wrote:

> Robin Bowes <robin-lists[at]robinbowes.com> wrote:
> > > i was using reiserfs for 3 years, now i must migrate to xfs(my choice)
> > > or jfs.
> >
> > If I were commissioning a new mail box right now I'd investigate using
> > ZFS on either *BSD or OpenSolaris.
>
> I have heard many people say good things about ZFS, but my one experience
> with it was with a file server that was converted from Linux+ext3 to
> OpenSolaris+ZFS -- and the Solaris file server was molasses-slow. After
> the sysadmin spent a couple of weeks trying to tune it up to the speed of
> the previous (unoptimized) Linux configuration, I converted it back.

Are the disks all internal? How many disks? Was it RAID 0 or RAID-Z? Was
the ZFS intent log (ZIL) on the same disk or a different disk?

Those are some of the key factors in ZFS performance. Also, ZFS is a
moving target and more improvements are on the way. Maybe your sysadmin
could bring the issue to the OpenSolaris discussion channel and mailing
list, if he has not done so already.






--
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu


robin-lists at robinbowes

Jun 9, 2008, 8:33 AM

Post #14 of 15 (1196 views)
Re: Maildir filesystem advice [In reply to]

Charles Cazabon wrote:
> Robin Bowes <robin-lists[at]robinbowes.com> wrote:
>>> i was using reiserfs for 3 years, now i must migrate to xfs(my choice) or
>>> jfs.
>> If I were commissioning a new mail box right now I'd investigate using
>> ZFS on either *BSD or OpenSolaris.
>
> I have heard many people say good things about ZFS, but my one experience with
> it was with a file server that was converted from Linux+ext3 to
> OpenSolaris+ZFS -- and the Solaris file server was molasses-slow. After the
> sysadmin spent a couple of weeks trying to tune it up to the speed of the
> previous (unoptimized) Linux configuration, I converted it back.

Hmm, interesting. I've had nothing but good experiences with ZFS. What
sort of box was it running in? It really needs 64-bit, and *lots* of
RAM, i.e. minimum of 2GB, preferably lots more.

R.


oliver at net-track

Jul 14, 2008, 12:52 PM

Post #15 of 15 (566 views)
Re: Maildir filesystem advice [SOLVED] [In reply to]

Hi all,

Just in case anybody is interested, a quick follow-up on how we
eventually solved our problems:

- Partitions: On the new server, we have multiple 100 GB partitions
instead of the old 400 GB one. We use the symlinks approach that some
of you have suggested to merge all the directories into a central
hierarchy. Works great!

- Filesystem: Unfortunately, we didn't have the time to do detailed
benchmarks, because the old server broke down completely while we were
preparing the new one and so we had to migrate faster than we had
planned. After some quick tests with XFS and ext3, we decided on XFS.
The performance is fine. The new server does its task just as well as
the old one (apart from the backup, there was no performance
bottleneck before, so we have nothing to compare here).

- Backup: The main thing: forget rsync! rsync may be a nice solution for
many backup problems, but as soon as millions of small files and lots
of directories have to be backed up, the rsync approach is too slow.

On the old server, one daily backup run with rsync took around 20
hours if done sequentially. We were able to speed things up a bit by
doing several areas in parallel, but that was still too slow and put
heavy load on the server.

On the new server, we use xfsdump. A full backup takes some 10 hours,
and a daily incremental backup takes a little more than one hour (and
we haven't tried to speed this up yet!).
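
The thread does not give the exact xfsdump invocation, so the following is only a sketch of how full and incremental dumps could be scripted. It assumes the standard xfsdump options `-l` (dump level, 0 for a full dump), `-L` (session label), `-M` (media label) and `-f` (destination file); the `xfsdump_command` helper, the paths and the labels are hypothetical:

```python
import subprocess


def xfsdump_command(mountpoint, dest_file, level, label):
    """Build the argument list for one xfsdump run of an XFS filesystem.

    Level 0 is a full dump; a higher level dumps only what changed since
    the most recent dump taken at a lower level.
    """
    if not 0 <= level <= 9:
        raise ValueError("xfsdump levels range from 0 to 9")
    return [
        "xfsdump",
        "-l", str(level),  # dump level
        "-L", label,       # session label, given here to avoid a prompt
        "-M", label,       # media label, given here to avoid a prompt
        "-f", dest_file,   # write the dump to this file
        mountpoint,
    ]


# Hypothetical usage: a full dump, then a level-1 incremental on later days.
# subprocess.run(xfsdump_command("/srv/mail1", "/backup/mail1.l0", 0, "mail1-full"), check=True)
# subprocess.run(xfsdump_command("/srv/mail1", "/backup/mail1.l1", 1, "mail1-incr"), check=True)
```

With one small filesystem per partition, such dumps can also be run per partition, in sequence or in parallel.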

Thanks to all of you who shared their thoughts!

Regards

Oliver
