
sct at redhat
Oct 21, 1999, 4:08 AM
Post #5 of 7
Hi,

On Wed, 20 Oct 1999 11:00:24 -0600, Sean Reifschneider
<jafo [at] tummy> said:

> On Wed, Oct 20, 1999 at 05:46:10PM +0100, Stephen C. Tweedie wrote:
>> Yes --- or rather, journaling is one method by which we can make a
>> filesystem transactional.

> As I understand it, the journaling in ext3 (which is a fantastic
> design, BTW), only handles journaling of file-system meta-data,
> not of actual file data.

Actually, it journals both for now. That simplifies the journaling
core quite a bit, and I'm keeping the metadata-only journaling disabled
until I'm sure that the core journal code is rock-solid.

Full journaling will still be an option in the long term, though. In
particular, for systems such as NFS servers, fast commit of written
data to disk is necessary for low latency, and by allowing data to be
journaled to a separate journal device, we can improve NFS write
latencies enormously while retaining synchronous writes.

> This is a common misconception I've run into is that it handles the
> file data as well.

It does.

> The problem is that in the event of a crash, there's little a file-system
> can do to prevent the crashed applications data from being left in an
> unknown state.

Absolutely. Data journaling is not application journaling. You can
ensure that the data is intact on disk, but that doesn't mean that it
represents a consistent state for the application. Only the
application can ever have a hope of understanding what its own
transaction semantics are.

What filesystem data journaling _can_ offer in this situation is fast
commit of filesystem transactions to disk. This is especially true if
you have a separate journal disk, but even without that, committing
data to disk (ie. O_SYNC writes or fsync()) becomes a simple matter of
writing a single sequential record to the journal. The filesystem can
propagate the in-place updates to the main on-disk structures later on
at leisure, but the initial synchronous data write can be done much
more quickly with filesystem data journaling.

--Stephen
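
For readers unfamiliar with the two synchronous-commit paths named above (O_SYNC writes and fsync()), here is a minimal sketch of what they look like from the application side. It is an illustration only and not code from ext3 or from this thread: the helper name durable_write and its use_o_sync parameter are hypothetical, and the sketch assumes a POSIX-like system where Python's os module exposes O_SYNC (it falls back to fsync() where it does not).

```python
import os
import tempfile


def durable_write(path, data, use_o_sync=False):
    """Write `data` to `path` and return only after a synchronous commit.

    Hypothetical helper for illustration. Two ways to request the commit:
      * use_o_sync=True  -> open with O_SYNC, so each write() is synchronous
      * use_o_sync=False -> ordinary write(), followed by one fsync()
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # O_SYNC is not defined on every platform; 0 means "not available".
    sync_flag = getattr(os, "O_SYNC", 0) if use_o_sync else 0
    fd = os.open(path, flags | sync_flag, 0o644)
    try:
        offset = 0
        # os.write may write fewer bytes than requested, so loop.
        while offset < len(data):
            offset += os.write(fd, data[offset:])
        if not sync_flag:
            # Ask the kernel to flush this file's data before returning.
            os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "record.dat")
        durable_write(target, b"committed record\n")
        durable_write(target, b"committed again\n", use_o_sync=True)
```

Either way, the call returns only once the filesystem reports the write as committed. How quickly that happens depends on the filesystem: per the description above, with data journaling the commit can be satisfied by one sequential journal write. As the post stresses, this says nothing about whether the file contents are consistent from the application's point of view.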