Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Linux: Kernel

[patch 6/10] mm: be sure to trim blocks

 

 

Linux kernel RSS feed   Index | Next | Previous | View Threaded


npiggin at suse

Jan 12, 2007, 7:25 PM

Post #1 of 6 (477 views)
Permalink
[patch 6/10] mm: be sure to trim blocks

If prepare_write fails with AOP_TRUNCATED_PAGE, or if commit_write fails, then
we may have failed the write operation despite prepare_write having
instantiated blocks past i_size. Fix this, and consolidate the trimming into
one place.

Signed-off-by: Nick Piggin <npiggin [at] suse>

Index: linux-2.6/mm/filemap.c
===================================================================
--- linux-2.6.orig/mm/filemap.c
+++ linux-2.6/mm/filemap.c
@@ -1911,22 +1911,9 @@ generic_file_buffered_write(struct kiocb
}

status = a_ops->prepare_write(file, page, offset, offset+bytes);
- if (unlikely(status)) {
- loff_t isize = i_size_read(inode);
+ if (unlikely(status))
+ goto fs_write_aop_error;

- if (status != AOP_TRUNCATED_PAGE)
- unlock_page(page);
- page_cache_release(page);
- if (status == AOP_TRUNCATED_PAGE)
- continue;
- /*
- * prepare_write() may have instantiated a few blocks
- * outside i_size. Trim these off again.
- */
- if (pos + bytes > isize)
- vmtruncate(inode, isize);
- break;
- }
if (likely(nr_segs == 1))
copied = filemap_copy_from_user(page, offset,
buf, bytes);
@@ -1935,10 +1922,9 @@ generic_file_buffered_write(struct kiocb
cur_iov, iov_offset, bytes);
flush_dcache_page(page);
status = a_ops->commit_write(file, page, offset, offset+bytes);
- if (status == AOP_TRUNCATED_PAGE) {
- page_cache_release(page);
- continue;
- }
+ if (unlikely(status))
+ goto fs_write_aop_error;
+
if (likely(copied > 0)) {
if (!status)
status = copied;
@@ -1969,6 +1955,25 @@ generic_file_buffered_write(struct kiocb
break;
balance_dirty_pages_ratelimited(mapping);
cond_resched();
+ continue;
+
+fs_write_aop_error:
+ if (status != AOP_TRUNCATED_PAGE)
+ unlock_page(page);
+ page_cache_release(page);
+
+ /*
+ * prepare_write() may have instantiated a few blocks
+ * outside i_size. Trim these off again. Don't need
+ * i_size_read because we hold i_mutex.
+ */
+ if (pos + bytes > inode->i_size)
+ vmtruncate(inode, inode->i_size);
+ if (status == AOP_TRUNCATED_PAGE)
+ continue;
+ else
+ break;
+
} while (count);
*ppos = pos;

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


dmonakhov at sw

Jan 14, 2007, 6:25 AM

Post #2 of 6 (461 views)
Permalink
Re: [patch 6/10] mm: be sure to trim blocks [In reply to]

Nick Piggin <npiggin [at] suse> writes:

> If prepare_write fails with AOP_TRUNCATED_PAGE, or if commit_write fails, then
> we may have failed the write operation despite prepare_write having
> instantiated blocks past i_size. Fix this, and consolidate the trimming into
> one place.
>
> Signed-off-by: Nick Piggin <npiggin [at] suse>
>
> Index: linux-2.6/mm/filemap.c
> ===================================================================
> --- linux-2.6.orig/mm/filemap.c
> +++ linux-2.6/mm/filemap.c
> @@ -1911,22 +1911,9 @@ generic_file_buffered_write(struct kiocb
> }
>
> status = a_ops->prepare_write(file, page, offset, offset+bytes);
> - if (unlikely(status)) {
> - loff_t isize = i_size_read(inode);
> + if (unlikely(status))
> + goto fs_write_aop_error;
May be it's stupid question but still..
Why we treat non zero prepare_write() return code as error, it may be positive.
Positive error code may be used as fine grained 'bytes' limiter in case of
blksize < pgsize as follows:

status = a_ops->prepare_write(file, page, offset, offset+bytes);
if (unlikely(status)) {
if (status > 0) {
bytes = min(bytes, status);
status = 0;
} else {
goto fs_write_aop_error;
}
}
---
This is useful because fs may want to reduce 'bytes' by number of reasons,
for example make it blksize bound.
Example : filesystem has 1k blksize and only two free blocks. And we try
write 4k bytes.
Currently write(fd, buff, 4096) will return -ENOSPC
But after this fix write(fd, buff, 4096) will return as mutch as it can (2048).
>
> - if (status != AOP_TRUNCATED_PAGE)
> - unlock_page(page);
> - page_cache_release(page);
> - if (status == AOP_TRUNCATED_PAGE)
> - continue;
> - /*
> - * prepare_write() may have instantiated a few blocks
> - * outside i_size. Trim these off again.
> - */
> - if (pos + bytes > isize)
> - vmtruncate(inode, isize);
> - break;
> - }
> if (likely(nr_segs == 1))
> copied = filemap_copy_from_user(page, offset,
> buf, bytes);
> @@ -1935,10 +1922,9 @@ generic_file_buffered_write(struct kiocb
> cur_iov, iov_offset, bytes);
> flush_dcache_page(page);
> status = a_ops->commit_write(file, page, offset, offset+bytes);
> - if (status == AOP_TRUNCATED_PAGE) {
> - page_cache_release(page);
> - continue;
> - }
> + if (unlikely(status))
> + goto fs_write_aop_error;
> +
> if (likely(copied > 0)) {
> if (!status)
> status = copied;
> @@ -1969,6 +1955,25 @@ generic_file_buffered_write(struct kiocb
> break;
> balance_dirty_pages_ratelimited(mapping);
> cond_resched();
> + continue;
> +
> +fs_write_aop_error:
> + if (status != AOP_TRUNCATED_PAGE)
> + unlock_page(page);
> + page_cache_release(page);
> +
> + /*
> + * prepare_write() may have instantiated a few blocks
> + * outside i_size. Trim these off again. Don't need
> + * i_size_read because we hold i_mutex.
> + */
> + if (pos + bytes > inode->i_size)
> + vmtruncate(inode, inode->i_size);
> + if (status == AOP_TRUNCATED_PAGE)
> + continue;
> + else
> + break;
> +
> } while (count);
> *ppos = pos;
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo [at] vger
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


a.p.zijlstra at chello

Jan 16, 2007, 9:36 AM

Post #3 of 6 (452 views)
Permalink
Re: [patch 6/10] mm: be sure to trim blocks [In reply to]

On Sat, 2007-01-13 at 04:25 +0100, Nick Piggin wrote:
> If prepare_write fails with AOP_TRUNCATED_PAGE, or if commit_write fails, then
> we may have failed the write operation despite prepare_write having
> instantiated blocks past i_size. Fix this, and consolidate the trimming into
> one place.
>
> Signed-off-by: Nick Piggin <npiggin [at] suse>
>
> Index: linux-2.6/mm/filemap.c
> ===================================================================
> --- linux-2.6.orig/mm/filemap.c
> +++ linux-2.6/mm/filemap.c
> @@ -1911,22 +1911,9 @@ generic_file_buffered_write(struct kiocb
> }
>
> status = a_ops->prepare_write(file, page, offset, offset+bytes);
> - if (unlikely(status)) {
> - loff_t isize = i_size_read(inode);
> + if (unlikely(status))
> + goto fs_write_aop_error;
>
> - if (status != AOP_TRUNCATED_PAGE)
> - unlock_page(page);
> - page_cache_release(page);
> - if (status == AOP_TRUNCATED_PAGE)
> - continue;
> - /*
> - * prepare_write() may have instantiated a few blocks
> - * outside i_size. Trim these off again.
> - */
> - if (pos + bytes > isize)
> - vmtruncate(inode, isize);
> - break;
> - }
> if (likely(nr_segs == 1))
> copied = filemap_copy_from_user(page, offset,
> buf, bytes);
> @@ -1935,10 +1922,9 @@ generic_file_buffered_write(struct kiocb
> cur_iov, iov_offset, bytes);
> flush_dcache_page(page);
> status = a_ops->commit_write(file, page, offset, offset+bytes);
> - if (status == AOP_TRUNCATED_PAGE) {
> - page_cache_release(page);
> - continue;
> - }
> + if (unlikely(status))
> + goto fs_write_aop_error;
> +

I don't think this is correct, see how status >= 0 is used a few lines
downwards. Perhaps something along the lines of an
is_positive_aop_return() to test on?

> if (likely(copied > 0)) {
> if (!status)
> status = copied;
> @@ -1969,6 +1955,25 @@ generic_file_buffered_write(struct kiocb
> break;
> balance_dirty_pages_ratelimited(mapping);
> cond_resched();
> + continue;
> +
> +fs_write_aop_error:
> + if (status != AOP_TRUNCATED_PAGE)
> + unlock_page(page);
> + page_cache_release(page);
> +
> + /*
> + * prepare_write() may have instantiated a few blocks
> + * outside i_size. Trim these off again. Don't need
> + * i_size_read because we hold i_mutex.
> + */
> + if (pos + bytes > inode->i_size)
> + vmtruncate(inode, inode->i_size);
> + if (status == AOP_TRUNCATED_PAGE)
> + continue;
> + else
> + break;
> +
> } while (count);
> *ppos = pos;


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


a.p.zijlstra at chello

Jan 16, 2007, 11:14 AM

Post #4 of 6 (475 views)
Permalink
Re: [patch 6/10] mm: be sure to trim blocks [In reply to]

On Tue, 2007-01-16 at 18:36 +0100, Peter Zijlstra wrote:
> buf, bytes);
> > @@ -1935,10 +1922,9 @@ generic_file_buffered_write(struct kiocb
> > cur_iov, iov_offset, bytes);
> > flush_dcache_page(page);
> > status = a_ops->commit_write(file, page, offset, offset+bytes);
> > - if (status == AOP_TRUNCATED_PAGE) {
> > - page_cache_release(page);
> > - continue;
> > - }
> > + if (unlikely(status))
> > + goto fs_write_aop_error;
> > +
>
> I don't think this is correct, see how status >= 0 is used a few lines
> downwards. Perhaps something along the lines of an
> is_positive_aop_return() to test on?

Hmm, if commit_write() will never return non error positive values then
this and 8/10 look sane.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


npiggin at suse

Jan 19, 2007, 7:50 PM

Post #5 of 6 (452 views)
Permalink
Re: [patch 6/10] mm: be sure to trim blocks [In reply to]

On Sun, Jan 14, 2007 at 05:25:44PM +0300, Dmitriy Monakhov wrote:
> Nick Piggin <npiggin [at] suse> writes:
>
> > If prepare_write fails with AOP_TRUNCATED_PAGE, or if commit_write fails, then
> > we may have failed the write operation despite prepare_write having
> > instantiated blocks past i_size. Fix this, and consolidate the trimming into
> > one place.
> >
> > Signed-off-by: Nick Piggin <npiggin [at] suse>
> >
> > Index: linux-2.6/mm/filemap.c
> > ===================================================================
> > --- linux-2.6.orig/mm/filemap.c
> > +++ linux-2.6/mm/filemap.c
> > @@ -1911,22 +1911,9 @@ generic_file_buffered_write(struct kiocb
> > }
> >
> > status = a_ops->prepare_write(file, page, offset, offset+bytes);
> > - if (unlikely(status)) {
> > - loff_t isize = i_size_read(inode);
> > + if (unlikely(status))
> > + goto fs_write_aop_error;
> May be it's stupid question but still..
> Why we treat non zero prepare_write() return code as error, it may be positive.
> Positive error code may be used as fine grained 'bytes' limiter in case of
> blksize < pgsize as follows:
>
> status = a_ops->prepare_write(file, page, offset, offset+bytes);
> if (unlikely(status)) {
> if (status > 0) {
> bytes = min(bytes, status);
> status = 0;
> } else {
> goto fs_write_aop_error;
> }
> }
> ---
> This is useful because fs may want to reduce 'bytes' by number of reasons,
> for example make it blksize bound.
> Example : filesystem has 1k blksize and only two free blocks. And we try
> write 4k bytes.
> Currently write(fd, buff, 4096) will return -ENOSPC
> But after this fix write(fd, buff, 4096) will return as mutch as it can (2048).

It isn't a stupid question. Hmm, while it isn't documented in vfs.txt, it
seems like some filesystems actually do this. AFFS, maybe JFFS. So good
catch, thanks.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


npiggin at suse

Jan 19, 2007, 7:52 PM

Post #6 of 6 (473 views)
Permalink
Re: [patch 6/10] mm: be sure to trim blocks [In reply to]

On Tue, Jan 16, 2007 at 08:14:16PM +0100, Peter Zijlstra wrote:
> On Tue, 2007-01-16 at 18:36 +0100, Peter Zijlstra wrote:
> > buf, bytes);
> > > @@ -1935,10 +1922,9 @@ generic_file_buffered_write(struct kiocb
> > > cur_iov, iov_offset, bytes);
> > > flush_dcache_page(page);
> > > status = a_ops->commit_write(file, page, offset, offset+bytes);
> > > - if (status == AOP_TRUNCATED_PAGE) {
> > > - page_cache_release(page);
> > > - continue;
> > > - }
> > > + if (unlikely(status))
> > > + goto fs_write_aop_error;
> > > +
> >
> > I don't think this is correct, see how status >= 0 is used a few lines
> > downwards. Perhaps something along the lines of an
> > is_positive_aop_return() to test on?
>
> Hmm, if commit_write() will never return non error positive values then
> this and 8/10 look sane.

It's really ugly, but it looks like at least some filesystems do. So
I'll fix up this.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Linux kernel RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.