Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Linux: Kernel

Re: SPAM: Re: [patch 2/5] mm: fault vs invalidate/truncate race fix

 

 

Linux kernel RSS feed   Index | Next | Previous | View Threaded


npiggin at suse

Oct 11, 2006, 9:57 AM

Post #1 of 7 (694 views)
Permalink
Re: SPAM: Re: [patch 2/5] mm: fault vs invalidate/truncate race fix

On Wed, Oct 11, 2006 at 09:21:16AM -0700, Linus Torvalds wrote:
>
>
> On Tue, 10 Oct 2006, Andrew Morton wrote:
> >
> > On Wed, 11 Oct 2006 15:39:22 +1000
> > Nick Piggin <nickpiggin [at] yahoo> wrote:
> >
> > > But I see that it does read twice. Do you want that behaviour retained? It
> > > seems like at this level it would be logical to read it once and let lower
> > > layers take care of any retries?
> >
> > argh. Linus has good-sounding reasons for retrying the pagefault-path's
> > read a single time, but I forget what they are. Something to do with
> > networked filesystems? (adds cc)
>
> Indeed. We _have_ to re-try a failed IO that we didn't start ourselves.
>
> The original IO could have been started by a person who didn't have
> permissions to actually carry it out successfully, so if you enter with
> the page locked (because somebody else started the IO), and you wait for
> the page and it's not up-to-date afterwards, you absolutely _have_ to try
> the IO, and can only return a real IO error after your _own_ IO has
> failed.

Sure, but we currently try to read _twice_, don't we?

> There is another issue too: even if the page was marked as having an error
> when we entered (and no longer locked - maybe the IO failed last time
> around), we should _still_ re-try. It might be a temporary error that has
> since gone away, and if we don't re-try, we can end up in the totally
> untenable situation where the kernel makes a soft error into a hard one.

Yes, and in that case I think the page should be !Uptodate, so no
problem there.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


torvalds at osdl

Oct 11, 2006, 10:11 AM

Post #2 of 7 (664 views)
Permalink
Re: SPAM: Re: [patch 2/5] mm: fault vs invalidate/truncate race fix [In reply to]

On Wed, 11 Oct 2006, Nick Piggin wrote:
> >
> > The original IO could have been started by a person who didn't have
> > permissions to actually carry it out successfully, so if you enter with
> > the page locked (because somebody else started the IO), and you wait for
> > the page and it's not up-to-date afterwards, you absolutely _have_ to try
> > the IO, and can only return a real IO error after your _own_ IO has
> > failed.
>
> Sure, but we currently try to read _twice_, don't we?

Well, we have the read-ahead, and then the real read. By the time we do
the real read, we have forgotten about the read-ahead details, so..

We also end up often having a _third_ one, simply because the _user_ tries
it twice: it gets a partial IO read first, and then tries to continue and
won't give up until it gets a real error.

So yes, we can end up reading it even more than twice, if only due to
standard UNIX interfaces: you always have to have one extra "read()"
system call in order to get the final error (or - much more commonly -
EOF, of course).

If we tracked the read-aheads that _we_ started, we could probably get rid
of one of them.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


npiggin at suse

Oct 11, 2006, 10:21 AM

Post #3 of 7 (661 views)
Permalink
Re: SPAM: Re: SPAM: Re: [patch 2/5] mm: fault vs invalidate/truncate race fix [In reply to]

On Wed, Oct 11, 2006 at 10:11:43AM -0700, Linus Torvalds wrote:
>
>
> On Wed, 11 Oct 2006, Nick Piggin wrote:
> > >
> > > The original IO could have been started by a person who didn't have
> > > permissions to actually carry it out successfully, so if you enter with
> > > the page locked (because somebody else started the IO), and you wait for
> > > the page and it's not up-to-date afterwards, you absolutely _have_ to try
> > > the IO, and can only return a real IO error after your _own_ IO has
> > > failed.
> >
> > Sure, but we currently try to read _twice_, don't we?
>
> Well, we have the read-ahead, and then the real read. By the time we do
> the real read, we have forgotten about the read-ahead details, so..

I mean filemap_nopage does *two* synchronous reads when finding a !uptodate
page. This is despite the comment saying that it retries once on error.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


torvalds at osdl

Oct 11, 2006, 10:38 AM

Post #4 of 7 (661 views)
Permalink
Re: [patch 2/5] mm: fault vs invalidate/truncate race fix [In reply to]

On Wed, 11 Oct 2006, Nick Piggin wrote:
>
> I mean filemap_nopage does *two* synchronous reads when finding a !uptodate
> page. This is despite the comment saying that it retries once on error.

Ahh.

Yes, now that you point to the actual code, that does look ugly.

I think it's related to the

ClearPageError(page);

thing, and probably related to that function being rather old and having
gone through several re-organizations. I suspect we used to fall through
to the error handling code regardless of whether we did the read ourselves
etc.

Are you saying that something like this would be preferable?

Linus

---
diff --git a/mm/filemap.c b/mm/filemap.c
index 3464b68..e5ecf42 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1496,6 +1496,8 @@ page_not_uptodate:
goto success;
}

+ /* Clear any potential old errors, and try to read.. */
+ ClearPageError(page);
error = mapping->a_ops->readpage(file, page);
if (!error) {
wait_on_page_locked(page);
@@ -1526,21 +1528,12 @@ page_not_uptodate:
unlock_page(page);
goto success;
}
- ClearPageError(page);
- error = mapping->a_ops->readpage(file, page);
- if (!error) {
- wait_on_page_locked(page);
- if (PageUptodate(page))
- goto success;
- } else if (error == AOP_TRUNCATED_PAGE) {
- page_cache_release(page);
- goto retry_find;
- }

/*
* Things didn't work out. Return zero to tell the
* mm layer so, possibly freeing the page cache page first.
*/
+ unlock_page(page);
shrink_readahead_size_eio(file, ra);
page_cache_release(page);
return NOPAGE_SIGBUS;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


npiggin at suse

Oct 11, 2006, 8:33 PM

Post #5 of 7 (643 views)
Permalink
Re: [patch 2/5] mm: fault vs invalidate/truncate race fix [In reply to]

On Wed, Oct 11, 2006 at 10:38:31AM -0700, Linus Torvalds wrote:
>
>
> On Wed, 11 Oct 2006, Nick Piggin wrote:
> >
> > I mean filemap_nopage does *two* synchronous reads when finding a !uptodate
> > page. This is despite the comment saying that it retries once on error.
>
> Ahh.
>
> Yes, now that you point to the actual code, that does look ugly.
>
> I think it's related to the
>
> ClearPageError(page);
>
> thing, and probably related to that function being rather old and having
> gone through several re-organizations. I suspect we used to fall through
> to the error handling code regardless of whether we did the read ourselves
> etc.

Yeah, it may have even been a mismerge at some point in time.

> Are you saying that something like this would be preferable?

I think so, it is neater and clearer. I actually didn't even bother relocking
and checking the page again on readpage error so got rid of quite a bit of
code.

>
> Linus
>
> ---
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 3464b68..e5ecf42 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1496,6 +1496,8 @@ page_not_uptodate:
> goto success;
> }
>
> + /* Clear any potential old errors, and try to read.. */
> + ClearPageError(page);
> error = mapping->a_ops->readpage(file, page);
> if (!error) {
> wait_on_page_locked(page);
> @@ -1526,21 +1528,12 @@ page_not_uptodate:
> unlock_page(page);
> goto success;
> }
> - ClearPageError(page);
> - error = mapping->a_ops->readpage(file, page);
> - if (!error) {
> - wait_on_page_locked(page);
> - if (PageUptodate(page))
> - goto success;
> - } else if (error == AOP_TRUNCATED_PAGE) {
> - page_cache_release(page);
> - goto retry_find;
> - }
>
> /*
> * Things didn't work out. Return zero to tell the
> * mm layer so, possibly freeing the page cache page first.
> */
> + unlock_page(page);
> shrink_readahead_size_eio(file, ra);
> page_cache_release(page);
> return NOPAGE_SIGBUS;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


torvalds at osdl

Oct 12, 2006, 8:37 AM

Post #6 of 7 (668 views)
Permalink
Re: [patch 2/5] mm: fault vs invalidate/truncate race fix [In reply to]

On Thu, 12 Oct 2006, Nick Piggin wrote:
>
> > Are you saying that something like this would be preferable?
>
> I think so, it is neater and clearer. I actually didn't even bother relocking
> and checking the page again on readpage error so got rid of quite a bit of
> code.

Well, the readpage error should be rare (and for the _normal_ case we just
do the "wait_on_page_locked()" thing). And I think we should lock the page
in order to do the truncation check, no?

But I don't have any really strong feelings. I'm certainly ok with the
patch I sent out. How about putting it through -mm? Here's my sign-off:

Signed-off-by: Linus Torvalds <torvalds [at] osdl>

if you want to send it off to Andrew (or if Andrew wants to just take it
himself ;)

Btw, how did you even notice this? Just by reading the source, or because
you actually saw multiple errors reported?

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


npiggin at suse

Oct 12, 2006, 8:40 AM

Post #7 of 7 (665 views)
Permalink
Re: [patch 2/5] mm: fault vs invalidate/truncate race fix [In reply to]

On Thu, Oct 12, 2006 at 08:37:39AM -0700, Linus Torvalds wrote:
>
>
> On Thu, 12 Oct 2006, Nick Piggin wrote:
> >
> > > Are you saying that something like this would be preferable?
> >
> > I think so, it is neater and clearer. I actually didn't even bother relocking
> > and checking the page again on readpage error so got rid of quite a bit of
> > code.
>
> Well, the readpage error should be rare (and for the _normal_ case we just
> do the "wait_on_page_locked()" thing). And I think we should lock the page
> in order to do the truncation check, no?

Definitely.

> But I don't have any really strong feelings. I'm certainly ok with the
> patch I sent out. How about putting it through -mm? Here's my sign-off:
>
> Signed-off-by: Linus Torvalds <torvalds [at] osdl>
>
> if you want to send it off to Andrew (or if Andrew wants to just take it
> himself ;)

OK... maybe it can wait till the other changes, and we can think about
it then. I'll carry around the split out patct, though.

> Btw, how did you even notice this? Just by reading the source, or because
> you actually saw multiple errors reported?

Reading the source, thinking about the cleanups we can do if filemap_nopage
takes the page lock...

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Linux kernel RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.