Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Linux: Kernel

Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock

 

 

Linux kernel RSS feed   Index | Next | Previous | View Threaded


rmh3093 at gmail

Jul 3, 2008, 11:26 PM

Post #1 of 17 (277 views)
Permalink
Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock

I have tested this with 2.6.26-rc5-mm3 and with 2.6.26-rc8 w/ the
get_user_pages_fast patches from -rc5-mm3... Xorg 1.4.99.* will start
to load but hangs at a black screen. At this point, I can not switch
to another tty. When I try pressing ctrl+alt+del the kernel ooopses
and the caps lock led starts to blink. This happens using the nv,
radeon and radeonhd drivers (the nv was tested on another box
obviously). I have also tried to unselect HAVE_GET_USER_PAGES_FAST in
my kernel config but this does not help. I can not figure out where or
what the bug is. I can provide any other info you guys need to figure
this out. Let me know what I can do.

-Ryan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo[at]vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


zlynx at acm

Jul 4, 2008, 9:29 AM

Post #2 of 17 (253 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

Ryan Hope wrote:
> I have tested this with 2.6.26-rc5-mm3 and with 2.6.26-rc8 w/ the
> get_user_pages_fast patches from -rc5-mm3... Xorg 1.4.99.* will start
> to load but hangs at a black screen. At this point, I can not switch
> to another tty. When I try pressing ctrl+alt+del the kernel ooopses
> and the caps lock led starts to blink. This happens using the nv,
> radeon and radeonhd drivers (the nv was tested on another box
> obviously). I have also tried to unselect HAVE_GET_USER_PAGES_FAST in
> my kernel config but this does not help. I can not figure out where or
> what the bug is. I can provide any other info you guys need to figure
> this out. Let me know what I can do.


I think that I've seen this too on 2.6.26-rc8 on my laptop. It's 64-bit
AMD-64 single core (although I run a SMP-alternatives kernel on it) and
I was using the nv driver.

The symptoms are the same. Everything was working great until I started
X. I SSH'd in and got the dmesg after X locked up. The full dmesg and
config are attached because Thunderbird is a very stupid program and
won't let me paste without wrapping.

Here is a bit of what I got though.

[ 269.224276] BUG: unable to handle kernel paging request at
ffffe20003480000
[ 269.224291] IP: [<ffffffff80297e30>] copy_page_range+0x520/0x760
[ 269.224306] PGD 1102067 PUD 1103067 PMD 0
[ 269.224315] Oops: 0000 [1] SMP

[ 269.224473] Call Trace:
[ 269.224495] [<ffffffff80239a9b>] dup_mm+0x26b/0x3c0
[ 269.224507] [<ffffffff8023a84c>] copy_process+0xc2c/0x1210
[ 269.224518] [<ffffffff8023aea3>] do_fork+0x73/0x310
[ 269.224526] [<ffffffff8024719e>] sys_rt_sigaction+0x8e/0xd0
[ 269.224536] [<ffffffff8020c2db>] system_call_after_swapgs+0x7b/0x80
[ 269.224542] [<ffffffff8020c5d7>] ptregscall_common+0x67/0xb0
Attachments: dmesg.log (46.4 KB)
  .config (49.5 KB)


hugh at veritas

Jul 4, 2008, 1:29 PM

Post #3 of 17 (254 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

On Fri, 4 Jul 2008, Zan Lynx wrote:
> Ryan Hope wrote:
> > I have tested this with 2.6.26-rc5-mm3 and with 2.6.26-rc8 w/ the
> > get_user_pages_fast patches from -rc5-mm3... Xorg 1.4.99.* will start
> > to load but hangs at a black screen. At this point, I can not switch
> > to another tty. When I try pressing ctrl+alt+del the kernel ooopses
> > and the caps lock led starts to blink. This happens using the nv,
> > radeon and radeonhd drivers (the nv was tested on another box
> > obviously). I have also tried to unselect HAVE_GET_USER_PAGES_FAST in
> > my kernel config but this does not help. I can not figure out where or
> > what the bug is. I can provide any other info you guys need to figure
> > this out. Let me know what I can do.
>
> I think that I've seen this too on 2.6.26-rc8 on my laptop. It's 64-bit

Your attachment tells us it's actually 2.6.26-rc8-mm1, less of a worry.

> AMD-64 single core (although I run a SMP-alternatives kernel on it) and I was
> using the nv driver.
>
> The symptoms are the same. Everything was working great until I started X. I
> SSH'd in and got the dmesg after X locked up. The full dmesg and config are
> attached because Thunderbird is a very stupid program and won't let me paste
> without wrapping.
>
> Here is a bit of what I got though.
>
> [ 269.224276] BUG: unable to handle kernel paging request at ffffe20003480000
> [ 269.224291] IP: [<ffffffff80297e30>] copy_page_range+0x520/0x760
> [ 269.224306] PGD 1102067 PUD 1103067 PMD 0
> [ 269.224315] Oops: 0000 [1] SMP
>
> [ 269.224473] Call Trace:
> [ 269.224495] [<ffffffff80239a9b>] dup_mm+0x26b/0x3c0
> [ 269.224507] [<ffffffff8023a84c>] copy_process+0xc2c/0x1210
> [ 269.224518] [<ffffffff8023aea3>] do_fork+0x73/0x310
> [ 269.224526] [<ffffffff8024719e>] sys_rt_sigaction+0x8e/0xd0
> [ 269.224536] [<ffffffff8020c2db>] system_call_after_swapgs+0x7b/0x80
> [ 269.224542] [<ffffffff8020c5d7>] ptregscall_common+0x67/0xb0

Useful info, thank you; even more useful was the Code line in your attachment

> [ 269.224557] Code: 00 00 48 b9 00 00 00 00 00 e2 ff ff 48 21 d8 48 c1 e8 0c 48 8d 14 c5 00 00 00 00 48 c1 e0 06 48 29 d0 48 01 c1 0f 84 10 ff ff ff <48> 8b 01 48 89 ca f6 c4 40 74 04 48 8b 51 10 90 ff 42 08 90 ff

which is enough to identify the oops as in copy_one_pte's get_page(page).
Here's a patch I think we need, which I'm hoping will fix both your
crashes - please let us know - thanks a lot.

Stop mprotect's pte_modify from wiping out the x86 pte_special bit, which
caused oops thereafter when vm_normal_page thought X's abnormal was normal.

Signed-off-by: Hugh Dickins <hugh[at]veritas.com>
---
Fix to 2.6.26-rc8-mm1 x86-implement-pte_special.patch
Perhaps something similar needed for powerpc? Nick will know.

include/asm-x86/pgtable.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- 2.6.26-rc8-mm1/include/asm-x86/pgtable.h 2008-07-03 11:34:55.000000000 +0100
+++ linux/include/asm-x86/pgtable.h 2008-07-04 20:58:36.000000000 +0100
@@ -57,7 +57,7 @@

/* Set of bits not changed in pte_modify */
#define _PAGE_CHG_MASK (PTE_MASK | _PAGE_PCD | _PAGE_PWT | \
- _PAGE_ACCESSED | _PAGE_DIRTY)
+ _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY)

#define _PAGE_CACHE_MASK (_PAGE_PCD | _PAGE_PWT)
#define _PAGE_CACHE_WB (0)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo[at]vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


rmh3093 at gmail

Jul 4, 2008, 10:26 PM

Post #4 of 17 (246 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

yep this fixes my issue, thanks

On Fri, Jul 4, 2008 at 4:29 PM, Hugh Dickins <hugh[at]veritas.com> wrote:
> On Fri, 4 Jul 2008, Zan Lynx wrote:
>> Ryan Hope wrote:
>> > I have tested this with 2.6.26-rc5-mm3 and with 2.6.26-rc8 w/ the
>> > get_user_pages_fast patches from -rc5-mm3... Xorg 1.4.99.* will start
>> > to load but hangs at a black screen. At this point, I can not switch
>> > to another tty. When I try pressing ctrl+alt+del the kernel ooopses
>> > and the caps lock led starts to blink. This happens using the nv,
>> > radeon and radeonhd drivers (the nv was tested on another box
>> > obviously). I have also tried to unselect HAVE_GET_USER_PAGES_FAST in
>> > my kernel config but this does not help. I can not figure out where or
>> > what the bug is. I can provide any other info you guys need to figure
>> > this out. Let me know what I can do.
>>
>> I think that I've seen this too on 2.6.26-rc8 on my laptop. It's 64-bit
>
> Your attachment tells us it's actually 2.6.26-rc8-mm1, less of a worry.
>
>> AMD-64 single core (although I run a SMP-alternatives kernel on it) and I was
>> using the nv driver.
>>
>> The symptoms are the same. Everything was working great until I started X. I
>> SSH'd in and got the dmesg after X locked up. The full dmesg and config are
>> attached because Thunderbird is a very stupid program and won't let me paste
>> without wrapping.
>>
>> Here is a bit of what I got though.
>>
>> [ 269.224276] BUG: unable to handle kernel paging request at ffffe20003480000
>> [ 269.224291] IP: [<ffffffff80297e30>] copy_page_range+0x520/0x760
>> [ 269.224306] PGD 1102067 PUD 1103067 PMD 0
>> [ 269.224315] Oops: 0000 [1] SMP
>>
>> [ 269.224473] Call Trace:
>> [ 269.224495] [<ffffffff80239a9b>] dup_mm+0x26b/0x3c0
>> [ 269.224507] [<ffffffff8023a84c>] copy_process+0xc2c/0x1210
>> [ 269.224518] [<ffffffff8023aea3>] do_fork+0x73/0x310
>> [ 269.224526] [<ffffffff8024719e>] sys_rt_sigaction+0x8e/0xd0
>> [ 269.224536] [<ffffffff8020c2db>] system_call_after_swapgs+0x7b/0x80
>> [ 269.224542] [<ffffffff8020c5d7>] ptregscall_common+0x67/0xb0
>
> Useful info, thank you; even more useful was the Code line in your attachment
>
>> [ 269.224557] Code: 00 00 48 b9 00 00 00 00 00 e2 ff ff 48 21 d8 48 c1 e8 0c 48 8d 14 c5 00 00 00 00 48 c1 e0 06 48 29 d0 48 01 c1 0f 84 10 ff ff ff <48> 8b 01 48 89 ca f6 c4 40 74 04 48 8b 51 10 90 ff 42 08 90 ff
>
> which is enough to identify the oops as in copy_one_pte's get_page(page).
> Here's a patch I think we need, which I'm hoping will fix both your
> crashes - please let us know - thanks a lot.
>
> Stop mprotect's pte_modify from wiping out the x86 pte_special bit, which
> caused oops thereafter when vm_normal_page thought X's abnormal was normal.
>
> Signed-off-by: Hugh Dickins <hugh[at]veritas.com>
> ---
> Fix to 2.6.26-rc8-mm1 x86-implement-pte_special.patch
> Perhaps something similar needed for powerpc? Nick will know.
>
> include/asm-x86/pgtable.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- 2.6.26-rc8-mm1/include/asm-x86/pgtable.h 2008-07-03 11:34:55.000000000 +0100
> +++ linux/include/asm-x86/pgtable.h 2008-07-04 20:58:36.000000000 +0100
> @@ -57,7 +57,7 @@
>
> /* Set of bits not changed in pte_modify */
> #define _PAGE_CHG_MASK (PTE_MASK | _PAGE_PCD | _PAGE_PWT | \
> - _PAGE_ACCESSED | _PAGE_DIRTY)
> + _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY)
>
> #define _PAGE_CACHE_MASK (_PAGE_PCD | _PAGE_PWT)
> #define _PAGE_CACHE_WB (0)
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo[at]vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


zlynx at acm

Jul 6, 2008, 2:03 PM

Post #5 of 17 (246 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

Hugh Dickins wrote:
> On Fri, 4 Jul 2008, Zan Lynx wrote:
[cut]
> Useful info, thank you; even more useful was the Code line in your attachment
>
>> [ 269.224557] Code: 00 00 48 b9 00 00 00 00 00 e2 ff ff 48 21 d8 48 c1 e8 0c 48 8d 14 c5 00 00 00 00 48 c1 e0 06 48 29 d0 48 01 c1 0f 84 10 ff ff ff <48> 8b 01 48 89 ca f6 c4 40 74 04 48 8b 51 10 90 ff 42 08 90 ff
>
> which is enough to identify the oops as in copy_one_pte's get_page(page).
> Here's a patch I think we need, which I'm hoping will fix both your
> crashes - please let us know - thanks a lot.
>
> Stop mprotect's pte_modify from wiping out the x86 pte_special bit, which
> caused oops thereafter when vm_normal_page thought X's abnormal was normal.
[cut patch]

I can report that this fixes things for me as well. Thank you.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo[at]vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


nickpiggin at yahoo

Jul 7, 2008, 12:01 AM

Post #6 of 17 (245 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

On Saturday 05 July 2008 06:29, Hugh Dickins wrote:
> On Fri, 4 Jul 2008, Zan Lynx wrote:
> > Ryan Hope wrote:
> > > I have tested this with 2.6.26-rc5-mm3 and with 2.6.26-rc8 w/ the
> > > get_user_pages_fast patches from -rc5-mm3... Xorg 1.4.99.* will start
> > > to load but hangs at a black screen. At this point, I can not switch
> > > to another tty. When I try pressing ctrl+alt+del the kernel ooopses
> > > and the caps lock led starts to blink. This happens using the nv,
> > > radeon and radeonhd drivers (the nv was tested on another box
> > > obviously). I have also tried to unselect HAVE_GET_USER_PAGES_FAST in
> > > my kernel config but this does not help. I can not figure out where or
> > > what the bug is. I can provide any other info you guys need to figure
> > > this out. Let me know what I can do.
> >
> > I think that I've seen this too on 2.6.26-rc8 on my laptop. It's 64-bit
>
> Your attachment tells us it's actually 2.6.26-rc8-mm1, less of a worry.
>
> > AMD-64 single core (although I run a SMP-alternatives kernel on it) and I
> > was using the nv driver.
> >
> > The symptoms are the same. Everything was working great until I started
> > X. I SSH'd in and got the dmesg after X locked up. The full dmesg and
> > config are attached because Thunderbird is a very stupid program and
> > won't let me paste without wrapping.
> >
> > Here is a bit of what I got though.
> >
> > [ 269.224276] BUG: unable to handle kernel paging request at
> > ffffe20003480000 [ 269.224291] IP: [<ffffffff80297e30>]
> > copy_page_range+0x520/0x760 [ 269.224306] PGD 1102067 PUD 1103067 PMD 0
> > [ 269.224315] Oops: 0000 [1] SMP
> >
> > [ 269.224473] Call Trace:
> > [ 269.224495] [<ffffffff80239a9b>] dup_mm+0x26b/0x3c0
> > [ 269.224507] [<ffffffff8023a84c>] copy_process+0xc2c/0x1210
> > [ 269.224518] [<ffffffff8023aea3>] do_fork+0x73/0x310
> > [ 269.224526] [<ffffffff8024719e>] sys_rt_sigaction+0x8e/0xd0
> > [ 269.224536] [<ffffffff8020c2db>] system_call_after_swapgs+0x7b/0x80
> > [ 269.224542] [<ffffffff8020c5d7>] ptregscall_common+0x67/0xb0
>
> Useful info, thank you; even more useful was the Code line in your
> attachment
>
> > [ 269.224557] Code: 00 00 48 b9 00 00 00 00 00 e2 ff ff 48 21 d8 48 c1
> > e8 0c 48 8d 14 c5 00 00 00 00 48 c1 e0 06 48 29 d0 48 01 c1 0f 84 10 ff
> > ff ff <48> 8b 01 48 89 ca f6 c4 40 74 04 48 8b 51 10 90 ff 42 08 90 ff
>
> which is enough to identify the oops as in copy_one_pte's get_page(page).
> Here's a patch I think we need, which I'm hoping will fix both your
> crashes - please let us know - thanks a lot.
>
> Stop mprotect's pte_modify from wiping out the x86 pte_special bit, which
> caused oops thereafter when vm_normal_page thought X's abnormal was normal.
>
> Signed-off-by: Hugh Dickins <hugh[at]veritas.com>

(back from vaccation)

Great, thanks Hugh. Always nice to read the fix before the bug report :)
powerpc should need it too, as well as an equivalent for s390.

Thanks,
Nick

> ---
> Fix to 2.6.26-rc8-mm1 x86-implement-pte_special.patch
> Perhaps something similar needed for powerpc? Nick will know.
>
> include/asm-x86/pgtable.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- 2.6.26-rc8-mm1/include/asm-x86/pgtable.h 2008-07-03 11:34:55.000000000
> +0100 +++ linux/include/asm-x86/pgtable.h 2008-07-04 20:58:36.000000000
> +0100 @@ -57,7 +57,7 @@
>
> /* Set of bits not changed in pte_modify */
> #define _PAGE_CHG_MASK (PTE_MASK | _PAGE_PCD | _PAGE_PWT | \
> - _PAGE_ACCESSED | _PAGE_DIRTY)
> + _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY)
>
> #define _PAGE_CACHE_MASK (_PAGE_PCD | _PAGE_PWT)
> #define _PAGE_CACHE_WB (0)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo[at]vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


nickpiggin at yahoo

Jul 7, 2008, 12:55 AM

Post #7 of 17 (245 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

On Saturday 05 July 2008 06:29, Hugh Dickins wrote:

> Fix to 2.6.26-rc8-mm1 x86-implement-pte_special.patch
> Perhaps something similar needed for powerpc? Nick will know.

Here is the required fix for powerpc-implement-pte_special.patch
Attachments: powerpc-implement-pte_special-fix.patch (0.82 KB)


nickpiggin at yahoo

Jul 7, 2008, 1:06 AM

Post #8 of 17 (238 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

On Saturday 05 July 2008 06:29, Hugh Dickins wrote:
> On Fri, 4 Jul 2008, Zan Lynx wrote:
> > Ryan Hope wrote:
> > > I have tested this with 2.6.26-rc5-mm3 and with 2.6.26-rc8 w/ the
> > > get_user_pages_fast patches from -rc5-mm3... Xorg 1.4.99.* will start
> > > to load but hangs at a black screen. At this point, I can not switch
> > > to another tty. When I try pressing ctrl+alt+del the kernel ooopses
> > > and the caps lock led starts to blink. This happens using the nv,
> > > radeon and radeonhd drivers (the nv was tested on another box
> > > obviously). I have also tried to unselect HAVE_GET_USER_PAGES_FAST in
> > > my kernel config but this does not help. I can not figure out where or
> > > what the bug is. I can provide any other info you guys need to figure
> > > this out. Let me know what I can do.
> >
> > I think that I've seen this too on 2.6.26-rc8 on my laptop. It's 64-bit
>
> Your attachment tells us it's actually 2.6.26-rc8-mm1, less of a worry.
>
> > AMD-64 single core (although I run a SMP-alternatives kernel on it) and I
> > was using the nv driver.
> >
> > The symptoms are the same. Everything was working great until I started
> > X. I SSH'd in and got the dmesg after X locked up. The full dmesg and
> > config are attached because Thunderbird is a very stupid program and
> > won't let me paste without wrapping.
> >
> > Here is a bit of what I got though.
> >
> > [ 269.224276] BUG: unable to handle kernel paging request at
> > ffffe20003480000 [ 269.224291] IP: [<ffffffff80297e30>]
> > copy_page_range+0x520/0x760 [ 269.224306] PGD 1102067 PUD 1103067 PMD 0
> > [ 269.224315] Oops: 0000 [1] SMP
> >
> > [ 269.224473] Call Trace:
> > [ 269.224495] [<ffffffff80239a9b>] dup_mm+0x26b/0x3c0
> > [ 269.224507] [<ffffffff8023a84c>] copy_process+0xc2c/0x1210
> > [ 269.224518] [<ffffffff8023aea3>] do_fork+0x73/0x310
> > [ 269.224526] [<ffffffff8024719e>] sys_rt_sigaction+0x8e/0xd0
> > [ 269.224536] [<ffffffff8020c2db>] system_call_after_swapgs+0x7b/0x80
> > [ 269.224542] [<ffffffff8020c5d7>] ptregscall_common+0x67/0xb0
>
> Useful info, thank you; even more useful was the Code line in your
> attachment
>
> > [ 269.224557] Code: 00 00 48 b9 00 00 00 00 00 e2 ff ff 48 21 d8 48 c1
> > e8 0c 48 8d 14 c5 00 00 00 00 48 c1 e0 06 48 29 d0 48 01 c1 0f 84 10 ff
> > ff ff <48> 8b 01 48 89 ca f6 c4 40 74 04 48 8b 51 10 90 ff 42 08 90 ff
>
> which is enough to identify the oops as in copy_one_pte's get_page(page).
> Here's a patch I think we need, which I'm hoping will fix both your
> crashes - please let us know - thanks a lot.
>
> Stop mprotect's pte_modify from wiping out the x86 pte_special bit, which
> caused oops thereafter when vm_normal_page thought X's abnormal was normal.
>
> Signed-off-by: Hugh Dickins <hugh[at]veritas.com>
> ---
> Fix to 2.6.26-rc8-mm1 x86-implement-pte_special.patch
> Perhaps something similar needed for powerpc? Nick will know.
>
> include/asm-x86/pgtable.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- 2.6.26-rc8-mm1/include/asm-x86/pgtable.h 2008-07-03 11:34:55.000000000
> +0100 +++ linux/include/asm-x86/pgtable.h 2008-07-04 20:58:36.000000000
> +0100 @@ -57,7 +57,7 @@
>
> /* Set of bits not changed in pte_modify */
> #define _PAGE_CHG_MASK (PTE_MASK | _PAGE_PCD | _PAGE_PWT | \
> - _PAGE_ACCESSED | _PAGE_DIRTY)
> + _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY)
>
> #define _PAGE_CACHE_MASK (_PAGE_PCD | _PAGE_PWT)
> #define _PAGE_CACHE_WB (0)

I think we need a similar fix for s390 too. If so, then it really should
get into 2.6.26, but this late in the release, I hope an s390 maintainer
might be able to test and verify the fix?

Thanks,
Nick
Attachments: s390-implement-pte_special-fix.patch (0.99 KB)


hugh at veritas

Jul 7, 2008, 3:39 AM

Post #9 of 17 (237 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

On Mon, 7 Jul 2008, Nick Piggin wrote:
>
> I think we need a similar fix for s390 too. If so, then it really should
> get into 2.6.26, but this late in the release, I hope an s390 maintainer
> might be able to test and verify the fix?

Wow, yes, I hadn't realized s390 is ahead of the game there: glad you're
back to spot that. But yes, we'd prefer maintainer to confirm and push.


[PATCH]] s390: protect _PAGE_SPECIAL bit against mprotect

Stop mprotect's pte_modify from wiping out the s390 pte_special bit, which
caused oops thereafter when vm_normal_page thought X's abnormal was normal.

Signed-off-by: Nick Piggin <npiggin[at]suse.de>
Acked-by: Hugh Dickins <hugh[at]veritas.com>
---
Index: linux-2.6/include/asm-s390/pgtable.h
===================================================================
--- linux-2.6.orig/include/asm-s390/pgtable.h
+++ linux-2.6/include/asm-s390/pgtable.h
@@ -223,6 +223,9 @@ extern char empty_zero_page[PAGE_SIZE];
#define _PAGE_SPECIAL 0x004 /* SW associated with special page */
#define __HAVE_ARCH_PTE_SPECIAL

+/* Set of bits not changed in pte_modify */
+#define _PAGE_CHG_MASK (PAGE_MASK | _PAGE_SPECIAL)
+
/* Six different types of pages. */
#define _PAGE_TYPE_EMPTY 0x400
#define _PAGE_TYPE_NONE 0x401
@@ -681,7 +684,7 @@ static inline void pte_clear(struct mm_s
*/
static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
{
- pte_val(pte) &= PAGE_MASK;
+ pte_val(pte) &= _PAGE_CHG_MASK;
pte_val(pte) |= pgprot_val(newprot);
return pte;
}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo[at]vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


schwidefsky at googlemail

Jul 7, 2008, 4:08 AM

Post #10 of 17 (237 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

On Mon, Jul 7, 2008 at 12:39 PM, Hugh Dickins <hugh[at]veritas.com> wrote:
> On Mon, 7 Jul 2008, Nick Piggin wrote:
>>
>> I think we need a similar fix for s390 too. If so, then it really should
>> get into 2.6.26, but this late in the release, I hope an s390 maintainer
>> might be able to test and verify the fix?
>
> Wow, yes, I hadn't realized s390 is ahead of the game there: glad you're
> back to spot that. But yes, we'd prefer maintainer to confirm and push.
>
>
> [PATCH]] s390: protect _PAGE_SPECIAL bit against mprotect
>
> Stop mprotect's pte_modify from wiping out the s390 pte_special bit, which
> caused oops thereafter when vm_normal_page thought X's abnormal was normal.
>
> Signed-off-by: Nick Piggin <npiggin[at]suse.de>
> Acked-by: Hugh Dickins <hugh[at]veritas.com>

s390 will definitely need this. Loosing the pte special bit is not
good. I'll prepate
a please pull right away. Thanks of pointing this out.

--
blue skies,
Martin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo[at]vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


nickpiggin at yahoo

Jul 7, 2008, 4:39 AM

Post #11 of 17 (232 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

On Monday 07 July 2008 20:39, Hugh Dickins wrote:
> On Mon, 7 Jul 2008, Nick Piggin wrote:
> > I think we need a similar fix for s390 too. If so, then it really should
> > get into 2.6.26, but this late in the release, I hope an s390 maintainer
> > might be able to test and verify the fix?
>
> Wow, yes, I hadn't realized s390 is ahead of the game there: glad you're
> back to spot that. But yes, we'd prefer maintainer to confirm and push.
>
>
> [PATCH]] s390: protect _PAGE_SPECIAL bit against mprotect
>
> Stop mprotect's pte_modify from wiping out the s390 pte_special bit, which
> caused oops thereafter when vm_normal_page thought X's abnormal was normal.
>
> Signed-off-by: Nick Piggin <npiggin[at]suse.de>
> Acked-by: Hugh Dickins <hugh[at]veritas.com>

Thanks, I feel silly to take the authorship of this before your x86
version gets in (and will likely not be credited if it is folded
before merging)

Martin, could you please credit Hugh for the debugging? :)

Thanks,

> ---
> Index: linux-2.6/include/asm-s390/pgtable.h
> ===================================================================
> --- linux-2.6.orig/include/asm-s390/pgtable.h
> +++ linux-2.6/include/asm-s390/pgtable.h
> @@ -223,6 +223,9 @@ extern char empty_zero_page[PAGE_SIZE];
> #define _PAGE_SPECIAL 0x004 /* SW associated with special page */
> #define __HAVE_ARCH_PTE_SPECIAL
>
> +/* Set of bits not changed in pte_modify */
> +#define _PAGE_CHG_MASK (PAGE_MASK | _PAGE_SPECIAL)
> +
> /* Six different types of pages. */
> #define _PAGE_TYPE_EMPTY 0x400
> #define _PAGE_TYPE_NONE 0x401
> @@ -681,7 +684,7 @@ static inline void pte_clear(struct mm_s
> */
> static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
> {
> - pte_val(pte) &= PAGE_MASK;
> + pte_val(pte) &= _PAGE_CHG_MASK;
> pte_val(pte) |= pgprot_val(newprot);
> return pte;
> }
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo[at]vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


hugh at veritas

Jul 7, 2008, 5:02 AM

Post #12 of 17 (232 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

On Mon, 7 Jul 2008, Nick Piggin wrote:
> On Monday 07 July 2008 20:39, Hugh Dickins wrote:
> > On Mon, 7 Jul 2008, Nick Piggin wrote:
> > > I think we need a similar fix for s390 too. If so, then it really should
> > > get into 2.6.26, but this late in the release, I hope an s390 maintainer
> > > might be able to test and verify the fix?
> >
> > Wow, yes, I hadn't realized s390 is ahead of the game there: glad you're
> > back to spot that. But yes, we'd prefer maintainer to confirm and push.
> >
> >
> > [PATCH]] s390: protect _PAGE_SPECIAL bit against mprotect
> >
> > Stop mprotect's pte_modify from wiping out the s390 pte_special bit, which
> > caused oops thereafter when vm_normal_page thought X's abnormal was normal.
> >
> > Signed-off-by: Nick Piggin <npiggin[at]suse.de>
> > Acked-by: Hugh Dickins <hugh[at]veritas.com>
>
> Thanks, I feel silly to take the authorship of this before your x86
> version gets in (and will likely not be credited if it is folded
> before merging)
>
> Martin, could you please credit Hugh for the debugging? :)

Oh, I'm more anxious for your fix to get to Linus than for credit
to be fairly apportioned - we don't need an Oscar ceremony here!

But if Martin does choose to add credits, then it's Ryan's and
Zan's x86 reports that should be credited - they did the hard work.
I didn't mention them in the x86 patch because at that stage I was
just making a wild guess that this might be the cause of their problems.

Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo[at]vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


cotte at de

Jul 7, 2008, 8:43 AM

Post #13 of 17 (223 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

Nick Piggin wrote:
> I think we need a similar fix for s390 too. If so, then it really should
> get into 2.6.26, but this late in the release, I hope an s390 maintainer
> might be able to test and verify the fix?
I've done my best to combine mprotect, mmap and munmap a MAP_PRIVATE
mapping on a xip file system. The system runs stable with and without
this patch. Could someone please enlighten me on how to reproduce the
problem so that I can verify the fix?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo[at]vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


rmh3093 at gmail

Jul 7, 2008, 9:16 AM

Post #14 of 17 (224 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

Have you tried running Xorg 1.4.99.905?

On Mon, Jul 7, 2008 at 11:43 AM, Carsten Otte <cotte[at]de.ibm.com> wrote:
> Nick Piggin wrote:
>>
>> I think we need a similar fix for s390 too. If so, then it really should
>> get into 2.6.26, but this late in the release, I hope an s390 maintainer
>> might be able to test and verify the fix?
>
> I've done my best to combine mprotect, mmap and munmap a MAP_PRIVATE mapping
> on a xip file system. The system runs stable with and without this patch.
> Could someone please enlighten me on how to reproduce the problem so that I
> can verify the fix?
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo[at]vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


cotte at de

Jul 7, 2008, 9:38 AM

Post #15 of 17 (226 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

Ryan Hope wrote:
> Have you tried running Xorg 1.4.99.905?
I guess you mean the xserver? See, we don't have PCI and thus don't
have any graphics hardware - it's a server platform ;-).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo[at]vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


rmh3093 at gmail

Jul 7, 2008, 10:01 AM

Post #16 of 17 (227 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

Well it was only the new xorg-server that caused the bug for me. I
can't reproduce it else where....

On Mon, Jul 7, 2008 at 12:38 PM, Carsten Otte <cotte[at]de.ibm.com> wrote:
> Ryan Hope wrote:
>>
>> Have you tried running Xorg 1.4.99.905?
>
> I guess you mean the xserver? See, we don't have PCI and thus don't have any
> graphics hardware - it's a server platform ;-).
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo[at]vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


hugh at veritas

Jul 7, 2008, 10:48 AM

Post #17 of 17 (218 views)
Permalink
Re: Lockless/Get_User_Pages_Fast causes Xorg 1.4.99.* to lock [In reply to]

On Mon, 7 Jul 2008, Carsten Otte wrote:
> Nick Piggin wrote:
> > I think we need a similar fix for s390 too. If so, then it really should
> > get into 2.6.26, but this late in the release, I hope an s390 maintainer
> > might be able to test and verify the fix?
>
> I've done my best to combine mprotect, mmap and munmap a MAP_PRIVATE mapping
> on a xip file system. The system runs stable with and without this patch.
> Could someone please enlighten me on how to reproduce the problem so that I
> can verify the fix?

Though it would be more obvious to use a MAP_SHARED mapping, we had that
earlier thread in which it emerged that you're not using shared writable
xip mappings, IIRC. So, sticking to MAP_PRIVATE, I'd expect the following
sequence

ptr = mmap(NULL, PAGE_SIZE, PROT_READ, MAP_PRIVATE, xip_fd, 0);
var = *ptr;
mprotect(ptr, PAGE_SIZE, PROT_NONE);
munmap(ptr, PAGE_SIZE);

to do a put_page on a non-existent struct page derived from the pfn:
perhaps corrupting other memory without being noticed? Or if
you have CONFIG_DEBUG_VM=y (that would be a good move), to hit
vm_normal_page's VM_BUG_ON(!pfn_valid(pte_pfn(pte))) before that.

(I think you can just as well use PROT_READ|PROT_WRITE rather than
PROT_NONE there, but in principle mprotect could optimize away that
pte modification - though I think it goes ahead and does it anyway.)

Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo[at]vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Linux kernel RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact lists@gossamer-threads.com
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.