Mailing List Archive: Xen: Devel

Re: Bug#642154: BUG: unable to handle kernel paging request at ffff8803bb6ad000

 

 


ijc at hellion

Sep 20, 2011, 6:40 AM

Post #1 of 16
Re: Bug#642154: BUG: unable to handle kernel paging request at ffff8803bb6ad000

On Tue, 2011-09-20 at 14:20 +0100, Ben Hutchings wrote:
> On Tue, 2011-09-20 at 10:12 +0400, rush wrote:
> > Hi,
> >
> > There are several Not tainted lines in old messages file. There are all of them:
> >
> > Sep 10 22:35:33 xen-dom0 kernel: [24183.985513] Pid: 2605, comm:
> > debootstrap Not tainted 3.0.0-1-amd64 #1 Intel Corporation
> > S1200BTL/S1200BTL
> > Sep 10 22:35:33 xen-dom0 kernel: [24183.985621] RIP:
> > e030:[<ffffffff810106db>] [<ffffffff810106db>]
> > __sanitize_i387_state+0x23/0xe1
>
> Source/disassembly:
>
> void __sanitize_i387_state(struct task_struct *tsk)
> {
> u64 xstate_bv;
> int feature_bit = 0x2;
> struct i387_fxsave_struct *fx = &tsk->thread.fpu.state->fxsave;
> ffffffff810106b8: 48 8b 97 48 04 00 00 mov 0x448(%rdi),%rdx
>
> if (!fx)
> return;
> ffffffff810106bf: 48 85 d2 test %rdx,%rdx
> ffffffff810106c2: 0f 84 d0 00 00 00 je 0xffffffff81010798
>
> BUG_ON(task_thread_info(tsk)->status & TS_USEDFPU);
> ffffffff810106c8: 48 8b 47 08 mov 0x8(%rdi),%rax
> ffffffff810106cc: f6 40 14 01 testb $0x1,0x14(%rax)
> ffffffff810106d0: 74 02 je 0xffffffff810106d4
> ffffffff810106d2: 0f 0b ud2
>
> xstate_bv = tsk->thread.fpu.state->xsave.xsave_hdr.xstate_bv;
> ffffffff810106db: 48 8b b2 00 02 00 00 mov 0x200(%rdx),%rsi
>
> So tsk->thread.fpu.state in RDX seems to be invalid.
>
> > Sep 10 22:35:33 xen-dom0 kernel: [24183.985716] RSP:
> > e02b:ffff8803bd2c5e00 EFLAGS: 00010246
> > Sep 10 22:35:33 xen-dom0 kernel: [24183.985767] RAX: 0000000000000000
> > RBX: 00007fff3d69ecc0 RCX: 0000000000000200
> > Sep 10 22:35:33 xen-dom0 kernel: [24183.985824] RDX: ffff8803be0e8e00
> > RSI: ffff8803bd2c5fd8 RDI: ffff8803bd65aa30
> [...]
>
> RDX looks like a reasonable kernel memory pointer. Given the hostname,
> I assume this kernel is running under Xen. So could this be a
> use-after-free where the freed page has been unmapped for reallocation
> by the hypervisor? Can that happen to arbitrary pages in the dom0
> kernel?

In a modern pvops kernel there is a tendency towards leaving a page of
actual dom0 memory behind in these cases, rather than a hole. A page
with no backing mfn should never be escaping into the "wild" anyway but
it's possible for a given process to see one if it is doing hypercall
activities, mapping foreign pages etc.

There's been some similar looking threads on xen-devel recently but I
haven't paid attention to the details, list & Konrad CC'd. Full log is
at http://bugs.debian.org/642154.

>
> Ben.
>

--
Ian Campbell

Everybody has something to conceal.
-- Humphrey Bogart


_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xensource.com/xen-devel


konrad.wilk at oracle

Sep 22, 2011, 12:00 PM

Post #2 of 16
Re: Re: Bug#642154: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

> There's been some similar looking threads on xen-devel recently but I
> haven't paid attention to the details, list & Konrad CC'd. Full log is
> at http://bugs.debian.org/642154.

Does xsave=0 make a difference?



jrnieder at gmail

Sep 30, 2011, 7:50 PM

Post #3 of 16
Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

Konrad Rzeszutek Wilk wrote:

>> There's been some similar looking threads on xen-devel recently but I
>> haven't paid attention to the details, list & Konrad CC'd. Full log is
>> at http://bugs.debian.org/642154.
>
> Does xsave=0 make a difference?

Cc-ing the reporter. Rush, are you able to reproduce the oops you
mentioned? Does adding noxsave to the kernel command line help?

Thanks,
Jonathan



rush1503 at gmail

Oct 1, 2011, 12:01 AM

Post #4 of 16
Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

2011/10/1, Jonathan Nieder <jrnieder [at] gmail>:
>
> Cc-ing the reporter. Rush, are you able to reproduce the oops you
> mentioned? Does adding noxsave to the kernel command line help?
>

I'm sorry, I'm no guru in these matters.
Do I need to specify xsave=0 in the grub boot options, or is there another
way to do it?



ijc at hellion

Oct 1, 2011, 2:19 AM

Post #5 of 16
Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

On Sat, 2011-10-01 at 11:01 +0400, rush wrote:
> 2011/10/1, Jonathan Nieder <jrnieder [at] gmail>:
> >
> > Cc-ing the reporter. Rush, are you able to reproduce the oops you
> > mentioned? Does adding noxsave to the kernel command line help?
> >
>
> I'm sorry, i'm not guru in such questions.
> Do I need to specify xsave=0 in grub boot options or there is another
> way to do it?

Assuming your system uses grub2 you should add it to GRUB_CMDLINE_LINUX
in /etc/default/grub and then run "update-grub".

Ian.
--
Ian Campbell


Place stamp here.


rush1503 at gmail

Oct 1, 2011, 10:34 AM

Post #6 of 16
Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

Unfortunately, xsave=0 had no effect. The oops is still there.

[ 21.095558] BUG: unable to handle kernel paging request at ffff8803bb7c5000
[ 21.095827] IP: [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 21.096002] PGD 1604067 PUD 3d82d9067 PMD 3d84b5067 PTE 0
[ 21.096355] Oops: 0000 [#1] SMP
[ 21.096578] CPU 3
[ 21.096646] Modules linked in: bridge stp xen_evtchn xenfs loop
snd_pcm snd_timer snd i2c_i801 soundcore snd_page_alloc i2c_core
pcspkr evdev joydev ghes video hed button processor ext4 mbcache jbd2
crc16 dm_mod raid1 md_mod usbhid hid sg sd_mod crc_t10dif ahci libahci
libata scsi_mod ehci_hcd fan thermal usbcore thermal_sys e1000e [last
unloaded: scsi_wait_scan]
[ 21.099742]
[ 21.099835] Pid: 1207, comm: update-exim4.co Not tainted
3.0.0-1-amd64 #1 Intel Corporation S1200BTL/S1200BTL
[ 21.100169] RIP: e030:[<ffffffff810106db>] [<ffffffff810106db>]
__sanitize_i387_state+0x23/0xe1
[ 21.100376] RSP: e02b:ffff8803bbc77e00 EFLAGS: 00010246
[ 21.100481] RAX: 0000000000000000 RBX: 00007fffa6c18700 RCX: 0000000000000200
[ 21.100593] RDX: ffff8803bb7c4e00 RSI: ffff8803bbc77fd8 RDI: ffff8803bbd2ce20
[ 21.100705] RBP: ffff8803bbd2ce20 R08: dead000000200200 R09: ffff8803bda8c2d8
[ 21.100817] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 21.100929] R13: ffffffffffffffff R14: ffff8803bbd2ce20 R15: 00007fffa6c18700
[ 21.101043] FS: 00007f57dae43700(0000) GS:ffff8803d61a0000(0000)
knlGS:0000000000000000
[ 21.101185] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 21.101293] CR2: ffff8803bb7c5000 CR3: 00000003bdcce000 CR4: 0000000000002660
[ 21.101405] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 21.101516] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 21.101629] Process update-exim4.co (pid: 1207, threadinfo
ffff8803bbc76000, task ffff8803bbd2ce20)
[ 21.101774] Stack:
[ 21.101868] ffffffff81010919 0000000000000001 ffff8803bbc77f58
0000000000000011
[ 21.102254] ffff8803bbd2d2b0 ffffffffffffffff ffffffff81008fdd
00000000000002d8
[ 21.102642] 00007fffa6c18538 0000000000000011 0000000000040001
00000065000005d8
[ 21.103029] Call Trace:
[ 21.103128] [<ffffffff81010919>] ? save_i387_xstate+0x102/0x1f3
[ 21.103238] [<ffffffff81008fdd>] ? do_signal+0x212/0x649
[ 21.103346] [<ffffffff81009450>] ? do_notify_resume+0x25/0x6b
[ 21.103458] [<ffffffff8133bfe0>] ? int_signal+0x12/0x17
[ 21.103564] Code: e8 13 2a ff ff 66 90 c3 48 8b 97 48 04 00 00 48
85 d2 0f 84 d0 00 00 00 48 8b 47 08 f6 40 14 01 74 02 0f 0b 48 8b 05
45 4e 71 00
[ 21.106353] 8b b2 00 02 00 00 48 89 c1 48 21 f1 48 39 c1 0f 84 a7 00 00
[ 21.107848] RIP [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 21.108020] RSP <ffff8803bbc77e00>
[ 21.108119] CR2: ffff8803bb7c5000
[ 21.108218] ---[ end trace f589986fb387a3c2 ]---
[ 22.776339] BUG: unable to handle kernel paging request at ffff8803bb7c5000
[ 22.776579] IP: [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 22.776754] PGD 1604067 PUD 3d82d9067 PMD 3d84b5067 PTE 0
[ 22.777109] Oops: 0000 [#5] SMP
[ 22.777332] CPU 3
[ 22.777399] Modules linked in: bridge stp xen_evtchn xenfs loop
snd_pcm snd_timer snd i2c_i801 soundcore snd_page_alloc i2c_core
pcspkr evdev joydev ghes video hed button processor ext4 mbcache jbd2
crc16 dm_mod raid1 md_mod usbhid hid sg sd_mod crc_t10dif ahci libahci
libata scsi_mod ehci_hcd fan thermal usbcore thermal_sys e1000e [last
unloaded: scsi_wait_scan]
[ 22.780506]
[ 22.780600] Pid: 2070, comm: forks Tainted: G D
3.0.0-1-amd64 #1 Intel Corporation S1200BTL/S1200BTL
[ 22.780933] RIP: e030:[<ffffffff810106db>] [<ffffffff810106db>]
__sanitize_i387_state+0x23/0xe1
[ 22.781141] RSP: e02b:ffff8803bc403e00 EFLAGS: 00010246
[ 22.781247] RAX: 0000000000000000 RBX: 00007fff10bfcdc0 RCX: 0000000000000200
[ 22.781359] RDX: ffff8803bb7c4e00 RSI: ffff8803bc403fd8 RDI: ffff8803bbd2ce20
[ 22.781472] RBP: ffff8803bbd2ce20 R08: ffff8803bc402000 R09: ffffffff81684640
[ 22.781584] R10: 00007f2f327999d0 R11: 0000000000000246 R12: 0000000000000000
[ 22.781696] R13: ffffffffffffffff R14: ffff8803bbd2ce20 R15: 00007fff10bfcdc0
[ 22.781809] FS: 00007f2f32799700(0000) GS:ffff8803d61a0000(0000)
knlGS:0000000000000000
[ 22.781952] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 22.782059] CR2: ffff8803bb7c5000 CR3: 00000003b7d28000 CR4: 0000000000002660
[ 22.782170] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 22.782283] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 22.782395] Process forks (pid: 2070, threadinfo ffff8803bc402000,
task ffff8803bbd2ce20)
[ 22.782537] Stack:
[ 22.782634] ffffffff81010919 0000000000413201 ffff8803bc403f58
0000000000000011
[ 22.783021] ffff8803bbd2d2b0 ffffffffffffffff ffffffff81008fdd
0000000000000000
[ 22.783408] 00007fff10bfcbf8 0000000000000011 0000000000040001
0000fffe00000817
[ 22.783795] Call Trace:
[ 22.783895] [<ffffffff81010919>] ? save_i387_xstate+0x102/0x1f3
[ 22.784004] [<ffffffff81008fdd>] ? do_signal+0x212/0x649
[ 22.784112] [<ffffffff8133733a>] ? error_exit+0x2a/0x60
[ 22.784219] [<ffffffff81336e61>] ? retint_restore_args+0x5/0x6
[ 22.784328] [<ffffffff8100122a>] ? hypercall_page+0x22a/0x1000
[ 22.784436] [<ffffffff81009450>] ? do_notify_resume+0x25/0x6b
[ 22.784545] [<ffffffff8133bfe0>] ? int_signal+0x12/0x17
[ 22.784652] Code: e8 13 2a ff ff 66 90 c3 48 8b 97 48 04 00 00 48
85 d2 0f 84 d0 00 00 00 48 8b 47 08 f6 40 14 01 74 02 0f 0b 48 8b 05
45 4e 71 00
[ 22.787445] 8b b2 00 02 00 00 48 89 c1 48 21 f1 48 39 c1 0f 84 a7 00 00
[ 22.788942] RIP [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 22.789115] RSP <ffff8803bc403e00>
[ 22.789215] CR2: ffff8803bb7c5000
[ 22.789315] ---[ end trace f589986fb387a3c6 ]---

The grub boot options were:

menuentry 'Debian GNU/Linux, with Xen 4.0-amd64 and Linux
3.0.0-1-amd64' --class debian --class gnu-linux --class gnu --class os
--class xen {
insmod raid
insmod mdraid1x
insmod lvm
insmod part_msdos
insmod part_msdos
insmod ext2
set root='(xen-system)'
search --no-floppy --fs-uuid --set=root
709c172b-19b2-417d-8a43-e1957bcdc2f6
echo 'Loading Xen 4.0-amd64 ...'
multiboot /boot/xen-4.0-amd64.gz placeholder
echo 'Loading Linux 3.0.0-1-amd64 ...'
module /boot/vmlinuz-3.0.0-1-amd64 placeholder
root=/dev/mapper/xen-system ro xsave=0 quiet
echo 'Loading initial ramdisk ...'
module /boot/initrd.img-3.0.0-1-amd64
}

Rush.



konrad.wilk at oracle

Oct 3, 2011, 11:47 AM

Post #7 of 16
Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

> echo 'Loading Xen 4.0-amd64 ...'
> multiboot /boot/xen-4.0-amd64.gz placeholder

Oops. I meant to try it in the hypervisor - so right after placeholder add "xsave=0"

> echo 'Loading Linux 3.0.0-1-amd64 ...'
> module /boot/vmlinuz-3.0.0-1-amd64 placeholder
> root=/dev/mapper/xen-system ro xsave=0 quiet





ijc at hellion

Oct 3, 2011, 11:53 AM

Post #8 of 16
Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

On Mon, 2011-10-03 at 14:47 -0400, Konrad Rzeszutek Wilk wrote:
> > echo 'Loading Xen 4.0-amd64 ...'
> > multiboot /boot/xen-4.0-amd64.gz placeholder
>
> Oops. I meant to try it in the hypervisor - so right after placeholder add "xsave=0"

Which in grub2 means add GRUB_CMDLINE_XEN="xsave=0" to /etc/default/grub
(there is no commented out example in this case) and re-run update-grub.

Ian.

--
Ian Campbell


Many a bum show has been saved by the flag.
-- George M. Cohan


rush1503 at gmail

Oct 7, 2011, 11:13 PM

Post #9 of 16
Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

OK, I tried it again, but the oops didn't go away.

menuentry 'Debian GNU/Linux, with Xen 4.0-amd64 and Linux
3.0.0-1-amd64' --class debian --class gnu-linux --class gnu --class os
--class xen {
insmod raid
insmod mdraid1x
insmod lvm
insmod part_msdos
insmod part_msdos
insmod ext2
set root='(xen-system)'
search --no-floppy --fs-uuid --set=root
709c172b-19b2-417d-8a43-e1957bcdc2f6
echo 'Loading Xen 4.0-amd64 ...'
multiboot /boot/xen-4.0-amd64.gz placeholder xsave=0
echo 'Loading Linux 3.0.0-1-amd64 ...'
module /boot/vmlinuz-3.0.0-1-amd64 placeholder
root=/dev/mapper/xen-system ro quiet
echo 'Loading initial ramdisk ...'
module /boot/initrd.img-3.0.0-1-amd64
}

Was it right?

[ 24.242539] BUG: unable to handle kernel paging request at ffff8803be1ab000
[ 24.242780] IP: [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 24.242956] PGD 1604067 PUD 3d82d9067 PMD 3d84ca067 PTE 0
[ 24.243309] Oops: 0000 [#1] SMP
[ 24.243533] CPU 0
[ 24.243601] Modules linked in: xt_tcpudp xt_physdev iptable_filter
ip_tables x_tables xen_netback xen_blkback bridge stp xen_evtchn xenfs
loop snd_pcm snd_timer snd soundcore snd_page_alloc i2c_i801 joydev
evdev pcspkr ghes i2c_core video processor hed button ext4 mbcache
jbd2 crc16 dm_mod raid1 md_mod usbhid hid sg sd_mod crc_t10dif ahci
libahci libata scsi_mod ehci_hcd fan thermal usbcore thermal_sys
e1000e [last unloaded: scsi_wait_scan]
[ 24.247197]
[ 24.247291] Pid: 2526, comm: forks Not tainted 3.0.0-1-amd64 #1
Intel Corporation S1200BTL/S1200BTL
[ 24.247621] RIP: e030:[<ffffffff810106db>] [<ffffffff810106db>]
__sanitize_i387_state+0x23/0xe1
[ 24.247829] RSP: e02b:ffff88034862be00 EFLAGS: 00010246
[ 24.247935] RAX: 0000000000000000 RBX: 00007fff1755a8c0 RCX: 0000000000000200
[ 24.248047] RDX: ffff8803be1aae00 RSI: ffff88034862bfd8 RDI: ffff8803bbf55650
[ 24.248159] RBP: ffff8803bbf55650 R08: ffff88034862a000 R09: ffffffff81684640
[ 24.248271] R10: 00007fe2b7cd09d0 R11: 0000000000000246 R12: 0000000000000000
[ 24.248384] R13: ffffffffffffffff R14: ffff8803bbf55650 R15: 00007fff1755a8c0
[ 24.248498] FS: 00007fe2b7cd0700(0000) GS:ffff8803d614f000(0000)
knlGS:0000000000000000
[ 24.248641] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 24.248750] CR2: ffff8803be1ab000 CR3: 00000003bc23c000 CR4: 0000000000002660
[ 24.248862] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 24.248976] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 24.249088] Process forks (pid: 2526, threadinfo ffff88034862a000,
task ffff8803bbf55650)
[ 24.249231] Stack:
[ 24.249325] ffffffff81010919 0000000000413201 ffff88034862bf58
0000000000000011
[ 24.249715] ffff8803bbf55ae0 ffffffffffffffff ffffffff81008fdd
0000000000000000
[ 24.250101] 00007fff1755a6f8 0000000000000011 0000000000040001
0000fffe000009df
[ 24.250491] Call Trace:
[ 24.250589] [<ffffffff81010919>] ? save_i387_xstate+0x102/0x1f3
[ 24.250700] [<ffffffff81008fdd>] ? do_signal+0x212/0x649
[ 24.250810] [<ffffffff8133733a>] ? error_exit+0x2a/0x60
[ 24.250916] [<ffffffff81009450>] ? do_notify_resume+0x25/0x6b
[ 24.251027] [<ffffffff8133bfe0>] ? int_signal+0x12/0x17
[ 24.251132] Code: e8 13 2a ff ff 66 90 c3 48 8b 97 48 04 00 00 48
85 d2 0f 84 d0 00 00 00 48 8b 47 08 f6 40 14 01 74 02 0f 0b 48 8b 05
45 4e 71 00
[ 24.253911] 8b b2 00 02 00 00 48 89 c1 48 21 f1 48 39 c1 0f 84 a7 00 00
[ 24.255408] RIP [<ffffffff810106db>] __sanitize_i387_state+0x23/0xe1
[ 24.255581] RSP <ffff88034862be00>
[ 24.255680] CR2: ffff8803be1ab000
[ 24.255780] ---[ end trace e9c161e4e81bf087 ]---
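
A quick check of the register dumps in this thread, offered as a sketch rather than a diagnosis: in the traces from post #6 and from this post, CR2 is exactly RDX + 0x200, which is the displacement in the faulting `mov 0x200(%rdx),%rsi`, and CR2 is the first byte of the page following the one RDX points into. The 4 KiB page size and the helper names below are assumptions made for illustration; the addresses used in the note after the block are copied from the oopses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define PAGE_SIZE 4096ULL

/* Displacement taken from the faulting instruction: mov 0x200(%rdx),%rsi */
#define XSTATE_BV_OFFSET 0x200ULL

/* Hypothetical helper: is the fault address exactly base + 0x200? */
static bool fault_is_xstate_bv_read(uint64_t rdx, uint64_t cr2)
{
    return cr2 - rdx == XSTATE_BV_OFFSET;
}

/* Hypothetical helper: does the read land on a different page than the base? */
static bool fault_crosses_page(uint64_t rdx, uint64_t cr2)
{
    return rdx / PAGE_SIZE != cr2 / PAGE_SIZE;
}

/* Hypothetical helper: bytes between the base pointer and the end of its page. */
static uint64_t bytes_left_in_page(uint64_t rdx)
{
    return PAGE_SIZE - (rdx % PAGE_SIZE);
}
```

With RDX = ffff8803bb7c4e00 and CR2 = ffff8803bb7c5000 (post #6), or RDX = ffff8803be1aae00 and CR2 = ffff8803be1ab000 (this post), the helpers agree: only 0x200 (512) bytes remain in the page after the pointer, which is the size of the legacy FXSAVE area. Reading the xsave header at offset 0x200 therefore touches the following page, which the "PTE 0" line reports as not mapped. One possible reading is that the buffer was sized for fxsave only while the code path assumed an xsave layout.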

2011/10/3, Ian Campbell <ijc [at] hellion>:
> On Mon, 2011-10-03 at 14:47 -0400, Konrad Rzeszutek Wilk wrote:
>> > echo 'Loading Xen 4.0-amd64 ...'
>> > multiboot /boot/xen-4.0-amd64.gz placeholder
>>
>> Oops. I meant to try it in the hypervisor - so right after placeholder add
>> "xsave=0"
>
> Which in grub2 means add GRUB_CMDLINE_XEN="xsave=0" to /etc/default/grub
> (there is no commented out example in this case) and re-run update-grub.
>
> Ian.
>
> --
> Ian Campbell
>
>
> Many a bum show has been saved by the flag.
> -- George M. Cohan
>



konrad.wilk at oracle

Oct 10, 2011, 9:49 AM

Post #10 of 16
Re: Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

On Sat, Oct 08, 2011 at 10:13:14AM +0400, rush wrote:
> OK, I tried it again, but Oops didn't gone.
.. snip..
> echo 'Loading Xen 4.0-amd64 ...'
> multiboot /boot/xen-4.0-amd64.gz placeholder xsave=0
.. snip..
> Was it right?

Yup. I think.. this is a bit embarrassing. It took a bit of time for Intel
folks to get the xsave part right and I remember seeing this error about a
year ago with xsave on a Dell Optiplex 780. Hence I wonder if the fixes that
ultimately went in 4.1.1 did not get ported over to 4.0 and you are just
hitting that.

Can I ask you to do one more thing? Can you upgrade to the xen-4.1.1 in
testing and try with xsave (or without) and see if it works?

<holds his fingers hoping it is the xsave feature>



rush1503 at gmail

Oct 10, 2011, 2:11 PM

Post #11 of 16
Re: Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

2011/10/10, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle>:
>
> Can I ask you to do one more thing? Can you upgrade to the xen-4.1.1 in
> the testing and try with the xsave (or without) and see if it works?
>
Ok, but I need around a week for it. (some difficulties with access to
this server at the moment).

> <holds his fingers hoping it is the xsave feature>

Thank you (:



JBeulich at suse

Oct 11, 2011, 12:07 AM

Post #12 of 16
Re: Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

>>> On 10.10.11 at 18:49, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> On Sat, Oct 08, 2011 at 10:13:14AM +0400, rush wrote:
>> OK, I tried it again, but Oops didn't gone.
> .. snip..
>> echo 'Loading Xen 4.0-amd64 ...'
>> multiboot /boot/xen-4.0-amd64.gz placeholder xsave=0
> .. snip..
>> Was it right?
>
> Yup. I think.. this is a bit embarrassing. It took a bit of time for Intel
> folks to get the xsave part right and I remember seeing this error about a
> year ago with xsave on a Dell Optiplex 780. Hence I wonder if the fixes that
> ultimately went in 4.1.1 did not get ported over to 4.0 and you are just
> hitting that.
>
> Can I ask you to do one more thing? Can you upgrade to the xen-4.1.1 in
> the testing and try with the xsave (or without) and see if it works?
>
> <holds his fingers hoping it is the xsave feature>

Are both of you certain this isn't the problem of the kernel only
looking at the xsaveopt feature flag (implying that this means
xsave is also available)? I found it necessary to force-clear that
flag in the kernel when OSXSAVE is not set (by calling
x86_xsave_setup() when !cpu_has_xsave, which in turn was
modified to look at X86_FEATURE_OSXSAVE rather than
X86_FEATURE_XSAVE under Xen - all of which I'm afraid would
need to be done differently in pv-ops).

If it is, the problem could be worked around by *en*abling xsave
in Xen (which is off by default prior to 4.2), assuming none of the
incomplete functionality would cause other headaches.

But yes, the CPUID handling code in 4.1.1 should properly hide
XSAVEOPT when XSAVE is disabled, so just using this version
ought to also get things going.

Jan




ijc at hellion

Oct 11, 2011, 1:02 AM

Post #13 of 16
Re: Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

On Tue, 2011-10-11 at 08:07 +0100, Jan Beulich wrote:
> >>> On 10.10.11 at 18:49, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> > On Sat, Oct 08, 2011 at 10:13:14AM +0400, rush wrote:
> >> OK, I tried it again, but Oops didn't gone.
> > .. snip..
> >> echo 'Loading Xen 4.0-amd64 ...'
> >> multiboot /boot/xen-4.0-amd64.gz placeholder xsave=0
> > .. snip..
> >> Was it right?
> >
> > Yup. I think.. this is a bit embarrassing. It took a bit of time for Intel
> > folks to get the xsave part right and I remember seeing this error about a
> > year ago with xsave on a Dell Optiplex 780. Hence I wonder if the fixes that
> > ultimately went in 4.1.1 did not get ported over to 4.0 and you are just
> > hitting that.
> >
> > Can I ask you to do one more thing? Can you upgrade to the xen-4.1.1 in
> > the testing and try with the xsave (or without) and see if it works?
> >
> > <holds his fingers hoping it is the xsave feature>
>
> Are both of you certain this isn't the problem of the kernel only
> looking at the xsaveopt feature flag (implying that this means
> xsave is also available)? I found it necessary to force-clear that
> flag in the kernel when OSXSAVE is not set (by calling
> x86_xsave_setup() when !cpu_has_xsave, which in turn was
> modified to look at X86_FEATURE_OSXSAVE rather than
> X86_FEATURE_XSAVE under Xen - all of which I'm afraid would
> need to be done differently in pv-ops).

That all sounds familiar... In mainline we have (in
xen_init_cpuid_mask):

...
xsave_mask =
(1 << (X86_FEATURE_XSAVE % 32)) |
(1 << (X86_FEATURE_OSXSAVE % 32));

/* Xen will set CR4.OSXSAVE if supported and not disabled by force */
if ((cx & xsave_mask) != xsave_mask)
cpuid_leaf1_ecx_mask &= ~xsave_mask; /* disable XSAVE & OSXSAVE */

Which I think implements something similar to what you describe? IOW
unless both XSAVE and OSXSAVE are available both are forcibly disabled.
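
To make the effect of the check quoted above concrete, here is a minimal stand-alone sketch. It assumes the Linux cpufeature numbering in which word 4 is CPUID leaf 1 ECX, so XSAVE is bit 26 and OSXSAVE is bit 27. The function name `apply_xsave_mask` and its signature are hypothetical; the kernel code operates on a global mask rather than returning one.

```c
#include <stdint.h>

/* Linux cpufeature numbering: word 4 corresponds to CPUID leaf 1 ECX. */
#define X86_FEATURE_XSAVE   (4 * 32 + 26)
#define X86_FEATURE_OSXSAVE (4 * 32 + 27)

/*
 * Hypothetical stand-alone version of the quoted check.
 * cx   : the ECX value returned by CPUID leaf 1
 * mask : the current leaf 1 ECX mask
 * Returns the mask to use afterwards.
 */
static uint32_t apply_xsave_mask(uint32_t cx, uint32_t mask)
{
    const uint32_t xsave_mask =
        (1u << (X86_FEATURE_XSAVE % 32)) |
        (1u << (X86_FEATURE_OSXSAVE % 32));

    /* Unless both bits are set, hide both of them from the kernel. */
    if ((cx & xsave_mask) != xsave_mask)
        mask &= ~xsave_mask;

    return mask;
}
```

If the hypervisor exposes XSAVE but has not set OSXSAVE (cx has only bit 26 set), both bits are cleared from the mask. Nothing in this check touches any other feature flag.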

While grepping I noticed that the kernel command line parameter to
disable xsave appears to be "noxsave" rather than "xsave=0", Rush is
that something you could try? (GRUB_CMDLINE_LINUX is the place to add
it)

Ian.

> If it is, the problem could be worked around by *en*abling xsave
> in Xen (which is off by default prior to 4.2), assuming none of the
> incomplete functionality would cause other headaches.
>
> But yes, the CPUID handling code in 4.1.1 should properly hide
> XSAVEOPT when XSAVE is disabled, so just using this version
> ought to also get things going.
>
> Jan
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel [at] lists
> http://lists.xensource.com/xen-devel

--
Ian Campbell
Current Noise: Zyklon - Hammer Revelation

The ultimate game show will be the one where somebody gets killed at the end.
-- Chuck Barris, creator of "The Gong Show"




JBeulich at suse

Oct 11, 2011, 1:36 AM

Post #14 of 16
Re: Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

>>> On 11.10.11 at 10:02, Ian Campbell <ijc [at] hellion> wrote:
> On Tue, 2011-10-11 at 08:07 +0100, Jan Beulich wrote:
>> >>> On 10.10.11 at 18:49, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
>> > On Sat, Oct 08, 2011 at 10:13:14AM +0400, rush wrote:
>> >> OK, I tried it again, but Oops didn't gone.
>> > .. snip..
>> >> echo 'Loading Xen 4.0-amd64 ...'
>> >> multiboot /boot/xen-4.0-amd64.gz placeholder xsave=0
>> > .. snip..
>> >> Was it right?
>> >
>> > Yup. I think.. this is a bit embarrassing. It took a bit of time for Intel
>> > folks to get the xsave part right and I remember seeing this error about a
>> > year ago with xsave on a Dell Optiplex 780. Hence I wonder if the fixes
> that
>> > ultimately went in 4.1.1 did not get ported over to 4.0 and you are just
>> > hitting that.
>> >
>> > Can I ask you to do one more thing? Can you upgrade to the xen-4.1.1 in
>> > the testing and try with the xsave (or without) and see if it works?
>> >
>> > <holds his fingers hoping it is the xsave feature>
>>
>> Are both of you certain this isn't the problem of the kernel only
>> looking at the xsaveopt feature flag (implying that this means
>> xsave is also available)? I found it necessary to force-clear that
>> flag in the kernel when OSXSAVE is not set (by calling
>> x86_xsave_setup() when !cpu_has_xsave, which in turn was
>> modified to look at X86_FEATURE_OSXSAVE rather than
>> X86_FEATURE_XSAVE under Xen - all of which I'm afraid would
>> need to be done differently in pv-ops).
>
> That all sounds familiar... In mainline we have (in
> xen_init_cpuid_mask):
>
> ...
> xsave_mask =
> (1 << (X86_FEATURE_XSAVE % 32)) |
> (1 << (X86_FEATURE_OSXSAVE % 32));
>
> /* Xen will set CR4.OSXSAVE if supported and not disabled by force */
> if ((cx & xsave_mask) != xsave_mask)
> cpuid_leaf1_ecx_mask &= ~xsave_mask; /* disable XSAVE & OSXSAVE */
>
> Which I think implements something similar to what you describe? IOW
> unless both XSAVE and OSXSAVE are available both are forcibly disabled.

Apart from the need to disable XSAVEOPT, yes.

> While grepping I noticed that the kernel command line parameter to
> disable xsave appears to be "noxsave" rather than "xsave=0", Rush is
> that something you could try? (GRUB_CMDLINE_LINUX is the place to add
> it)

Or "noxsaveopt" (if that's the problem, i.e. Rush's CPUs have that
capability).

Jan

> Ian.
>
>> If it is, the problem could be worked around by *en*abling xsave
>> in Xen (which is off by default prior to 4.2), assuming none of the
>> incomplete functionality would cause other headaches.
>>
>> But yes, the CPUID handling code in 4.1.1 should properly hide
>> XSAVEOPT when XSAVE is disabled, so just using this version
>> ought to also get things going.
>>
>> Jan
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel [at] lists
>> http://lists.xensource.com/xen-devel
>
> --
> Ian Campbell
> Current Noise: Zyklon - Hammer Revelation
>
> The ultimate game show will be the one where somebody gets killed at the
> end.
> -- Chuck Barris, creator of "The Gong Show"






ijc at hellion

Oct 11, 2011, 1:43 AM

Post #15 of 16
Re: Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

On Tue, 2011-10-11 at 09:36 +0100, Jan Beulich wrote:
> >>> On 11.10.11 at 10:02, Ian Campbell <ijc [at] hellion> wrote:
> > On Tue, 2011-10-11 at 08:07 +0100, Jan Beulich wrote:
> >> >>> On 10.10.11 at 18:49, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> >> > On Sat, Oct 08, 2011 at 10:13:14AM +0400, rush wrote:
> >> >> OK, I tried it again, but Oops didn't gone.
> >> > .. snip..
> >> >> echo 'Loading Xen 4.0-amd64 ...'
> >> >> multiboot /boot/xen-4.0-amd64.gz placeholder xsave=0
> >> > .. snip..
> >> >> Was it right?
> >> >
> >> > Yup. I think.. this is a bit embarrassing. It took a bit of time for Intel
> >> > folks to get the xsave part right and I remember seeing this error about a
> >> > year ago with xsave on a Dell Optiplex 780. Hence I wonder if the fixes
> > that
> >> > ultimately went in 4.1.1 did not get ported over to 4.0 and you are just
> >> > hitting that.
> >> >
> >> > Can I ask you to do one more thing? Can you upgrade to the xen-4.1.1 in
> >> > the testing and try with the xsave (or without) and see if it works?
> >> >
> >> > <holds his fingers hoping it is the xsave feature>
> >>
> >> Are both of you certain this isn't the problem of the kernel only
> >> looking at the xsaveopt feature flag (implying that this means
> >> xsave is also available)? I found it necessary to force-clear that
> >> flag in the kernel when OSXSAVE is not set (by calling
> >> x86_xsave_setup() when !cpu_has_xsave, which in turn was
> >> modified to look at X86_FEATURE_OSXSAVE rather than
> >> X86_FEATURE_XSAVE under Xen - all of which I'm afraid would
> >> need to be done differently in pv-ops).
> >
> > That all sounds familiar... In mainline we have (in
> > xen_init_cpuid_mask):
> >
> > ...
> > xsave_mask =
> > (1 << (X86_FEATURE_XSAVE % 32)) |
> > (1 << (X86_FEATURE_OSXSAVE % 32));
> >
> > /* Xen will set CR4.OSXSAVE if supported and not disabled by force */
> > if ((cx & xsave_mask) != xsave_mask)
> > cpuid_leaf1_ecx_mask &= ~xsave_mask; /* disable XSAVE & OSXSAVE */
> >
> > Which I think implements something similar to what you describe? IOW
> > unless both XSAVE and OSXSAVE are available both are forcibly disabled.
>
> Apart from the need to disable XSAVEOPT, yes.

Oh, right, I hadn't noticed it was a different/third flag.

> > While grepping I noticed that the kernel command line parameter to
> > disable xsave appears to be "noxsave" rather than "xsave=0", Rush is
> > that something you could try? (GRUB_CMDLINE_LINUX is the place to add
> > it)
>
> Or "noxsaveopt" (if that's the problem, i.e. Rush's CPUs have that
> capability).

Right, Rush can you try both "noxsave" and "noxsaveopt" independently
please. If those work then we need to update the above logic to mask
xsaveopt as well.
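
As a rough illustration of what "mask xsaveopt as well" could look like, here is a sketch under stated assumptions. It is not the patch that was eventually written, and every function and macro name in it is hypothetical. It relies only on the architectural bit positions: XSAVE and OSXSAVE are bits 26 and 27 of CPUID leaf 1 ECX, and XSAVEOPT is bit 0 of EAX in CPUID leaf 0xD, sub-leaf 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, ECX */
#define CPUID1_ECX_XSAVE      (1u << 26)
#define CPUID1_ECX_OSXSAVE    (1u << 27)

/* CPUID leaf 0xD, sub-leaf 1, EAX */
#define CPUID_D1_EAX_XSAVEOPT (1u << 0)

/* Hypothetical helper: xsave is usable only if both bits are visible. */
static bool xsave_usable(uint32_t leaf1_ecx)
{
    const uint32_t need = CPUID1_ECX_XSAVE | CPUID1_ECX_OSXSAVE;

    return (leaf1_ecx & need) == need;
}

/*
 * Hypothetical filter: if xsave is not usable, also hide XSAVEOPT so
 * that code testing only the xsaveopt flag cannot take the xsave path.
 */
static uint32_t filter_leaf_d1_eax(uint32_t leaf1_ecx, uint32_t leaf_d1_eax)
{
    if (!xsave_usable(leaf1_ecx))
        leaf_d1_eax &= ~CPUID_D1_EAX_XSAVEOPT;

    return leaf_d1_eax;
}
```

The point of the design is that the third flag follows the first two: whenever XSAVE or OSXSAVE is hidden, XSAVEOPT is hidden too, which is the condition Jan describes as missing from the mainline check.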

Thanks,
Ian.

>
> Jan
>
> > Ian.
> >
> >> If it is, the problem could be worked around by *en*abling xsave
> >> in Xen (which is off by default prior to 4.2), assuming none of the
> >> incomplete functionality would cause other headaches.
> >>
> >> But yes, the CPUID handling code in 4.1.1 should properly hide
> >> XSAVEOPT when XSAVE is disabled, so just using this version
> >> ought to also get things going.
> >>
> >> Jan
> >>
> >>
> >> _______________________________________________
> >> Xen-devel mailing list
> >> Xen-devel [at] lists
> >> http://lists.xensource.com/xen-devel
> >
> > --
> > Ian Campbell
> > Current Noise: Zyklon - Hammer Revelation
> >
> > The ultimate game show will be the one where somebody gets killed at the
> > end.
> > -- Chuck Barris, creator of "The Gong Show"
>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel [at] lists
> http://lists.xensource.com/xen-devel

--
Ian Campbell
Current Noise: Zyklon - Transcendental War - Battle Between Gods

If you tell the truth you don't have to remember anything.
-- Mark Twain




xen.list at daevel

Mar 6, 2012, 12:22 PM

Post #16 of 16
Re: BUG: unable to handle kernel paging request at ffff8803bb6ad000 [In reply to]

On 11/10/2011 10:43, Ian Campbell wrote:
> On Tue, 2011-10-11 at 09:36 +0100, Jan Beulich wrote:
>>>>> On 11.10.11 at 10:02, Ian Campbell<ijc [at] hellion> wrote:
>>> On Tue, 2011-10-11 at 08:07 +0100, Jan Beulich wrote:
>>>>>>> On 10.10.11 at 18:49, Konrad Rzeszutek Wilk<konrad.wilk [at] oracle> wrote:
>>>>> On Sat, Oct 08, 2011 at 10:13:14AM +0400, rush wrote:
>>>>>> OK, I tried it again, but Oops didn't gone.
>>>>> .. snip..
>>>>>> echo 'Loading Xen 4.0-amd64 ...'
>>>>>> multiboot /boot/xen-4.0-amd64.gz placeholder xsave=0
>>>>> .. snip..
>>>>>> Was it right?
>>>>>
>>>>> Yup. I think.. this is a bit embarrassing. It took a bit of time for Intel
>>>>> folks to get the xsave part right and I remember seeing this error about a
>>>>> year ago with xsave on a Dell Optiplex 780. Hence I wonder if the fixes
>>> that
>>>>> ultimately went in 4.1.1 did not get ported over to 4.0 and you are just
>>>>> hitting that.
>>>>>
>>>>> Can I ask you to do one more thing? Can you upgrade to the xen-4.1.1 in
>>>>> the testing and try with the xsave (or without) and see if it works?
>>>>>
>>>>> <holds his fingers hoping it is the xsave feature>
>>>>
>>>> Are both of you certain this isn't the problem of the kernel only
>>>> looking at the xsaveopt feature flag (implying that this means
>>>> xsave is also available)? I found it necessary to force-clear that
>>>> flag in the kernel when OSXSAVE is not set (by calling
>>>> x86_xsave_setup() when !cpu_has_xsave, which in turn was
>>>> modified to look at X86_FEATURE_OSXSAVE rather than
>>>> X86_FEATURE_XSAVE under Xen - all of which I'm afraid would
>>>> need to be done differently in pv-ops).
>>>
>>> That all sounds familiar... In mainline we have (in
>>> xen_init_cpuid_mask):
>>>
>>> ...
>>> xsave_mask =
>>> (1 << (X86_FEATURE_XSAVE % 32)) |
>>> (1 << (X86_FEATURE_OSXSAVE % 32));
>>>
>>> /* Xen will set CR4.OSXSAVE if supported and not disabled by force */
>>> if ((cx & xsave_mask) != xsave_mask)
>>> cpuid_leaf1_ecx_mask &= ~xsave_mask; /* disable XSAVE & OSXSAVE */
>>>
>>> Which I think implements something similar to what you describe? IOW
>>> unless both XSAVE and OSXSAVE are available both are forcibly disabled.
>>
>> Apart from the need to disable XSAVEOPT, yes.
>
> Oh, right, I hadn't noticed it was a different/third flag.
>
>>> While grepping I noticed that the kernel command line parameter to
>>> disable xsave appears to be "noxsave" rather than "xsave=0", Rush is
>>> that something you could try? (GRUB_CMDLINE_LINUX is the place to add
>>> it)
>>
>> Or "noxsaveopt" (if that's the problem, i.e. Rush's CPUs have that
>> capability).
>
> Right, Rush can you try both "noxsave" and "noxsaveopt" independently
> please. If those work then we need to update the above logic to mask
> xsaveopt as well.
>
> Thanks,
> Ian.
>


For the record, I see the same problem here with Xen 4.0 and an Intel Xeon CPU E31220 with "microcode 0x14"
(and the problem doesn't exist with a CPU E31220 without that microcode).


The xen parameter "noxsaveopt" solved it.

Olivier

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel
