
Mailing List Archive: Xen: Devel

Xen 4.0 crashes with pvops kernel

 

 



cris.daniluk at gmail

Jun 14, 2010, 3:10 PM

Post #1 of 23
Xen 4.0 crashes with pvops kernel

Hi,

Reposting this to the xen-devel list after a suggestion by Boris Derzhavets.

I'm trying to get Xen 4.0 going with a pvops-enabled kernel on an IBM
x3500 7977 server. I've tried several different distros, including
CentOS 5.5, RHEL6 beta, FC12 and FC13. In each of them, I can run a
Xenlinux (2.6.18) kernel, including the Xen-enabled distro kernels in
CentOS 5.5 and FC12. However, if I try to run a pvops kernel, I get a
panic. The CPUs are detected fine, but it seems to have trouble
shortly thereafter.

I can boot the pvops-enabled kernel directly and everything works
fine. I only have trouble when booting it as a dom0. I've got two
identical servers and it is a problem on both, so I don't think
there's bad RAM. I also tried this with the latest 4.0-testing branch
and had the same experience.

Here's my console output from FC12 with a 2.6.32.15-compiled kernel.
Please let me know what additional debugging info is needed.

ACPI: bus type pci registered
PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 27
PCI: MCFG area at e0000000 reserved in E820
PCI: Using MMCONFIG at e0000000 - e1bfffff
PCI: Using configuration type 1 for base access
bio: create slab <bio-0> at 0
ERROR: Unable to locate IOAPIC for GSI 9
ACPI: Interpreter enabled
ACPI: (supports S0 S4 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: No dock devices found.
(XEN) mm.c:797:d0 Non-privileged (0) attempt to map I/O space 000fec80
BUG: unable to handle kernel paging request at ffffc900001b0000
IP: [<ffffffff81281df4>] acpi_ex_system_memory_space_handler+0x1c6/0x1e6
PGD 3fd5a067 PUD 3fd5b067 PMD 3fd5c067 PTE 0
Oops: 0002 [#1] SMP
last sysfs file:
CPU 3
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.32.15 #1 IBM eServer x3500-[7977AC1]-
RIP: e030:[<ffffffff81281df4>] [<ffffffff81281df4>] acpi_ex_system_memory_space_handler+0x1c6/0x1e6
RSP: e02b:ffff88003ee876c0 EFLAGS: 00010246
RAX: 000000000000002e RBX: ffff88003efc5880 RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffffffff81228a14 RDI: 80000000fec80273
RBP: ffff88003ee87700 R08: ffff880002697220 R09: 0000000000000100
R10: 0000000000000001 R11: ffffea0000dc7708 R12: ffffc900001b0000
R13: 0000000000000000 R14: 0000000000000020 R15: ffff88003ee87848
FS: 0000000000000000(0000) GS:ffff880002685000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffc900001b0000 CR3: 0000000001001000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff88003ee86000, task ffff88003ee88000)
Stack:
ffff88003ee876f0 0000000000000100 ffff880000000000 ffff88003fdeeea0
<0> ffffffff81281c2e ffff88003ee11ea0 ffff88003fdeef78 0000000000000000
<0> ffff88003ee87770 ffffffff8127a40e ffff88003ee87720 ffffffff810380f5
Call Trace:
[<ffffffff81281c2e>] ? acpi_ex_system_memory_space_handler+0x0/0x1e6
[<ffffffff8127a40e>] acpi_ev_address_space_dispatch+0x170/0x1be
[<ffffffff810380f5>] ? ioremap_nocache+0x17/0x19
[<ffffffff8127f033>] acpi_ex_access_region+0x235/0x242
[<ffffffff8127a40e>] ? acpi_ev_address_space_dispatch+0x170/0x1be
[<ffffffff8100ee7d>] ? xen_force_evtchn_callback+0xd/0xf
[<ffffffff8127f137>] acpi_ex_field_datum_io+0xf7/0x189
[<ffffffff8100f5ff>] ? xen_restore_fl_direct_end+0x0/0x1
[<ffffffff8127f425>] acpi_ex_write_with_update_rule+0xb5/0xc0
[<ffffffff8127f5ee>] acpi_ex_insert_into_field+0x1be/0x1e0
[<ffffffff8100f5ff>] ? xen_restore_fl_direct_end+0x0/0x1
[<ffffffff8127dab0>] acpi_ex_write_data_to_field+0x1a4/0x1c2
[<ffffffff8128fb5c>] ? acpi_ut_allocate_object_desc_dbg+0x40/0x78
[<ffffffff81281eb7>] acpi_ex_store_object_to_node+0xa3/0xe6
[<ffffffff81278785>] ? acpi_ds_create_operand+0x1f7/0x20a
[<ffffffff812820a6>] acpi_ex_store+0xc3/0x255
[<ffffffff8127fe88>] acpi_ex_opcode_1A_1T_1R+0x361/0x4bc
[<ffffffff812806f2>] ? acpi_ex_resolve_operands+0x1f2/0x4d4
[<ffffffff812773e3>] acpi_ds_exec_end_op+0xef/0x3dc
[<ffffffff81289b9e>] acpi_ps_parse_loop+0x7c0/0x946
[<ffffffff81288c88>] acpi_ps_parse_aml+0x9f/0x2de
[<ffffffff8128a42c>] acpi_ps_execute_method+0x1e9/0x2b9
[<ffffffff8128598a>] acpi_ns_evaluate+0xe6/0x1ad
[<ffffffff8128d957>] acpi_ut_evaluate_object+0xb7/0x1e0
[<ffffffff8100f5ff>] ? xen_restore_fl_direct_end+0x0/0x1
[<ffffffff8128b680>] acpi_rs_get_method_data+0x1f/0x45
[<ffffffff81271e27>] ? get_root_bridge_busnr_callback+0x0/0x40
[<ffffffff8128af7a>] acpi_walk_resources+0x56/0xc9
[<ffffffff81457c63>] acpi_pci_root_add+0x70/0x273
[<ffffffff8126deed>] acpi_device_probe+0x50/0x122
[<ffffffff812ecb1a>] driver_probe_device+0xea/0x217
[<ffffffff812ecca4>] __driver_attach+0x5d/0x81
[<ffffffff812ecc47>] ? __driver_attach+0x0/0x81
[<ffffffff812ebfbe>] bus_for_each_dev+0x53/0x88
[<ffffffff812ec8aa>] driver_attach+0x1e/0x20
[<ffffffff812ec4e9>] bus_add_driver+0xd5/0x23c
[<ffffffff812ecfa4>] driver_register+0x9d/0x10e
[<ffffffff81863968>] ? acpi_pci_root_init+0x0/0x28
[<ffffffff8126e9f2>] acpi_bus_register_driver+0x43/0x45
[<ffffffff81863981>] acpi_pci_root_init+0x19/0x28
[<ffffffff8100a069>] do_one_initcall+0x5e/0x159
[<ffffffff818366bc>] kernel_init+0x165/0x1bf
[<ffffffff81013d2a>] child_rip+0xa/0x20
[<ffffffff81012f11>] ? int_ret_from_sys_call+0x7/0x1b
[<ffffffff8101369d>] ? retint_restore_args+0x5/0x6
[<ffffffff81013d20>] ? child_rip+0x0/0x20
Code: 83 fe 08 75 33 eb 0e 41 83 fe 20 74 1b 41 83 fe 40 75 25 eb 1c 49 8b 07 41 88 04 24 eb 1a 49 8b 07 66 41 89 04 24 eb 10 49 8b 07 <41> 89 04 24 eb 07 49 8b 07 49 89 04 24 31 c0 48 83 c4 18 5b 41
RIP [<ffffffff81281df4>] acpi_ex_system_memory_space_handler+0x1c6/0x1e6
RSP <ffff88003ee876c0>
CR2: ffffc900001b0000
---[ end trace a22d306b065d4a66 ]---
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: swapper Tainted: G D 2.6.32.15 #1
Call Trace:
[<ffffffff81467068>] panic+0x7a/0x133
[<ffffffff810624f6>] ? exit_ptrace+0xa1/0x121
[<ffffffff8105afdd>] do_exit+0x7a/0x6d3
[<ffffffff8146a2df>] oops_end+0xbf/0xc7
[<ffffffff81037831>] no_context+0x1f3/0x202
[<ffffffff8100ed11>] ? xen_set_pte_at+0x37/0x109
[<ffffffff810379bd>] __bad_area_nosemaphore+0x17d/0x1a0
[<ffffffff8100c7bd>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[<ffffffff810379f3>] bad_area_nosemaphore+0x13/0x15
[<ffffffff8146b75e>] do_page_fault+0x14f/0x2a0
[<ffffffff81469775>] page_fault+0x25/0x30
[<ffffffff81228a14>] ? rb_insert_color+0xbc/0xe5
[<ffffffff81281df4>] ? acpi_ex_system_memory_space_handler+0x1c6/0x1e6
[<ffffffff81281c2e>] ? acpi_ex_system_memory_space_handler+0x0/0x1e6
[<ffffffff8127a40e>] acpi_ev_address_space_dispatch+0x170/0x1be
[<ffffffff810380f5>] ? ioremap_nocache+0x17/0x19
[<ffffffff8127f033>] acpi_ex_access_region+0x235/0x242
[<ffffffff8127a40e>] ? acpi_ev_address_space_dispatch+0x170/0x1be
[<ffffffff8100ee7d>] ? xen_force_evtchn_callback+0xd/0xf
[<ffffffff8127f137>] acpi_ex_field_datum_io+0xf7/0x189
[<ffffffff8100f5ff>] ? xen_restore_fl_direct_end+0x0/0x1
[<ffffffff8127f425>] acpi_ex_write_with_update_rule+0xb5/0xc0
[<ffffffff8127f5ee>] acpi_ex_insert_into_field+0x1be/0x1e0
[<ffffffff8100f5ff>] ? xen_restore_fl_direct_end+0x0/0x1
[<ffffffff8127dab0>] acpi_ex_write_data_to_field+0x1a4/0x1c2
[<ffffffff8128fb5c>] ? acpi_ut_allocate_object_desc_dbg+0x40/0x78
[<ffffffff81281eb7>] acpi_ex_store_object_to_node+0xa3/0xe6
[<ffffffff81278785>] ? acpi_ds_create_operand+0x1f7/0x20a
[<ffffffff812820a6>] acpi_ex_store+0xc3/0x255
[<ffffffff8127fe88>] acpi_ex_opcode_1A_1T_1R+0x361/0x4bc
[<ffffffff812806f2>] ? acpi_ex_resolve_operands+0x1f2/0x4d4
[<ffffffff812773e3>] acpi_ds_exec_end_op+0xef/0x3dc
[<ffffffff81289b9e>] acpi_ps_parse_loop+0x7c0/0x946
[<ffffffff81288c88>] acpi_ps_parse_aml+0x9f/0x2de
[<ffffffff8128a42c>] acpi_ps_execute_method+0x1e9/0x2b9
[<ffffffff8128598a>] acpi_ns_evaluate+0xe6/0x1ad
[<ffffffff8128d957>] acpi_ut_evaluate_object+0xb7/0x1e0
[<ffffffff8100f5ff>] ? xen_restore_fl_direct_end+0x0/0x1
[<ffffffff8128b680>] acpi_rs_get_method_data+0x1f/0x45
[<ffffffff81271e27>] ? get_root_bridge_busnr_callback+0x0/0x40
[<ffffffff8128af7a>] acpi_walk_resources+0x56/0xc9
[<ffffffff81457c63>] acpi_pci_root_add+0x70/0x273
[<ffffffff8126deed>] acpi_device_probe+0x50/0x122
[<ffffffff812ecb1a>] driver_probe_device+0xea/0x217
[<ffffffff812ecca4>] __driver_attach+0x5d/0x81
[<ffffffff812ecc47>] ? __driver_attach+0x0/0x81
[<ffffffff812ebfbe>] bus_for_each_dev+0x53/0x88
[<ffffffff812ec8aa>] driver_attach+0x1e/0x20
[<ffffffff812ec4e9>] bus_add_driver+0xd5/0x23c
[<ffffffff812ecfa4>] driver_register+0x9d/0x10e
[<ffffffff81863968>] ? acpi_pci_root_init+0x0/0x28
[<ffffffff8126e9f2>] acpi_bus_register_driver+0x43/0x45
[<ffffffff81863981>] acpi_pci_root_init+0x19/0x28
[<ffffffff8100a069>] do_one_initcall+0x5e/0x159
[<ffffffff818366bc>] kernel_init+0x165/0x1bf
[<ffffffff81013d2a>] child_rip+0xa/0x20
[<ffffffff81012f11>] ? int_ret_from_sys_call+0x7/0x1b
[<ffffffff8101369d>] ? retint_restore_args+0x5/0x6
[<ffffffff81013d20>] ? child_rip+0x0/0x20

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xensource.com/xen-devel


JBeulich at novell

Jun 14, 2010, 11:54 PM

Post #2 of 23
Re: Xen 4.0 crashes with pvops kernel

>>> On 15.06.10 at 00:10, Cris Daniluk <cris.daniluk [at] gmail> wrote:
> Here's my console output from FC12 with a 2.6.32.15-compiled kernel.
> Please let me know what additional debugging info is needed.

You cut off too much from the console output.

> ACPI: bus type pci registered
> PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 27
> PCI: MCFG area at e0000000 reserved in E820
> PCI: Using MMCONFIG at e0000000 - e1bfffff
> PCI: Using configuration type 1 for base access
> bio: create slab <bio-0> at 0
> ERROR: Unable to locate IOAPIC for GSI 9
> ACPI: Interpreter enabled
> ACPI: (supports S0 S4 S5)
> ACPI: Using IOAPIC for interrupt routing
> ACPI: No dock devices found.
> (XEN) mm.c:797:d0 Non-privileged (0) attempt to map I/O space 000fec80

Specifically, it is of great interest to understand what hardware lives
at this page. The earlier output (including Xen's) may help, or you may
need to find out under a native kernel. I would guess that it's a
secondary IO-APIC that occupies this space. If so, ACPI's DSDT and SSDT
might also be needed to understand where the access originates from
(there are quite a number of BIOSes where ACPI objects' methods access
the IO-APIC space, which no guest kernel, including Dom0, is permitted
to access).

> BUG: unable to handle kernel paging request at ffffc900001b0000

This, finally, would hint at missing error checking in the code
paths involved (i.e. a non-Xen specific issue that however doesn't
normally trigger in a native kernel). Fixing this may be the only
thing that helps, short of granting Dom0 access to the IO-APIC
memory space.

Jan


JBeulich at novell

Jun 15, 2010, 5:50 AM

Post #3 of 23
Re: Xen 4.0 crashes with pvops kernel

>>> On 15.06.10 at 14:35, Cris Daniluk <cris.daniluk [at] gmail> wrote:
> (XEN) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
> (XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
> (XEN) ACPI: IOAPIC (id[0x03] address[0xfec80000] gsi_base[24])
> (XEN) IOAPIC[1]: apic_id 3, version 32, address 0xfec80000, GSI 24-47

As suspected, there is an IO-APIC at that address, which Dom0 must
not access.

Booting with ACPI debugging enabled won't get you the needed
information; you'd rather need to use the acpidump utility to obtain
blobs containing the various ACPI tables, out of which the DSDT
(and maybe SSDTs) are what likely contain the problematic uses.

But even if we verify that the references come from some ACPI
method(s), likely the only way to address this is to fix the kernel
side error handling.

Keir, assuming these are reads only, would it make sense to permit
Dom0 to map the IO-APIC space read-only? Perhaps even
transparently converting writeable mappings to read-only ones
(since drivers/acpi/osl.c tries to establish writeable mappings
irrespective of the actual needs)? The obvious danger in doing
so is that going forward there may appear fields in that page
reads of which aren't side effect free...

Jan
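
For reference, the hypervisor message quoted earlier in the thread names the refused mapping by page frame number (000fec80), while the Xen boot log lists the IO-APICs by physical address. A minimal sketch in Python of how the two line up, assuming 4 KiB pages; the function and table names are made up for illustration and are not taken from Xen or Linux.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, as on x86

# IO-APIC base addresses as reported in the Xen boot messages in this thread.
IOAPIC_BASES = {
    "IOAPIC[0]": 0xFEC00000,
    "IOAPIC[1]": 0xFEC80000,
}


def frame_to_address(frame_number: int) -> int:
    """Convert a page frame number to the physical address where the page starts."""
    return frame_number << PAGE_SHIFT


def owner_of_frame(frame_number: int):
    """Return the name of the IO-APIC whose base lies in this frame, or None."""
    for name, base in IOAPIC_BASES.items():
        if base >> PAGE_SHIFT == frame_number:
            return name
    return None


if __name__ == "__main__":
    frame = 0x000FEC80  # from "attempt to map I/O space 000fec80"
    print(hex(frame_to_address(frame)), owner_of_frame(frame))
```

Under these assumptions the frame in the error message corresponds to the second IO-APIC listed in the boot log.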


keir.fraser at eu

Jun 15, 2010, 5:56 AM

Post #4 of 23
Re: Xen 4.0 crashes with pvops kernel

On 15/06/2010 13:50, "Jan Beulich" <JBeulich [at] novell> wrote:

> But even if we verify that the references come from some ACPI
> method(s), likely the only way to address this is to fix the kernel
> side error handling.
>
> Keir, assuming these are reads only, would it make sense to permit
> Dom0 to map the IO-APIC space read-only? Perhaps even
> transparently converting writeable mappings to read-only ones
> (since drivers/acpi/osl.c tries to establish writeable mappings
> irrespective of the actual needs)? The obvious danger in doing
> so is that going forward there may appear fields in that page
> reads of which aren't side effect free...

Well, how come it works with other Linux kernels -- presumably they have
some extra error handling in the ACPI subsystem? Shouldn't that just be
added to this kernel?

-- Keir



JBeulich at novell

Jun 15, 2010, 6:20 AM

Post #5 of 23
Re: Xen 4.0 crashes with pvops kernel

>>> On 15.06.10 at 14:56, Keir Fraser <keir.fraser [at] eu> wrote:
> On 15/06/2010 13:50, "Jan Beulich" <JBeulich [at] novell> wrote:
>
>> But even if we verify that the references come from some ACPI
>> method(s), likely the only way to address this is to fix the kernel
>> side error handling.
>>
>> Keir, assuming these are reads only, would it make sense to permit
>> Dom0 to map the IO-APIC space read-only? Perhaps even
>> transparently converting writeable mappings to read-only ones
>> (since drivers/acpi/osl.c tries to establish writeable mappings
>> irrespective of the actual needs)? The obvious danger in doing
>> so is that going forward there may appear fields in that page
>> reads of which aren't side effect free...
>
> Well, how come it works with other Linux kernels -- presumably they have
> some extra error handling in the ACPI subsystem? Shouldn't that just be
> added to this kernel?

I'm rather suspecting there's new code (compared to 2.6.18) that's
lacking proper error handling, though I didn't look in detail so far.

Hmm, looking a little more closely it seems they indeed try to write
to that space - this we for sure can't allow. I'll see if I can follow
the code path (unfortunately the stack trace is an imprecise one).

Cris, seeing DSDT and SSDTs from that system would surely be
helpful.

Jan
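
For reference, the "Oops: 0002" line in the original report already encodes the kind of access that faulted. A minimal sketch in Python that decodes an x86 page-fault error code using the architectural bit layout; the function name and the dictionary keys are illustrative only.

```python
def decode_page_fault_error(code: int) -> dict:
    """Decode the low bits of an x86 page-fault error code."""
    return {
        # Bit 0 clear means the page was not present; set means a protection violation.
        "protection_violation": bool(code & 0x01),
        # Bit 1 set means the faulting access was a write.
        "write": bool(code & 0x02),
        # Bit 2 set means the access came from user mode.
        "user_mode": bool(code & 0x04),
        # Bit 3 set means a reserved bit was set in a paging structure.
        "reserved_bit": bool(code & 0x08),
        # Bit 4 set means the fault was caused by an instruction fetch.
        "instruction_fetch": bool(code & 0x10),
    }


if __name__ == "__main__":
    # Value taken from "Oops: 0002 [#1] SMP" in the console output.
    for name, value in decode_page_fault_error(0x0002).items():
        print(f"{name}: {value}")
```

Read this way, 0002 is a kernel-mode write to a page that was not present, which fits the "PTE 0" in the same oops.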


cris.daniluk at gmail

Jun 15, 2010, 6:24 AM

Post #6 of 23
Re: Xen 4.0 crashes with pvops kernel

On Tue, Jun 15, 2010 at 9:20 AM, Jan Beulich <JBeulich [at] novell> wrote:
>>>> On 15.06.10 at 14:56, Keir Fraser <keir.fraser [at] eu> wrote:
>> On 15/06/2010 13:50, "Jan Beulich" <JBeulich [at] novell> wrote:
>>
>>>
>>> Keir, assuming these are reads only, would it make sense to permit
>>> Dom0 to map the IO-APIC space read-only? Perhaps even
>>> transparently converting writeable mappings to read-only ones
>>> (since drivers/acpi/osl.c tries to establish writeable mappings
>>> irrespective of the actual needs)? The obvious danger in doing
>>> so is that going forward there may appear fields in that page
>>> reads of which aren't side effect free...
>>
>> Well, how come it works with other Linux kernels -- presumably they have
>> some extra error handling in the ACPI subsystem? Shouldn't that just be
>> added to this kernel?
>
> I'm rather suspecting there's new code (compared to 2.6.18) that's
> lacking proper error handling, though I didn't look in detail so far.
>
> Hmm, looking a little more closely it seems they indeed try to write
> to that space - this we for sure can't allow. I'll see if I can follow
> the code path (unfortunately the stack trace is an imprecise one).
>
> Cris, seeing DSDT and SSDTs from that system would surely be
> helpful.
>

For what it's worth, it happened in 2.6.32.11 in addition to 2.6.32.13.
I also had earlier tried a 2.6.31 pvops distro kernel with the same
results last week. Interestingly, I also tried a 2.6.31 kernel with
Xen 3.4 on the same hardware with no issues. It seems like XenLinux
kernel with Xen 4.0 is fine, and pvops with Xen 3.x is fine.

Anyway, what is a useful format to provide those tables in? The dumps
are quite long and seem like they will format horribly via email.

Thanks for your help,

Cris

JBeulich at novell

Jun 15, 2010, 6:43 AM

Post #7 of 23
Re: Xen 4.0 crashes with pvops kernel

>>> On 15.06.10 at 15:24, Cris Daniluk <cris.daniluk [at] gmail> wrote:
> For what its worth, it happened in 2.6.32.11 in addition to 2.6.32.13.
> I also had earlier tried a 2.6.31 pvops distro kernel with the same
> results last week. Interestingly, I also tried a 2.6.31 kernel with
> Xen 3.4 on the same hardware with no issues. It seems like XenLinux
> kernel with Xen 4.0 is fine, and pvops with Xen 3.x is fine.

That's odd, but perhaps the kernel behaves differently on older Xen.
Attempts to map IO-APIC space should be disallowed by 3.4 just as
with 4.0.

> Anyway, what is a useful format to provide those tables in? The dumps
> are quite long and seem like they will format horribly via email.

Just attach them, perhaps even in compressed form.

Jan


cris.daniluk at gmail

Jun 15, 2010, 6:49 AM

Post #8 of 23
Re: Xen 4.0 crashes with pvops kernel

On Tue, Jun 15, 2010 at 9:43 AM, Jan Beulich <JBeulich [at] novell> wrote:
>>>> On 15.06.10 at 15:24, Cris Daniluk <cris.daniluk [at] gmail> wrote:
>> For what its worth, it happened in 2.6.32.11 in addition to 2.6.32.13.
>> I also had earlier tried a 2.6.31 pvops distro kernel with the same
>> results last week. Interestingly, I also tried a 2.6.31 kernel with
>> Xen 3.4 on the same hardware with no issues. It seems like XenLinux
>> kernel with Xen 4.0 is fine, and pvops with Xen 3.x is fine.
>
> That's odd, but perhaps the kernel behaves differently on older Xen.
> Attempts to map IO-APIC space should be disallowed by 3.4 just as
> with 4.0.
>

SSDT and DSDT dumps attached.

Cris
Attachments: acpidump.tgz (7.34 KB)


JBeulich at novell

Jun 15, 2010, 6:57 AM

Post #9 of 23
Re: Xen 4.0 crashes with pvops kernel

>>> On 15.06.10 at 15:20, "Jan Beulich" <JBeulich [at] novell> wrote:
>>>> On 15.06.10 at 14:56, Keir Fraser <keir.fraser [at] eu> wrote:
>> On 15/06/2010 13:50, "Jan Beulich" <JBeulich [at] novell> wrote:
>>
>>> But even if we verify that the references come from some ACPI
>>> method(s), likely the only way to address this is to fix the kernel
>>> side error handling.
>>>
>>> Keir, assuming these are reads only, would it make sense to permit
>>> Dom0 to map the IO-APIC space read-only? Perhaps even
>>> transparently converting writeable mappings to read-only ones
>>> (since drivers/acpi/osl.c tries to establish writeable mappings
>>> irrespective of the actual needs)? The obvious danger in doing
>>> so is that going forward there may appear fields in that page
>>> reads of which aren't side effect free...
>>
>> Well, how come it works with other Linux kernels -- presumably they have
>> some extra error handling in the ACPI subsystem? Shouldn't that just be
>> added to this kernel?
>
> I'm rather suspecting there's new code (compared to 2.6.18) that's
> lacking proper error handling, though I didn't look in detail so far.
>
> Hmm, looking a little more closely it seems they indeed try to write
> to that space - this we for sure can't allow. I'll see if I can follow
> the code path (unfortunately the stack trace is an imprecise one).

Actually, that's a difference to non-pv-ops that I strongly
believe should be fixed: While in the traditional kernel
__direct_remap_pfn_range() is used to establish I/O memory
mappings (and hence there is a way to propagate errors), the
pv-ops kernel appears to use ioremap_page_range() - just like
native - which can only return -ENOMEM (upon page table
allocation failure), due to the lack of a return value from
set_pte_at().

But then again I must be missing something here, since
xen_set_pte_at() falls back to xen_set_pte() if the hypercall
it tries first fails, and that one would fault when establishing
the mapping, not when trying to first use it. Jeremy?

Jan
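
For reference, the error-propagation point above can be illustrated with a toy model. This is a sketch in Python, not kernel code: the two setters stand in for a page-table setter with and without a return value, and the set of refused frames is a hypothetical stand-in for the hypervisor's policy. Whether the pv-ops path actually behaves like the "silent" variant is exactly what the message leaves open.

```python
import errno

# Hypothetical: frames the hypervisor refuses to map (illustration only).
REFUSED_FRAMES = {0xFEC00, 0xFEC80}


def set_entry_silent(table: dict, vfn: int, pfn: int) -> None:
    """Setter with no return value: a refused request just leaves the entry empty."""
    if pfn not in REFUSED_FRAMES:
        table[vfn] = pfn


def set_entry_checked(table: dict, vfn: int, pfn: int) -> int:
    """Setter that reports a refusal to its caller as a negative errno value."""
    if pfn in REFUSED_FRAMES:
        return -errno.EPERM
    table[vfn] = pfn
    return 0


def map_range_silent(table: dict, vfn: int, pfn: int, count: int) -> int:
    """Map a range with the silent setter; there is no error to propagate."""
    for i in range(count):
        set_entry_silent(table, vfn + i, pfn + i)
    return 0


def map_range_checked(table: dict, vfn: int, pfn: int, count: int) -> int:
    """Map a range with the checked setter; stop at and return the first error."""
    for i in range(count):
        rc = set_entry_checked(table, vfn + i, pfn + i)
        if rc:
            return rc
    return 0


if __name__ == "__main__":
    silent, checked = {}, {}
    print(map_range_silent(silent, 0x1B0, 0xFEC80, 1), 0x1B0 in silent)
    print(map_range_checked(checked, 0x1B0, 0xFEC80, 1), 0x1B0 in checked)
```

In the silent variant the caller sees success and the missing entry only shows up on first use; in the checked variant the caller can fail the mapping request up front.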


JBeulich at novell

Jun 15, 2010, 7:13 AM

Post #10 of 23
Re: Xen 4.0 crashes with pvops kernel

>>> On 15.06.10 at 15:49, Cris Daniluk <cris.daniluk [at] gmail> wrote:
> On Tue, Jun 15, 2010 at 9:43 AM, Jan Beulich <JBeulich [at] novell> wrote:
>>>>> On 15.06.10 at 15:24, Cris Daniluk <cris.daniluk [at] gmail> wrote:
>>> For what its worth, it happened in 2.6.32.11 in addition to 2.6.32.13.
>>> I also had earlier tried a 2.6.31 pvops distro kernel with the same
>>> results last week. Interestingly, I also tried a 2.6.31 kernel with
>>> Xen 3.4 on the same hardware with no issues. It seems like XenLinux
>>> kernel with Xen 4.0 is fine, and pvops with Xen 3.x is fine.
>>
>> That's odd, but perhaps the kernel behaves differently on older Xen.
>> Attempts to map IO-APIC space should be disallowed by 3.4 just as
>> with 4.0.
>>
>
> SSDT and DSDT dumps attached.

_SB.PCI0._CRS clears bit 16 of indirect register 0x2e, i.e. it masks pin 15
of the second IO-APIC. The motivation of this is unclear to me (and
can probably only be explained by the writers of that code) - perhaps
a workaround for some erratum?

Jan
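
For reference, the step from "indirect register 0x2e" to "pin 15" follows from the usual IO-APIC register layout (as in the Intel 82093AA), where the redirection table starts at index 0x10 and each pin takes two 32-bit registers, with the interrupt mask bit at bit 16 of the low one. A minimal sketch in Python; the function names and the default pin count of 24 (matching the "GSI 24-47" range in the boot log) are assumptions made for illustration.

```python
REDIR_TABLE_BASE = 0x10  # index of the first redirection-table register
MASK_BIT = 16            # position of the interrupt mask bit in the low dword


def decode_redirection_register(index: int, pins: int = 24):
    """Return (pin, half) for an IO-APIC indirect register index.

    half is "low" for bits 0-31 of the entry and "high" for bits 32-63.
    """
    offset = index - REDIR_TABLE_BASE
    if offset < 0 or offset >= 2 * pins:
        raise ValueError(f"index {index:#x} is not a redirection-table register")
    return offset // 2, "low" if offset % 2 == 0 else "high"


def mask_bit_set(low_dword: int) -> bool:
    """Report whether the mask bit is set in the low dword of an entry."""
    return bool((low_dword >> MASK_BIT) & 1)


if __name__ == "__main__":
    pin, half = decode_redirection_register(0x2E)
    print(f"register 0x2e -> pin {pin}, {half} dword")
```

The RAX value in the oops from the original report is also 0x2e, which would be consistent with the faulting store being the write of this index to the IO-APIC's register-select window.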


pasik at iki

Jun 15, 2010, 7:17 AM

Post #11 of 23
Re: Xen 4.0 crashes with pvops kernel

On Tue, Jun 15, 2010 at 02:43:56PM +0100, Jan Beulich wrote:
> >>> On 15.06.10 at 15:24, Cris Daniluk <cris.daniluk [at] gmail> wrote:
> > For what its worth, it happened in 2.6.32.11 in addition to 2.6.32.13.
> > I also had earlier tried a 2.6.31 pvops distro kernel with the same
> > results last week. Interestingly, I also tried a 2.6.31 kernel with
> > Xen 3.4 on the same hardware with no issues. It seems like XenLinux
> > kernel with Xen 4.0 is fine, and pvops with Xen 3.x is fine.
>
> That's odd, but perhaps the kernel behaves differently on older Xen.
> Attempts to map IO-APIC space should be disallowed by 3.4 just as
> with 4.0.
>

At least Xen 4.0 has the new IOAPIC hypercall that pvops 2.6.32 is using..

-- Pasi


cris.daniluk at gmail

Jun 15, 2010, 7:35 AM

Post #12 of 23
Re: Xen 4.0 crashes with pvops kernel

On Tue, Jun 15, 2010 at 10:13 AM, Jan Beulich <JBeulich [at] novell> wrote:
>> On 15.06.10 at 15:49, Cris Daniluk <cris.daniluk [at] gmail> wrote:
>>
>> SSDT and DSDT dumps attached.
>
> _SB.PCI0._CRS clears bit 16 of indirect register 0x2e, i.e. it masks pin 15
> of the second IO-APIC. The motivation of this is unclear to me (and
> can probably only be explained by the writers of that code) - perhaps
> a workaround for some erratum?
>
> Jan
>
>

Hmm, so could there be a Xen-based workaround in the interim, such as
bypassing the code page that is triggering this? It seems like that
may not provide too much relief given the nature of the issue.

This is running the latest BIOS, and I don't see anything pertaining to
ACPI in the errata for the last several releases.

Cris

JBeulich at novell

Jun 15, 2010, 7:58 AM

Post #13 of 23
Re: Xen 4.0 crashes with pvops kernel

>>> On 15.06.10 at 16:35, Cris Daniluk <cris.daniluk [at] gmail> wrote:
> Hmm, so could there be a Xen-based workaround in the interim, such as
> bypassing the code page that is triggering this? It seems like that
> may not provide too much relief given the nature of the issue.

Unfortunately I can't think of anything.

Jan


cris.daniluk at gmail

Jun 15, 2010, 8:01 AM

Post #14 of 23
Re: Xen 4.0 crashes with pvops kernel

On Tue, Jun 15, 2010 at 10:58 AM, Jan Beulich <JBeulich [at] novell> wrote:
>>>> On 15.06.10 at 16:35, Cris Daniluk <cris.daniluk [at] gmail> wrote:
>> Hmm, so could there be a Xen-based workaround in the interim, such as
>> bypassing the code page that is triggering this? It seems like that
>> may not provide too much relief given the nature of the issue.
>
> Unfortunately I can't think of anything.
>
> Jan
>

Fair enough. I will see if I can get some other hardware to test with.
What would the path toward resolution be here? It still seems like Xen
shouldn't be writing into that page, let alone reading..

Cris

konrad.wilk at oracle

Jun 15, 2010, 8:11 AM

Post #15 of 23
Re: Xen 4.0 crashes with pvops kernel

On Tue, Jun 15, 2010 at 05:17:15PM +0300, Pasi Kärkkäinen wrote:
> On Tue, Jun 15, 2010 at 02:43:56PM +0100, Jan Beulich wrote:
> > >>> On 15.06.10 at 15:24, Cris Daniluk <cris.daniluk [at] gmail> wrote:
> > > For what its worth, it happened in 2.6.32.11 in addition to 2.6.32.13.
> > > I also had earlier tried a 2.6.31 pvops distro kernel with the same
> > > results last week. Interestingly, I also tried a 2.6.31 kernel with
> > > Xen 3.4 on the same hardware with no issues. It seems like XenLinux
> > > kernel with Xen 4.0 is fine, and pvops with Xen 3.x is fine.
> >
> > That's odd, but perhaps the kernel behaves differently on older Xen.
> > Attempts to map IO-APIC space should be disallowed by 3.4 just as
> > with 4.0.
> >
>
> At least Xen 4.0 has the new IOAPIC hypercall that pvops 2.6.32 is using..

Yes and no. There are actually two ways of doing this. One uses
xen_acpi_register_gsi (or something akin to it), which makes a hypercall
to set the polarity/level of the IRQ. The other is to map the IO APIC
registers into dom0. The latter is still present in 2.6.31 but not in
2.6.32; the upstream community did not like that mechanism of accessing
the IO APIC registers.

So, thanks to Jan's excellent knowledge of the ACPI spec, and his
figuring out that the DSDT tries to fiddle with the IO APIC registers,
we know what the failure is.

Fixing it is another problem. Jan, any suggestions? Fixing the DSDT to
not do the store?

konrad.wilk at oracle

Jun 15, 2010, 8:15 AM

Post #16 of 23
Re: Xen 4.0 crashes with pvops kernel

> >>> Dom0 to map the IO-APIC space read-only? Perhaps even
.. snip
> Actually, that's a difference to non-pv-ops that I strongly
> believe should be fixed: While in the traditional kernel
> __direct_remap_pfn_range() is used to establish I/O memory
> mappings (and hence there is a way to propagate errors), the
> pv-ops kernel appears to use ioremap_page_range() - just like
> native - which can only return -ENOMEM (upon page table
> allocation failure), due to the lack of a return value from
> set_pte_at().
>
> But then again I must be missing something here, since
> xen_set_pte_at() falls back to xen_set_pte() if the hypercall
> it tries first fails, and that one would fault when establishing
> the mapping, not when trying to first use it. Jeremy?

Take a look at xen_set_fixmap, which I think is used for most of those
special addresses. It is mapped to a null-space for the IO APIC
addresses.

JBeulich at novell

Jun 15, 2010, 8:21 AM

Post #17 of 23
Re: Xen 4.0 crashes with pvops kernel

>>> On 15.06.10 at 17:01, Cris Daniluk <cris.daniluk [at] gmail> wrote:
> On Tue, Jun 15, 2010 at 10:58 AM, Jan Beulich <JBeulich [at] novell> wrote:
>>>>> On 15.06.10 at 16:35, Cris Daniluk <cris.daniluk [at] gmail> wrote:
>>> Hmm, so could there be a Xen-based workaround in the interim, such as
>>> bypassing the code page that is triggering this? It seems like that
>>> may not provide too much relief given the nature of the issue.
>>
>> Unfortunately I can't think of anything.
>>
>> Jan
>>
>
> Fair enough. I will see if I can get some other hardware to test with.
> What would the path toward resolution be here? It still seems like Xen
> shouldn't be writing into that page, let alone reading..

I can't think of anything but getting the BIOS fixed, plus getting the
error handling into proper shape in the kernel.

Jan


bderzhavets at yahoo

Jun 15, 2010, 9:18 AM

Post #18 of 23
Re: Xen 4.0 crashes with pvops kernel

There was an old issue with the 2.6.32.10 pvops kernel under Xen 4.0 on an ASUS P5Q-E board.
I submitted the dumped ACPI SSDT and DSDT tables (like those requested today) and a serial log
to Yu Ke, and Jeremy was aware of this issue. I still don't activate acpi_processor (Yu Ke)
for 2.6.32.15 under Xen 4.0.1-rc3-pre, or for any pvops 2.6.32.X kernel loaded on this board.
It is a very simple workaround, but the issue stays unresolved, at least to my knowledge.
The same software works fine on an ASUS P5Q3 with acpi_processor linked directly into
the pvops kernel 2.6.32.X (10 - 15).

Boris.

--- On Tue, 6/15/10, Jan Beulich <JBeulich [at] novell> wrote:

From: Jan Beulich <JBeulich [at] novell>
Subject: Re: [Xen-devel] Xen 4.0 crashes with pvops kernel
To: "Cris Daniluk" <cris.daniluk [at] gmail>
Cc: "xen-devel [at] lists" <xen-devel [at] lists>
Date: Tuesday, June 15, 2010, 11:21 AM

>>> On 15.06.10 at 17:01, Cris Daniluk <cris.daniluk [at] gmail> wrote:
> On Tue, Jun 15, 2010 at 10:58 AM, Jan Beulich <JBeulich [at] novell> wrote:
>>>>> On 15.06.10 at 16:35, Cris Daniluk <cris.daniluk [at] gmail> wrote:
>>> Hmm, so could there be a Xen-based workaround in the interim, such as
>>> bypassing the code page that is triggering this? It seems like that
>>> may not provide too much relief given the nature of the issue.
>>
>> Unfortunately I can't think of anything.
>>
>> Jan
>>
>
> Fair enough. I will see if I can get some other hardware to test with.
> What would the path toward resolution be here? It still seems like Xen
> shouldn't be writing into that page, let alone reading..

I can't think of anything but getting the BIOS fixed, plus getting
handling in proper shape in the kernel.

Jan




JBeulich at novell

Jun 16, 2010, 7:36 AM

Post #19 of 23 (450 views)
Permalink
Re: Xen 4.0 crashes with pvops kernel [In reply to]

>>> On 15.06.10 at 17:15, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
>> >>> Dom0 to map the IO-APIC space read-only? Perhaps even
> .. snip
>> Actually, that's a difference to non-pv-ops that I strongly
>> believe should be fixed: While in the traditional kernel
>> __direct_remap_pfn_range() is used to establish I/O memory
>> mappings (and hence there is a way to propagate errors), the
>> pv-ops kernel appears to use ioremap_page_range() - just like
>> native - which can only return -ENOMEM (upon page table
>> allocation failure), due to the lack of a return value from
>> set_pte_at().
>>
>> But then again I must be missing something here, since
>> xen_set_pte_at() falls back to xen_set_pte() if the hypercall
>> it tries first fails, and that one would fault when establishing
>> the mapping, not when trying to first use it. Jeremy?
>
> Take a look at xen_set_fixmap, which I think is used for most of those
> special addresses. It is mapped to a null-space for the IO APIC
> addresses.

I don't think that code matters here: execution goes through
acpi_os_map_memory(), and at the time the problem talked
about here happens I think the ioremap() in there ought to
be taken.

Jan




JBeulich at novell

Jun 16, 2010, 7:42 AM

Post #20 of 23 (456 views)
Permalink
Re: Xen 4.0 crashes with pvops kernel [In reply to]

>>> On 15.06.10 at 17:11, Konrad Rzeszutek Wilk <konrad.wilk [at] oracle> wrote:
> Fixing it is another problem. Jan, any suggestions? Fixing the DSDT to
> not do the store?

Yes, as I said in an earlier mail on this thread, fixing ACPI would be
the best fix. Short of being able to do so ourselves, the next best
thing is to at least avoid the crash by doing proper error checking
(again, see other responses of mine, especially as to not being
finally sure how the crash happened the way it does in the first
place, i.e. my analysis possibly being flawed altogether).

Jan




jeremy at goop

Jun 24, 2010, 2:29 AM

Post #21 of 23 (429 views)
Permalink
Re: Xen 4.0 crashes with pvops kernel [In reply to]

On 06/15/2010 02:57 PM, Jan Beulich wrote:
> Actually, that's a difference to non-pv-ops that I strongly
> believe should be fixed: While in the traditional kernel
> __direct_remap_pfn_range() is used to establish I/O memory
> mappings (and hence there is a way to propagate errors), the
> pv-ops kernel appears to use ioremap_page_range() - just like
> native - which can only return -ENOMEM (upon page table
> allocation failure), due to the lack of a return value from
> set_pte_at().
>

So that ioremap() itself will return an error if Xen prevents a mapping?

> But then again I must be missing something here, since
> xen_set_pte_at() falls back to xen_set_pte() if the hypercall
> it tries first fails, and that one would fault when establishing
> the mapping, not when trying to first use it. Jeremy?
>

If the pte has _PAGE_IO set (which all ioremap ptes should), then it
will call xen_set_iomap_pte. This can't fail (no return code), so if
the hypercall fails then it will leave it unmapped. It should at least
print a warn-on in that case.
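
A minimal sketch of the "at least warn" idea, written as a userspace C model rather than kernel code. Every name here (model_set_iomap_pte, hypercall_ok, hypercall_refused, model_warnings) is hypothetical and only stands in for the real setter and hypercall; the setter stays void, as described above, but no longer fails silently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Number of warnings emitted so far (stand-in for a kernel WARN_ON). */
static unsigned int model_warnings;

/* Hypothetical hypercall stand-ins: 0 on success, negative on refusal. */
static int hypercall_ok(unsigned long *ptep, unsigned long val)
{
    *ptep = val;
    return 0;
}

static int hypercall_refused(unsigned long *ptep, unsigned long val)
{
    (void)ptep;
    (void)val;
    return -1; /* the hypervisor rejected the mapping */
}

/*
 * Still void, so callers cannot see the failure, but a refused
 * mapping is now reported instead of being dropped silently.
 */
static void model_set_iomap_pte(unsigned long *ptep, unsigned long val,
                                int (*hypercall)(unsigned long *, unsigned long))
{
    int rc;

    assert(ptep != NULL && hypercall != NULL);

    rc = hypercall(ptep, val);
    if (rc != 0) {
        model_warnings++;
        fprintf(stderr, "WARNING: iomap pte %#lx not established (rc=%d)\n",
                val, rc);
    }
}
```

Calling model_set_iomap_pte() with hypercall_refused leaves the entry untouched and bumps model_warnings, so the later fault on first use would at least be preceded by a message pointing at the refused mapping.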

J



JBeulich at novell

Jun 24, 2010, 4:30 AM

Post #22 of 23 (426 views)
Permalink
Re: Xen 4.0 crashes with pvops kernel [In reply to]

>>> On 24.06.10 at 11:29, Jeremy Fitzhardinge <jeremy [at] goop> wrote:
> On 06/15/2010 02:57 PM, Jan Beulich wrote:
>> Actually, that's a difference to non-pv-ops that I strongly
>> believe should be fixed: While in the traditional kernel
>> __direct_remap_pfn_range() is used to establish I/O memory
>> mappings (and hence there is a way to propagate errors), the
>> pv-ops kernel appears to use ioremap_page_range() - just like
>> native - which can only return -ENOMEM (upon page table
>> allocation failure), due to the lack of a return value from
>> set_pte_at().
>>
>
> So that ioremap() itself will return an error if Xen prevents a mapping?

Exactly.
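
A minimal sketch of that difference, written as a self-contained userspace C model rather than kernel code. All model_* names and hypervisor_allows() are hypothetical, the -EPERM value is an arbitrary choice for the model, and the refused frame number is simply the one from the "(XEN) mm.c:797" message earlier in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the hypervisor refusing one I/O frame. */
static bool hypervisor_allows(unsigned long pfn)
{
    return pfn != 0xfec80UL;
}

/* Native-style setter: void, so a refusal cannot be reported. */
static void model_set_pte(unsigned long *pte, unsigned long pfn)
{
    if (hypervisor_allows(pfn))
        *pte = pfn;
    /* otherwise the entry silently stays 0 (not present) */
}

/* Setter with a return value, so the error can travel upwards. */
static int model_set_pte_checked(unsigned long *pte, unsigned long pfn)
{
    if (!hypervisor_allows(pfn))
        return -EPERM;
    *pte = pfn;
    return 0;
}

/* Models the ioremap_page_range() style: -ENOMEM is the only error. */
static int model_remap_void(unsigned long *ptes, size_t n, unsigned long pfn)
{
    assert(n > 0);
    if (ptes == NULL)
        return -ENOMEM;
    for (size_t i = 0; i < n; i++)
        model_set_pte(&ptes[i], pfn + i);
    return 0;
}

/* Models the error-propagating style: a refused frame aborts the remap. */
static int model_remap_checked(unsigned long *ptes, size_t n, unsigned long pfn)
{
    assert(n > 0);
    if (ptes == NULL)
        return -ENOMEM;
    for (size_t i = 0; i < n; i++) {
        int rc = model_set_pte_checked(&ptes[i], pfn + i);
        if (rc != 0)
            return rc;
    }
    return 0;
}
```

With the refused frame, model_remap_void() reports success while leaving the entry empty, which is the "fault on first use" pattern; model_remap_checked() returns the error at mapping time, where a caller could still handle it.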

>> But then again I must be missing something here, since
>> xen_set_pte_at() falls back to xen_set_pte() if the hypercall
>> it tries first fails, and that one would fault when establishing
>> the mapping, not when trying to first use it. Jeremy?
>>
>
> If the pte has _PAGE_IO set (which all ioremap ptes should), then it
> will call xen_set_iomap_pte. This can't fail (no return code), so if

Ah, right, I apparently looked at the upstream (i.e. DomU-only)
implementation rather than your tree.

> the hypercall fails then it will leave it unmapped. It should at least
> print a warn-on in that case.

Yes, that's the minimal requirement I would say.

Jan




jeremy at goop

Jun 24, 2010, 2:25 PM

Post #23 of 23 (427 views)
Permalink
Re: Xen 4.0 crashes with pvops kernel [In reply to]

On 06/24/2010 12:30 PM, Jan Beulich wrote:
>> If the pte has _PAGE_IO set (which all ioremap ptes should), then it
>> will call xen_set_iomap_pte. This can't fail (no return code), so if
>>
> Ah, right, I apparently looked at the upstream (i.e. DomU-only)
> implementation rather than your tree.
>

Yes, that has no way to ioremap real hardware.

>> the hypercall fails then it will leave it unmapped. It should at least
>> print a warn-on in that case.
>>
> Yes, that's the minimal requirement I would say.
>

Currently it's implemented as a batched multicall, so the call site itself
can't check to see if the hypercall worked. I should check that it can
actually be called in a batched context.
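
One way to picture the batching problem is the userspace C model below. It is only a sketch under the assumption that each queued entry records its own result, so the check can be deferred to flush time; struct model_batch, model_batch_queue(), model_batch_flush() and model_map_io() are hypothetical names, not the multicall code in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MODEL_BATCH_MAX 8

struct model_call {
    int (*op)(unsigned long arg); /* the queued operation */
    unsigned long arg;
    int result;                   /* only valid after the flush */
};

struct model_batch {
    struct model_call calls[MODEL_BATCH_MAX];
    size_t count;
};

/* Queue an operation; nothing is issued yet, so nothing can be checked. */
static void model_batch_queue(struct model_batch *b,
                              int (*op)(unsigned long), unsigned long arg)
{
    assert(b->count < MODEL_BATCH_MAX);
    b->calls[b->count].op = op;
    b->calls[b->count].arg = arg;
    b->calls[b->count].result = 0;
    b->count++;
}

/* Issue everything, then check each result; returns the failure count. */
static size_t model_batch_flush(struct model_batch *b)
{
    size_t failures = 0;

    for (size_t i = 0; i < b->count; i++) {
        struct model_call *c = &b->calls[i];

        c->result = c->op(c->arg);
        if (c->result != 0) {
            failures++;
            fprintf(stderr, "WARNING: batched call %zu (arg %#lx) failed: %d\n",
                    i, c->arg, c->result);
        }
    }
    b->count = 0;
    return failures;
}

/* Hypothetical I/O mapping operation that refuses one frame. */
static int model_map_io(unsigned long pfn)
{
    return pfn == 0xfec80UL ? -1 : 0;
}
```

The queueing site returns before any result exists, which is why it cannot check the outcome itself; the warning has to be raised wherever the batch is flushed.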

J

