Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Xen: Devel

Workings/effectiveness of the xen-acpi-processor driver

 

 

First page Previous page 1 2 Next page Last page  View All Xen devel RSS feed   Index | Next | Previous | View Threaded


stefan.bader at canonical

Apr 25, 2012, 6:00 AM

Post #1 of 33 (1338 views)
Permalink
Workings/effectiveness of the xen-acpi-processor driver

Since there have been requests about that driver to get backported into 3.2, I
was interested to find out what or how much would be gained by that.

The first system I tried was an AMD based one (8 core Opteron 6128 [at] 2GH). Which
was not very successful as the drivers bail out of the init function because the
first call to acpi_processor_register_performance() returns -ENODEV. There is
some frequency scaling when running without Xen, so I need to do some more
debugging there.

The second system was an Intel one (4 core i7 920 [at] 2) which was
successfully loading the driver. Via xenpm I can see the various frequencies and
also see them being changed. However the cpuidle data out of xenpm looks a bit odd:

#> xenpm get-cpuidle-states 0
Max C-state: C7

cpu id : 0
total C-states : 2
idle time(ms) : 10819311
C0 : transition [00000000000000000001]
residency [00000000000000005398 ms]
C1 : transition [00000000000000000001]
residency [00000000000010819311 ms]
pc3 : [00000000000000000000 ms]
pc6 : [00000000000000000000 ms]
pc7 : [00000000000000000000 ms]
cc3 : [00000000000000000000 ms]
cc6 : [00000000000000000000 ms]

Also gathering samples over 30s does look like only C0 and C1 are used. This
might be because C1E support is enabled in BIOS but when looking at the
intel_idle data in sysfs when running without a hypervisor will show C3 and C6
for the cores. That could have been just a wrong output, so I plugged in a power
meter and compared a kernel running natively and running as dom0 (with and
without the acpi-processor driver).

Native: 175W
dom0: 183W (with only marginal difference between with or without the
processor driver)
[.yes, the system has a somewhat high base consumption which I attribute to a
ridiculously dimensioned graphics subsystem to be running a text console]

This I would take as C3 and C6 really not being used and the frequency scaling
having no impact on the idle system is not that much surprising. But if that was
true it would also limit the usefulness of the turbo mode which I understand
would also be limited by the c-state of the other cores.

Do I misread the data I see? Or maybe its a known limitation? In case it is
worth doing more research I'll gladly try things and gather more data.

Thanks,
Stefan
Attachments: signature.asc (0.88 KB)


konrad.wilk at oracle

Apr 26, 2012, 8:50 AM

Post #2 of 33 (1257 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
> Since there have been requests about that driver to get backported into 3.2, I
> was interested to find out what or how much would be gained by that.
>
> The first system I tried was an AMD based one (8 core Opteron 6128 [at] 2GH). Which
> was not very successful as the drivers bail out of the init function because the
> first call to acpi_processor_register_performance() returns -ENODEV. There is
> some frequency scaling when running without Xen, so I need to do some more
> debugging there.

Did you back-port the other components - the ones that turn off the native
frequency scalling?

provide disable_cpufreq() function to disable the API.
xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
xen/cpufreq: Disable the cpu frequency scaling drivers from loading
>
> The second system was an Intel one (4 core i7 920 [at] 2) which was
> successfully loading the driver. Via xenpm I can see the various frequencies and
> also see them being changed. However the cpuidle data out of xenpm looks a bit odd:
>
> #> xenpm get-cpuidle-states 0
> Max C-state: C7
>
> cpu id : 0
> total C-states : 2
> idle time(ms) : 10819311
> C0 : transition [00000000000000000001]
> residency [00000000000000005398 ms]
> C1 : transition [00000000000000000001]
> residency [00000000000010819311 ms]
> pc3 : [00000000000000000000 ms]
> pc6 : [00000000000000000000 ms]
> pc7 : [00000000000000000000 ms]
> cc3 : [00000000000000000000 ms]
> cc6 : [00000000000000000000 ms]
>
> Also gathering samples over 30s does look like only C0 and C1 are used. This

Yes.
> might be because C1E support is enabled in BIOS but when looking at the
> intel_idle data in sysfs when running without a hypervisor will show C3 and C6
> for the cores. That could have been just a wrong output, so I plugged in a power
> meter and compared a kernel running natively and running as dom0 (with and
> without the acpi-processor driver).
>
> Native: 175W
> dom0: 183W (with only marginal difference between with or without the
> processor driver)
> [.yes, the system has a somewhat high base consumption which I attribute to a
> ridiculously dimensioned graphics subsystem to be running a text console]
>
> This I would take as C3 and C6 really not being used and the frequency scaling

To go in deeper modes there is also a need to backport a Xen unstable
hypercall which will allow the kernel to detect the other states besides
C0-C2.

"XEN_SET_PDC query was implemented in c/s 23783:
"ACPI: add _PDC input override mechanism".


> having no impact on the idle system is not that much surprising. But if that was
> true it would also limit the usefulness of the turbo mode which I understand
> would also be limited by the c-state of the other cores.

Hm, I should double-check that - but somehow I thought that Xen independetly
checks for TurboMode and if the P-states are in, then they are activated.

>
> Do I misread the data I see? Or maybe its a known limitation? In case it is
> worth doing more research I'll gladly try things and gather more data.

Just missing some patches.

Oh, and this one:
xen/acpi: Fix Kconfig dependency on CPU_FREQ

Hmm.. I think a patch disappeared somewhere.

>
> Thanks,
> Stefan
>



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


stefan.bader at canonical

Apr 26, 2012, 9:25 AM

Post #3 of 33 (1251 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On 26.04.2012 17:50, Konrad Rzeszutek Wilk wrote:
> On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
>> Since there have been requests about that driver to get backported into 3.2, I
>> was interested to find out what or how much would be gained by that.
>>
>> The first system I tried was an AMD based one (8 core Opteron 6128 [at] 2GH). Which
>> was not very successful as the drivers bail out of the init function because the
>> first call to acpi_processor_register_performance() returns -ENODEV. There is
>> some frequency scaling when running without Xen, so I need to do some more
>> debugging there.
>
> Did you back-port the other components - the ones that turn off the native
> frequency scalling?
>
> provide disable_cpufreq() function to disable the API.
> xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
> xen/cpufreq: Disable the cpu frequency scaling drivers from loading
>>

Yes, here is the full set for reference:

* xen/cpufreq: Disable the cpu frequency scaling drivers from loading.
* xen/acpi: Remove the WARN's as they just create noise.
* xen/acpi: Fix Kconfig dependency on CPU_FREQ
* xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
* xen/acpi-processor: C and P-state driver that uploads said data to hyper
* provide disable_cpufreq() function to disable the API.

>> The second system was an Intel one (4 core i7 920 [at] 2) which was
>> successfully loading the driver. Via xenpm I can see the various frequencies and
>> also see them being changed. However the cpuidle data out of xenpm looks a bit odd:
>>
>> #> xenpm get-cpuidle-states 0
>> Max C-state: C7
>>
>> cpu id : 0
>> total C-states : 2
>> idle time(ms) : 10819311
>> C0 : transition [00000000000000000001]
>> residency [00000000000000005398 ms]
>> C1 : transition [00000000000000000001]
>> residency [00000000000010819311 ms]
>> pc3 : [00000000000000000000 ms]
>> pc6 : [00000000000000000000 ms]
>> pc7 : [00000000000000000000 ms]
>> cc3 : [00000000000000000000 ms]
>> cc6 : [00000000000000000000 ms]
>>
>> Also gathering samples over 30s does look like only C0 and C1 are used. This
>
> Yes.
>> might be because C1E support is enabled in BIOS but when looking at the
>> intel_idle data in sysfs when running without a hypervisor will show C3 and C6
>> for the cores. That could have been just a wrong output, so I plugged in a power
>> meter and compared a kernel running natively and running as dom0 (with and
>> without the acpi-processor driver).
>>
>> Native: 175W
>> dom0: 183W (with only marginal difference between with or without the
>> processor driver)
>> [.yes, the system has a somewhat high base consumption which I attribute to a
>> ridiculously dimensioned graphics subsystem to be running a text console]
>>
>> This I would take as C3 and C6 really not being used and the frequency scaling
>
> To go in deeper modes there is also a need to backport a Xen unstable
> hypercall which will allow the kernel to detect the other states besides
> C0-C2.
>
> "XEN_SET_PDC query was implemented in c/s 23783:
> "ACPI: add _PDC input override mechanism".
>

I see. There is a kernel patch about enabling MWAIT that refers to that...

>
>> having no impact on the idle system is not that much surprising. But if that was
>> true it would also limit the usefulness of the turbo mode which I understand
>> would also be limited by the c-state of the other cores.
>
> Hm, I should double-check that - but somehow I thought that Xen independetly
> checks for TurboMode and if the P-states are in, then they are activated.
>
Turbo mode should be enabled. I had been only looking at a generic overview
about it on Intel site which sounded like it would make more of a difference on
how much one core could get overclocked related to how many cores are active
(and I translated active or not into deeper c-states or not).
Looking at the verbose output of turbostat it seems not to make that much
difference whether 2-4 cores are running. A single core alone could get one more
increment in clock stepping. That does not immediately sound a lot. And of
course how much or long the higher clock is used depends on other factors as
well and is not under OS control.

In the end it is probably quite dynamic and hard to come up with hard facts to
prove its value. Though if I can lower the idle power usage by reaching a bit
further, that would greatly help to justify the effort and potential risk of
backporting...

>>
>> Do I misread the data I see? Or maybe its a known limitation? In case it is
>> worth doing more research I'll gladly try things and gather more data.
>
> Just missing some patches.
>
> Oh, and this one:
> xen/acpi: Fix Kconfig dependency on CPU_FREQ
>
> Hmm.. I think a patch disappeared somewhere.
>
>>
>> Thanks,
>> Stefan
>>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel [at] lists
> http://lists.xen.org/xen-devel
Attachments: signature.asc (0.88 KB)


konrad.wilk at oracle

Apr 26, 2012, 10:04 AM

Post #4 of 33 (1269 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On Thu, Apr 26, 2012 at 06:25:28PM +0200, Stefan Bader wrote:
> On 26.04.2012 17:50, Konrad Rzeszutek Wilk wrote:
> > On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
> >> Since there have been requests about that driver to get backported into 3.2, I
> >> was interested to find out what or how much would be gained by that.
> >>
> >> The first system I tried was an AMD based one (8 core Opteron 6128 [at] 2GH). Which
> >> was not very successful as the drivers bail out of the init function because the
> >> first call to acpi_processor_register_performance() returns -ENODEV. There is
> >> some frequency scaling when running without Xen, so I need to do some more
> >> debugging there.
> >
> > Did you back-port the other components - the ones that turn off the native
> > frequency scalling?
> >
> > provide disable_cpufreq() function to disable the API.
> > xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
> > xen/cpufreq: Disable the cpu frequency scaling drivers from loading
> >>
>
> Yes, here is the full set for reference:
>
> * xen/cpufreq: Disable the cpu frequency scaling drivers from loading.
> * xen/acpi: Remove the WARN's as they just create noise.
> * xen/acpi: Fix Kconfig dependency on CPU_FREQ
> * xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
> * xen/acpi-processor: C and P-state driver that uploads said data to hyper
> * provide disable_cpufreq() function to disable the API.

The one thing that I forgot to mention about the:
xen/cpufreq: Disable the cpu frequency scaling drivers from loading.

is that it also fixes a bug - which is that without this, the acpi-cpufreq
would run and change the frequency states based on what dom0 thought was good.
Which meant that Xen hypervisor would run with somebody behind its back
changing the P-states - and would lower the performance in some case
a lot (especially if one used the dom0_max_vcpus=X).
.. snip..
> >> Native: 175W
> >> dom0: 183W (with only marginal difference between with or without the
> >> processor driver)

What happens if you run it xen-acpi-processor.off=1 ?

(That should inhibit the driver from uploading the power management data).

Is the value ~200w?

> >> [.yes, the system has a somewhat high base consumption which I attribute to a
> >> ridiculously dimensioned graphics subsystem to be running a text console]

Heh.
> >>
> >> This I would take as C3 and C6 really not being used and the frequency scaling
> >
> > To go in deeper modes there is also a need to backport a Xen unstable
> > hypercall which will allow the kernel to detect the other states besides
> > C0-C2.
> >
> > "XEN_SET_PDC query was implemented in c/s 23783:
> > "ACPI: add _PDC input override mechanism".
> >
>
> I see. There is a kernel patch about enabling MWAIT that refers to that...

Yeah, I should see about back-porting it in Xen 4.1..

>
> >
> >> having no impact on the idle system is not that much surprising. But if that was
> >> true it would also limit the usefulness of the turbo mode which I understand
> >> would also be limited by the c-state of the other cores.
> >
> > Hm, I should double-check that - but somehow I thought that Xen independetly
> > checks for TurboMode and if the P-states are in, then they are activated.
> >
> Turbo mode should be enabled. I had been only looking at a generic overview
> about it on Intel site which sounded like it would make more of a difference on
> how much one core could get overclocked related to how many cores are active
> (and I translated active or not into deeper c-states or not).
> Looking at the verbose output of turbostat it seems not to make that much
> difference whether 2-4 cores are running. A single core alone could get one more
> increment in clock stepping. That does not immediately sound a lot. And of
> course how much or long the higher clock is used depends on other factors as
> well and is not under OS control.
>
> In the end it is probably quite dynamic and hard to come up with hard facts to
> prove its value. Though if I can lower the idle power usage by reaching a bit
> further, that would greatly help to justify the effort and potential risk of
> backporting...

Did you see the P-states dipping? I don't really know how much using C-states
beyond C-2 matters on baremetal in terms of saving power?

>
> >>
> >> Do I misread the data I see? Or maybe its a known limitation? In case it is
> >> worth doing more research I'll gladly try things and gather more data.

This patchset was driven more by the performance needs. Somehow (and I am not
sure how), using the PM, makes the whole system work much faster. That was
what Ian found out when he ran the Phorenix benchmark.

But looking back, perhaps the reason for this was that the native
cpufreq scaling drivers were running in the back changing the machine's
power states without consulting the Xen hypervisor.

> >
> > Just missing some patches.
> >
> > Oh, and this one:
> > xen/acpi: Fix Kconfig dependency on CPU_FREQ
> >
> > Hmm.. I think a patch disappeared somewhere.
> >
> >>
> >> Thanks,
> >> Stefan
> >>
> >
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel [at] lists
> > http://lists.xen.org/xen-devel
>
>



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


konrad.wilk at oracle

May 1, 2012, 1:02 PM

Post #5 of 33 (1246 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On Thu, Apr 26, 2012 at 06:25:28PM +0200, Stefan Bader wrote:
> On 26.04.2012 17:50, Konrad Rzeszutek Wilk wrote:
> > On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
> >> Since there have been requests about that driver to get backported into 3.2, I
> >> was interested to find out what or how much would be gained by that.
> >>
> >> The first system I tried was an AMD based one (8 core Opteron 6128 [at] 2GH). Which
> >> was not very successful as the drivers bail out of the init function because the
> >> first call to acpi_processor_register_performance() returns -ENODEV. There is
> >> some frequency scaling when running without Xen, so I need to do some more
> >> debugging there.
> >
> > Did you back-port the other components - the ones that turn off the native
> > frequency scalling?
> >
> > provide disable_cpufreq() function to disable the API.
> > xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
> > xen/cpufreq: Disable the cpu frequency scaling drivers from loading
> >>
>
> Yes, here is the full set for reference:
>
> * xen/cpufreq: Disable the cpu frequency scaling drivers from loading.
> * xen/acpi: Remove the WARN's as they just create noise.
> * xen/acpi: Fix Kconfig dependency on CPU_FREQ
> * xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
> * xen/acpi-processor: C and P-state driver that uploads said data to hyper
> * provide disable_cpufreq() function to disable the API.

And (Linus just pulled it), you also need this one:
df88b2d96e36d9a9e325bfcd12eb45671cbbc937 (xen/enlighten: Disable MWAIT_LEAF so that acpi-pad won't be loaded.)

>
> >> The second system was an Intel one (4 core i7 920 [at] 2) which was
> >> successfully loading the driver. Via xenpm I can see the various frequencies and
> >> also see them being changed. However the cpuidle data out of xenpm looks a bit odd:
> >>
> >> #> xenpm get-cpuidle-states 0
> >> Max C-state: C7
> >>
> >> cpu id : 0
> >> total C-states : 2
> >> idle time(ms) : 10819311
> >> C0 : transition [00000000000000000001]
> >> residency [00000000000000005398 ms]
> >> C1 : transition [00000000000000000001]
> >> residency [00000000000010819311 ms]
> >> pc3 : [00000000000000000000 ms]
> >> pc6 : [00000000000000000000 ms]
> >> pc7 : [00000000000000000000 ms]
> >> cc3 : [00000000000000000000 ms]
> >> cc6 : [00000000000000000000 ms]
> >>
> >> Also gathering samples over 30s does look like only C0 and C1 are used. This
> >
> > Yes.
> >> might be because C1E support is enabled in BIOS but when looking at the
> >> intel_idle data in sysfs when running without a hypervisor will show C3 and C6
> >> for the cores. That could have been just a wrong output, so I plugged in a power
> >> meter and compared a kernel running natively and running as dom0 (with and
> >> without the acpi-processor driver).
> >>
> >> Native: 175W
> >> dom0: 183W (with only marginal difference between with or without the
> >> processor driver)
> >> [.yes, the system has a somewhat high base consumption which I attribute to a
> >> ridiculously dimensioned graphics subsystem to be running a text console]
> >>
> >> This I would take as C3 and C6 really not being used and the frequency scaling

So the other thing I forgot to note is that C3->C6 have a detrimental
effect on some Intel boxes with Xen. We haven't figured out exactly which ones
and the bug is definitly in the hypervisor. The bug is that when the CPU goes in
those states the NIC ends up being unresponsive. Its like the interrupts stopped
being ACKed. If I run 'xenpm set-max-cstate 2' the issue disappears.

> >
> > To go in deeper modes there is also a need to backport a Xen unstable
> > hypercall which will allow the kernel to detect the other states besides
> > C0-C2.
> >
> > "XEN_SET_PDC query was implemented in c/s 23783:
> > "ACPI: add _PDC input override mechanism".
> >
>
> I see. There is a kernel patch about enabling MWAIT that refers to that...

Were there any special things you ran when checking the output? Just plugging
and looking at the results?
>
> >
> >> having no impact on the idle system is not that much surprising. But if that was
> >> true it would also limit the usefulness of the turbo mode which I understand
> >> would also be limited by the c-state of the other cores.
> >
> > Hm, I should double-check that - but somehow I thought that Xen independetly
> > checks for TurboMode and if the P-states are in, then they are activated.

I did a bit of checking around and it does seem that is the case. From what
I have gathered the TurboMode kicks in when the CPU is C0 mode (which should
be obvious), and when the other cores are in anything but C0 mode. And sure
enough that seems to be the case. But I can't get the concrete details whether
the "but C0 mode" means that TurboMode will work better if the C mode is legacy
C1, C2, C3 or the CPU C-states (so MWAIT enabled). Trying to find out from
Len Brown more details..
> >
> Turbo mode should be enabled. I had been only looking at a generic overview
> about it on Intel site which sounded like it would make more of a difference on
> how much one core could get overclocked related to how many cores are active
> (and I translated active or not into deeper c-states or not).
> Looking at the verbose output of turbostat it seems not to make that much
> difference whether 2-4 cores are running. A single core alone could get one more
> increment in clock stepping. That does not immediately sound a lot. And of
> course how much or long the higher clock is used depends on other factors as
> well and is not under OS control.
>
> In the end it is probably quite dynamic and hard to come up with hard facts to
> prove its value. Though if I can lower the idle power usage by reaching a bit
> further, that would greatly help to justify the effort and potential risk of
> backporting...

I understand. I wish I could give you the exact percentage points by which
the power usage will drop. But I think the more substantial reason benefit of
these patches is performance gains. The ones that Ian Campbell ran and were
posted on Phorenix site paint that they are beneficial.

>
> >>
> >> Do I misread the data I see? Or maybe its a known limitation? In case it is
> >> worth doing more research I'll gladly try things and gather more data.
> >
> > Just missing some patches.
> >
> > Oh, and this one:
> > xen/acpi: Fix Kconfig dependency on CPU_FREQ
> >
> > Hmm.. I think a patch disappeared somewhere.

That was the one I referenced at the beginning of this email.

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


boris.ostrovsky at amd

May 1, 2012, 3:35 PM

Post #6 of 33 (1242 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On 05/01/2012 04:02 PM, Konrad Rzeszutek Wilk wrote:
> On Thu, Apr 26, 2012 at 06:25:28PM +0200, Stefan Bader wrote:
>> On 26.04.2012 17:50, Konrad Rzeszutek Wilk wrote:
>>> On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
>>>> Since there have been requests about that driver to get backported into 3.2, I
>>>> was interested to find out what or how much would be gained by that.
>>>>
>>>> The first system I tried was an AMD based one (8 core Opteron 6128 [at] 2GH). Which
>>>> was not very successful as the drivers bail out of the init function because the
>>>> first call to acpi_processor_register_performance() returns -ENODEV. There is
>>>> some frequency scaling when running without Xen, so I need to do some more
>>>> debugging there.

I believe this is caused by the somewhat under-enlightened xen_apic_read():

static u32 xen_apic_read(u32 reg)
{
return 0;
}

This results in some data, most importantly boot_cpu_physical_apicid,
not being set correctly and, in turn, causes x86_cpu_to_apicid to be
broken.

On larger AMD systems boot processor is typically APICID=0x20 (I don't
have Intel system handy to see how it looks there).

As a quick and dirty test you can try:

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index edc2448..1f78998 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1781,6 +1781,7 @@ void __init register_lapic_address(unsigned long
address)
}
if (boot_cpu_physical_apicid == -1U) {
boot_cpu_physical_apicid = read_apic_id();
+ boot_cpu_physical_apicid = 32;
apic_version[boot_cpu_physical_apicid] =
GET_APIC_VERSION(apic_read(APIC_LVR));
}


(Set it to whatever APICID on core0 is, I suspect it won't be zero).

-boris


>>>
>>> Did you back-port the other components - the ones that turn off the native
>>> frequency scalling?
>>>
>>> provide disable_cpufreq() function to disable the API.
>>> xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
>>> xen/cpufreq: Disable the cpu frequency scaling drivers from loading
>>>>
>>
>> Yes, here is the full set for reference:
>>
>> * xen/cpufreq: Disable the cpu frequency scaling drivers from loading.
>> * xen/acpi: Remove the WARN's as they just create noise.
>> * xen/acpi: Fix Kconfig dependency on CPU_FREQ
>> * xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
>> * xen/acpi-processor: C and P-state driver that uploads said data to hyper
>> * provide disable_cpufreq() function to disable the API.
>
> And (Linus just pulled it), you also need this one:
> df88b2d96e36d9a9e325bfcd12eb45671cbbc937 (xen/enlighten: Disable MWAIT_LEAF so that acpi-pad won't be loaded.)
>
>>
>>>> The second system was an Intel one (4 core i7 920 [at] 2) which was
>>>> successfully loading the driver. Via xenpm I can see the various frequencies and
>>>> also see them being changed. However the cpuidle data out of xenpm looks a bit odd:
>>>>
>>>> #> xenpm get-cpuidle-states 0
>>>> Max C-state: C7
>>>>
>>>> cpu id : 0
>>>> total C-states : 2
>>>> idle time(ms) : 10819311
>>>> C0 : transition [00000000000000000001]
>>>> residency [00000000000000005398 ms]
>>>> C1 : transition [00000000000000000001]
>>>> residency [00000000000010819311 ms]
>>>> pc3 : [00000000000000000000 ms]
>>>> pc6 : [00000000000000000000 ms]
>>>> pc7 : [00000000000000000000 ms]
>>>> cc3 : [00000000000000000000 ms]
>>>> cc6 : [00000000000000000000 ms]
>>>>
>>>> Also gathering samples over 30s does look like only C0 and C1 are used. This
>>>
>>> Yes.
>>>> might be because C1E support is enabled in BIOS but when looking at the
>>>> intel_idle data in sysfs when running without a hypervisor will show C3 and C6
>>>> for the cores. That could have been just a wrong output, so I plugged in a power
>>>> meter and compared a kernel running natively and running as dom0 (with and
>>>> without the acpi-processor driver).
>>>>
>>>> Native: 175W
>>>> dom0: 183W (with only marginal difference between with or without the
>>>> processor driver)
>>>> [.yes, the system has a somewhat high base consumption which I attribute to a
>>>> ridiculously dimensioned graphics subsystem to be running a text console]
>>>>
>>>> This I would take as C3 and C6 really not being used and the frequency scaling
>
> So the other thing I forgot to note is that C3->C6 have a detrimental
> effect on some Intel boxes with Xen. We haven't figured out exactly which ones
> and the bug is definitly in the hypervisor. The bug is that when the CPU goes in
> those states the NIC ends up being unresponsive. Its like the interrupts stopped
> being ACKed. If I run 'xenpm set-max-cstate 2' the issue disappears.
>
>>>
>>> To go in deeper modes there is also a need to backport a Xen unstable
>>> hypercall which will allow the kernel to detect the other states besides
>>> C0-C2.
>>>
>>> "XEN_SET_PDC query was implemented in c/s 23783:
>>> "ACPI: add _PDC input override mechanism".
>>>
>>
>> I see. There is a kernel patch about enabling MWAIT that refers to that...
>
> Were there any special things you ran when checking the output? Just plugging
> and looking at the results?
>>
>>>
>>>> having no impact on the idle system is not that much surprising. But if that was
>>>> true it would also limit the usefulness of the turbo mode which I understand
>>>> would also be limited by the c-state of the other cores.
>>>
>>> Hm, I should double-check that - but somehow I thought that Xen independetly
>>> checks for TurboMode and if the P-states are in, then they are activated.
>
> I did a bit of checking around and it does seem that is the case. From what
> I have gathered the TurboMode kicks in when the CPU is C0 mode (which should
> be obvious), and when the other cores are in anything but C0 mode. And sure
> enough that seems to be the case. But I can't get the concrete details whether
> the "but C0 mode" means that TurboMode will work better if the C mode is legacy
> C1, C2, C3 or the CPU C-states (so MWAIT enabled). Trying to find out from
> Len Brown more details..
>>>
>> Turbo mode should be enabled. I had been only looking at a generic overview
>> about it on Intel site which sounded like it would make more of a difference on
>> how much one core could get overclocked related to how many cores are active
>> (and I translated active or not into deeper c-states or not).
>> Looking at the verbose output of turbostat it seems not to make that much
>> difference whether 2-4 cores are running. A single core alone could get one more
>> increment in clock stepping. That does not immediately sound a lot. And of
>> course how much or long the higher clock is used depends on other factors as
>> well and is not under OS control.
>>
>> In the end it is probably quite dynamic and hard to come up with hard facts to
>> prove its value. Though if I can lower the idle power usage by reaching a bit
>> further, that would greatly help to justify the effort and potential risk of
>> backporting...
>
> I understand. I wish I could give you the exact percentage points by which
> the power usage will drop. But I think the more substantial reason benefit of
> these patches is performance gains. The ones that Ian Campbell ran and were
> posted on Phorenix site paint that they are beneficial.
>
>>
>>>>
>>>> Do I misread the data I see? Or maybe its a known limitation? In case it is
>>>> worth doing more research I'll gladly try things and gather more data.
>>>
>>> Just missing some patches.
>>>
>>> Oh, and this one:
>>> xen/acpi: Fix Kconfig dependency on CPU_FREQ
>>>
>>> Hmm.. I think a patch disappeared somewhere.
>
> That was the one I referenced at the beginning of this email.
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel [at] lists
> http://lists.xen.org/xen-devel
>



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


konrad.wilk at oracle

May 1, 2012, 3:54 PM

Post #7 of 33 (1239 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On Tue, May 01, 2012 at 06:35:45PM -0400, Boris Ostrovsky wrote:
> On 05/01/2012 04:02 PM, Konrad Rzeszutek Wilk wrote:
> >On Thu, Apr 26, 2012 at 06:25:28PM +0200, Stefan Bader wrote:
> >>On 26.04.2012 17:50, Konrad Rzeszutek Wilk wrote:
> >>>On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
> >>>>Since there have been requests about that driver to get backported into 3.2, I
> >>>>was interested to find out what or how much would be gained by that.
> >>>>
> >>>>The first system I tried was an AMD based one (8 core Opteron 6128 [at] 2GH). Which
> >>>>was not very successful as the drivers bail out of the init function because the
> >>>>first call to acpi_processor_register_performance() returns -ENODEV. There is
> >>>>some frequency scaling when running without Xen, so I need to do some more
> >>>>debugging there.
>
> I believe this is caused by the somewhat under-enlightened xen_apic_read():
>
> static u32 xen_apic_read(u32 reg)
> {
> return 0;
> }
>
> This results in some data, most importantly
> boot_cpu_physical_apicid, not being set correctly and, in turn,
> causes x86_cpu_to_apicid to be broken.

What is the involvment of x86_cpu_to_apicid to acpi_processor_register_performance?
Or is this more of a stab in the dark?

Stefan, one way to debug this is to make the driver be a module and then
configure the /sys/../acpi/debug_level and debug_layer to be 0xffffffff
and try loading the module. It should print out tons of data (And the reason
it returned -Exxx).
>
> On larger AMD systems boot processor is typically APICID=0x20 (I
> don't have Intel system handy to see how it looks there).
>
> As a quick and dirty test you can try:
>
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index edc2448..1f78998 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -1781,6 +1781,7 @@ void __init register_lapic_address(unsigned
> long address)
> }
> if (boot_cpu_physical_apicid == -1U) {
> boot_cpu_physical_apicid = read_apic_id();
> + boot_cpu_physical_apicid = 32;
> apic_version[boot_cpu_physical_apicid] =
> GET_APIC_VERSION(apic_read(APIC_LVR));
> }
>
>
> (Set it to whatever APICID on core0 is, I suspect it won't be zero).
>
> -boris
>
>
> >>>
> >>>Did you back-port the other components - the ones that turn off the native
> >>>frequency scalling?
> >>>
> >>> provide disable_cpufreq() function to disable the API.
> >>> xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
> >>> xen/cpufreq: Disable the cpu frequency scaling drivers from loading
> >>>>
> >>
> >>Yes, here is the full set for reference:
> >>
> >>* xen/cpufreq: Disable the cpu frequency scaling drivers from loading.
> >>* xen/acpi: Remove the WARN's as they just create noise.
> >>* xen/acpi: Fix Kconfig dependency on CPU_FREQ
> >>* xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
> >>* xen/acpi-processor: C and P-state driver that uploads said data to hyper
> >>* provide disable_cpufreq() function to disable the API.
> >
> >And (Linus just pulled it), you also need this one:
> > df88b2d96e36d9a9e325bfcd12eb45671cbbc937 (xen/enlighten: Disable MWAIT_LEAF so that acpi-pad won't be loaded.)
> >
> >>
> >>>>The second system was an Intel one (4 core i7 920 [at] 2) which was
> >>>>successfully loading the driver. Via xenpm I can see the various frequencies and
> >>>>also see them being changed. However the cpuidle data out of xenpm looks a bit odd:
> >>>>
> >>>>#> xenpm get-cpuidle-states 0
> >>>>Max C-state: C7
> >>>>
> >>>>cpu id : 0
> >>>>total C-states : 2
> >>>>idle time(ms) : 10819311
> >>>>C0 : transition [00000000000000000001]
> >>>> residency [00000000000000005398 ms]
> >>>>C1 : transition [00000000000000000001]
> >>>> residency [00000000000010819311 ms]
> >>>>pc3 : [00000000000000000000 ms]
> >>>>pc6 : [00000000000000000000 ms]
> >>>>pc7 : [00000000000000000000 ms]
> >>>>cc3 : [00000000000000000000 ms]
> >>>>cc6 : [00000000000000000000 ms]
> >>>>
> >>>>Also gathering samples over 30s does look like only C0 and C1 are used. This
> >>>
> >>>Yes.
> >>>>might be because C1E support is enabled in BIOS but when looking at the
> >>>>intel_idle data in sysfs when running without a hypervisor will show C3 and C6
> >>>>for the cores. That could have been just a wrong output, so I plugged in a power
> >>>>meter and compared a kernel running natively and running as dom0 (with and
> >>>>without the acpi-processor driver).
> >>>>
> >>>>Native: 175W
> >>>>dom0: 183W (with only marginal difference between with or without the
> >>>> processor driver)
> >>>>[.yes, the system has a somewhat high base consumption which I attribute to a
> >>>>ridiculously dimensioned graphics subsystem to be running a text console]
> >>>>
> >>>>This I would take as C3 and C6 really not being used and the frequency scaling
> >
> >So the other thing I forgot to note is that C3->C6 have a detrimental
> >effect on some Intel boxes with Xen. We haven't figured out exactly which ones
> >and the bug is definitly in the hypervisor. The bug is that when the CPU goes in
> >those states the NIC ends up being unresponsive. Its like the interrupts stopped
> >being ACKed. If I run 'xenpm set-max-cstate 2' the issue disappears.
> >
> >>>
> >>>To go in deeper modes there is also a need to backport a Xen unstable
> >>>hypercall which will allow the kernel to detect the other states besides
> >>>C0-C2.
> >>>
> >>>"XEN_SET_PDC query was implemented in c/s 23783:
> >>> "ACPI: add _PDC input override mechanism".
> >>>
> >>
> >>I see. There is a kernel patch about enabling MWAIT that refers to that...
> >
> >Were there any special things you ran when checking the output? Just plugging
> >and looking at the results?
> >>
> >>>
> >>>>having no impact on the idle system is not that much surprising. But if that was
> >>>>true it would also limit the usefulness of the turbo mode which I understand
> >>>>would also be limited by the c-state of the other cores.
> >>>
> >>>Hm, I should double-check that - but somehow I thought that Xen independetly
> >>>checks for TurboMode and if the P-states are in, then they are activated.
> >
> >I did a bit of checking around and it does seem that is the case. From what
> >I have gathered the TurboMode kicks in when the CPU is C0 mode (which should
> >be obvious), and when the other cores are in anything but C0 mode. And sure
> >enough that seems to be the case. But I can't get the concrete details whether
> >the "but C0 mode" means that TurboMode will work better if the C mode is legacy
> >C1, C2, C3 or the CPU C-states (so MWAIT enabled). Trying to find out from
> >Len Brown more details..
> >>>
> >>Turbo mode should be enabled. I had been only looking at a generic overview
> >>about it on Intel site which sounded like it would make more of a difference on
> >>how much one core could get overclocked related to how many cores are active
> >>(and I translated active or not into deeper c-states or not).
> >>Looking at the verbose output of turbostat it seems not to make that much
> >>difference whether 2-4 cores are running. A single core alone could get one more
> >>increment in clock stepping. That does not immediately sound a lot. And of
> >>course how much or long the higher clock is used depends on other factors as
> >>well and is not under OS control.
> >>
> >>In the end it is probably quite dynamic and hard to come up with hard facts to
> >>prove its value. Though if I can lower the idle power usage by reaching a bit
> >>further, that would greatly help to justify the effort and potential risk of
> >>backporting...
> >
> >I understand. I wish I could give you the exact percentage points by which
> >the power usage will drop. But I think the more substantial reason benefit of
> >these patches is performance gains. The ones that Ian Campbell ran and were
> >posted on Phorenix site paint that they are beneficial.
> >
> >>
> >>>>
> >>>>Do I misread the data I see? Or maybe its a known limitation? In case it is
> >>>>worth doing more research I'll gladly try things and gather more data.
> >>>
> >>>Just missing some patches.
> >>>
> >>>Oh, and this one:
> >>> xen/acpi: Fix Kconfig dependency on CPU_FREQ
> >>>
> >>>Hmm.. I think a patch disappeared somewhere.
> >
> >That was the one I referenced at the beginning of this email.
> >
> >_______________________________________________
> >Xen-devel mailing list
> >Xen-devel [at] lists
> >http://lists.xen.org/xen-devel
> >
>

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


konrad.wilk at oracle

May 1, 2012, 5:47 PM

Post #8 of 33 (1286 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On Tue, May 01, 2012 at 06:54:56PM -0400, Konrad Rzeszutek Wilk wrote:
> On Tue, May 01, 2012 at 06:35:45PM -0400, Boris Ostrovsky wrote:
> > On 05/01/2012 04:02 PM, Konrad Rzeszutek Wilk wrote:
> > >On Thu, Apr 26, 2012 at 06:25:28PM +0200, Stefan Bader wrote:
> > >>On 26.04.2012 17:50, Konrad Rzeszutek Wilk wrote:
> > >>>On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
> > >>>>Since there have been requests about that driver to get backported into 3.2, I
> > >>>>was interested to find out what or how much would be gained by that.
> > >>>>
> > >>>>The first system I tried was an AMD based one (8 core Opteron 6128 [at] 2GH). Which
> > >>>>was not very successful as the drivers bail out of the init function because the
> > >>>>first call to acpi_processor_register_performance() returns -ENODEV. There is
> > >>>>some frequency scaling when running without Xen, so I need to do some more
> > >>>>debugging there.
> >
> > I believe this is caused by the somewhat under-enlightened xen_apic_read():
> >
> > static u32 xen_apic_read(u32 reg)
> > {
> > return 0;
> > }
> >
> > This results in some data, most importantly
> > boot_cpu_physical_apicid, not being set correctly and, in turn,
> > causes x86_cpu_to_apicid to be broken.
>
> What is the involvment of x86_cpu_to_apicid to acpi_processor_register_performance?
> Or is this more of a stab in the dark?

Ah, it is the acpi_get_cpuid that gets called by acpi_processor_add->acpi_processor_get_info.

And this one:
201 #ifdef CONFIG_SMP
202 for_each_possible_cpu(i) {
203 if (cpu_physical_id(i) == apic_id)
204 return i;
205 }
206 #else

where the cpu_physical_id(i) is per_cpu(x86_cpu_to_apicid, i).

But it is curious that it has been working for me on AMD and Intel machines.
Granted the only server boxes I've are Intel - don't have AMD server boxes at all.

Stefan, can you send the full dmesg output too please?

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


boris.ostrovsky at amd

May 1, 2012, 6:11 PM

Post #9 of 33 (1246 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On 05/01/2012 08:47 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, May 01, 2012 at 06:54:56PM -0400, Konrad Rzeszutek Wilk wrote:
>> On Tue, May 01, 2012 at 06:35:45PM -0400, Boris Ostrovsky wrote:
>>> On 05/01/2012 04:02 PM, Konrad Rzeszutek Wilk wrote:
>>>> On Thu, Apr 26, 2012 at 06:25:28PM +0200, Stefan Bader wrote:
>>>>> On 26.04.2012 17:50, Konrad Rzeszutek Wilk wrote:
>>>>>> On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
>>>>>>> Since there have been requests about that driver to get backported into 3.2, I
>>>>>>> was interested to find out what or how much would be gained by that.
>>>>>>>
>>>>>>> The first system I tried was an AMD based one (8 core Opteron 6128 [at] 2GH). Which
>>>>>>> was not very successful as the drivers bail out of the init function because the
>>>>>>> first call to acpi_processor_register_performance() returns -ENODEV. There is
>>>>>>> some frequency scaling when running without Xen, so I need to do some more
>>>>>>> debugging there.
>>>
>>> I believe this is caused by the somewhat under-enlightened xen_apic_read():
>>>
>>> static u32 xen_apic_read(u32 reg)
>>> {
>>> return 0;
>>> }
>>>
>>> This results in some data, most importantly
>>> boot_cpu_physical_apicid, not being set correctly and, in turn,
>>> causes x86_cpu_to_apicid to be broken.
>>
>> What is the involvment of x86_cpu_to_apicid to acpi_processor_register_performance?
>> Or is this more of a stab in the dark?
>
> Ah, it is the acpi_get_cpuid that gets called by acpi_processor_add->acpi_processor_get_info.
>
> And this one:
> 201 #ifdef CONFIG_SMP
> 202 for_each_possible_cpu(i) {
> 203 if (cpu_physical_id(i) == apic_id)
> 204 return i;
> 205 }
> 206 #else
>
> where the cpu_physical_id(i) is per_cpu(x86_cpu_to_apicid, i).

Right. This and the fact that 'if (apicid == boot_cpu_physical_apicid)'
in generic_processor_info() is never true.

In the end, 'processors' per-cpu variable is not initialized for cpu 0
and that's what causes acpi_processor_register_performance() to fail.

>
> But it is curious that it has been working for me on AMD and Intel machines.
> Granted the only server boxes I've are Intel - don't have AMD server boxes at all.

I am also surprised that aside from some power inefficiencies and "BIOS
bug" warnings the system appeared reasonably OK. I'd think that with
APIC IDs messed up it would not run.

If Intel processors number cores starting with APICID=0 then you
wouldn't see any issues.

-boris

>
> Stefan, can you send the full dmesg output too please?
>



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


stefan.bader at canonical

May 2, 2012, 1:22 AM

Post #10 of 33 (1308 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

Sorry for not responding sooner, there was a good opportunity for a long week
end here I could not miss. Also I had been moving the power meter over to the
other machine.

On 01.05.2012 22:02, Konrad Rzeszutek Wilk wrote:
> On Thu, Apr 26, 2012 at 06:25:28PM +0200, Stefan Bader wrote:
>> On 26.04.2012 17:50, Konrad Rzeszutek Wilk wrote:
>>> On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
>>>> Since there have been requests about that driver to get backported into 3.2, I
>>>> was interested to find out what or how much would be gained by that.
>>>>
>>>> The first system I tried was an AMD based one (8 core Opteron 6128 [at] 2GH). Which
>>>> was not very successful as the drivers bail out of the init function because the
>>>> first call to acpi_processor_register_performance() returns -ENODEV. There is
>>>> some frequency scaling when running without Xen, so I need to do some more
>>>> debugging there.
>>>
>>> Did you back-port the other components - the ones that turn off the native
>>> frequency scalling?
>>>
>>> provide disable_cpufreq() function to disable the API.
>>> xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
>>> xen/cpufreq: Disable the cpu frequency scaling drivers from loading
>>>>
>>
>> Yes, here is the full set for reference:
>>
>> * xen/cpufreq: Disable the cpu frequency scaling drivers from loading.
>> * xen/acpi: Remove the WARN's as they just create noise.
>> * xen/acpi: Fix Kconfig dependency on CPU_FREQ
>> * xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
>> * xen/acpi-processor: C and P-state driver that uploads said data to hyper
>> * provide disable_cpufreq() function to disable the API.
>
> And (Linus just pulled it), you also need this one:
> df88b2d96e36d9a9e325bfcd12eb45671cbbc937 (xen/enlighten: Disable MWAIT_LEAF so that acpi-pad won't be loaded.)

I suspect that one only if I would have

commit 73c154c60be106b47f15d1111fc2d75cc7a436f2
xen/enlighten: Expose MWAIT and MWAIT_LEAF if hypervisor OKs it.

and the change in the hypervisor to implement XEN_SET_PDC.

>
>>
>>>> The second system was an Intel one (4 core i7 920 [at] 2) which was
>>>> successfully loading the driver. Via xenpm I can see the various frequencies and
>>>> also see them being changed. However the cpuidle data out of xenpm looks a bit odd:
>>>>
>>>> #> xenpm get-cpuidle-states 0
>>>> Max C-state: C7
>>>>
>>>> cpu id : 0
>>>> total C-states : 2
>>>> idle time(ms) : 10819311
>>>> C0 : transition [00000000000000000001]
>>>> residency [00000000000000005398 ms]
>>>> C1 : transition [00000000000000000001]
>>>> residency [00000000000010819311 ms]
>>>> pc3 : [00000000000000000000 ms]
>>>> pc6 : [00000000000000000000 ms]
>>>> pc7 : [00000000000000000000 ms]
>>>> cc3 : [00000000000000000000 ms]
>>>> cc6 : [00000000000000000000 ms]
>>>>
>>>> Also gathering samples over 30s does look like only C0 and C1 are used. This
>>>
>>> Yes.
>>>> might be because C1E support is enabled in BIOS but when looking at the
>>>> intel_idle data in sysfs when running without a hypervisor will show C3 and C6
>>>> for the cores. That could have been just a wrong output, so I plugged in a power
>>>> meter and compared a kernel running natively and running as dom0 (with and
>>>> without the acpi-processor driver).
>>>>
>>>> Native: 175W
>>>> dom0: 183W (with only marginal difference between with or without the
>>>> processor driver)
>>>> [.yes, the system has a somewhat high base consumption which I attribute to a
>>>> ridiculously dimensioned graphics subsystem to be running a text console]
>>>>
>>>> This I would take as C3 and C6 really not being used and the frequency scaling
>
> So the other thing I forgot to note is that C3->C6 have a detrimental
> effect on some Intel boxes with Xen. We haven't figured out exactly which ones
> and the bug is definitly in the hypervisor. The bug is that when the CPU goes in
> those states the NIC ends up being unresponsive. Its like the interrupts stopped
> being ACKed. If I run 'xenpm set-max-cstate 2' the issue disappears.
>

I think I saw weird behavior not only under Xen with that. One netbook I got has
the nasty tendency to go into undetermined sleeps and you had to hit some keys
to keep it running. In that case I found out that the intel gfx card was one
which does not cause interrupts on vsync and other sources seemed not able to
wake up from deeper c-states when being msi. So the wonder cure there is to
force interrupts not being msi.

That aside, I was also comparing power usage in idle on the AMD box. And that
shows an even bigger difference between running natively and running dom0 under
the hypervisor (103W against 127W). As that is AMD there never are other
c-states than c1(e) which xenpm shows as being used (got the feeling that with
AMD CPUs it is quite hard to find out the c1 residency nowadays (only intel_idle
keeps statistics). I think I got some data now but still need to parse it (if I
am not wrong one needs to use power trace now).
But regardless it would hint that having the deeper c-states may not be the real
solution.

>>>
>>> To go in deeper modes there is also a need to backport a Xen unstable
>>> hypercall which will allow the kernel to detect the other states besides
>>> C0-C2.
>>>
>>> "XEN_SET_PDC query was implemented in c/s 23783:
>>> "ACPI: add _PDC input override mechanism".
>>>
>>
>> I see. There is a kernel patch about enabling MWAIT that refers to that...
>
> Were there any special things you ran when checking the output? Just plugging
> and looking at the results?

In the first step it only was the idle power usage. But as you say the real
target for the patches were a better overall performance under use. So I need to
think of something to quantify the difference.

>>
>>>
>>>> having no impact on the idle system is not that much surprising. But if that was
>>>> true it would also limit the usefulness of the turbo mode which I understand
>>>> would also be limited by the c-state of the other cores.
>>>
>>> Hm, I should double-check that - but somehow I thought that Xen independetly
>>> checks for TurboMode and if the P-states are in, then they are activated.
>
> I did a bit of checking around and it does seem that is the case. From what
> I have gathered the TurboMode kicks in when the CPU is C0 mode (which should
> be obvious), and when the other cores are in anything but C0 mode. And sure
> enough that seems to be the case. But I can't get the concrete details whether
> the "but C0 mode" means that TurboMode will work better if the C mode is legacy
> C1, C2, C3 or the CPU C-states (so MWAIT enabled). Trying to find out from
> Len Brown more details..
>>>

Right, that is how I understood things, too. Depending on how many cores are
"active" (which does not really say what active means) the frequency can go up
in steps of 133Mhz (or was that 200) up to a certain upper limit. The upper
limit depends on how many cores are active but for my CPU seems just one more
step between 2-4 cores active and 1 core active.
The rest is not under OS control beside the CPU to go up in frequency need to be
in P0 mode. From the looks other cores can be in P0 as well and can go up a bit
as well.

>> Turbo mode should be enabled. I had been only looking at a generic overview
>> about it on Intel site which sounded like it would make more of a difference on
>> how much one core could get overclocked related to how many cores are active
>> (and I translated active or not into deeper c-states or not).
>> Looking at the verbose output of turbostat it seems not to make that much
>> difference whether 2-4 cores are running. A single core alone could get one more
>> increment in clock stepping. That does not immediately sound a lot. And of
>> course how much or long the higher clock is used depends on other factors as
>> well and is not under OS control.
>>
>> In the end it is probably quite dynamic and hard to come up with hard facts to
>> prove its value. Though if I can lower the idle power usage by reaching a bit
>> further, that would greatly help to justify the effort and potential risk of
>> backporting...
>
> I understand. I wish I could give you the exact percentage points by which
> the power usage will drop. But I think the more substantial reason benefit of
> these patches is performance gains. The ones that Ian Campbell ran and were
> posted on Phorenix site paint that they are beneficial.

I probably should ask Ian (or check on that post) what tests he was using to
measure the difference.

>
>>
>>>>
>>>> Do I misread the data I see? Or maybe its a known limitation? In case it is
>>>> worth doing more research I'll gladly try things and gather more data.
>>>
>>> Just missing some patches.
>>>
>>> Oh, and this one:
>>> xen/acpi: Fix Kconfig dependency on CPU_FREQ
>>>
>>> Hmm.. I think a patch disappeared somewhere.
>
> That was the one I referenced at the beginning of this email.
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel [at] lists
> http://lists.xen.org/xen-devel
Attachments: signature.asc (0.88 KB)


stefan.bader at canonical

May 2, 2012, 1:36 AM

Post #11 of 33 (1244 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On 02.05.2012 00:35, Boris Ostrovsky wrote:
> On 05/01/2012 04:02 PM, Konrad Rzeszutek Wilk wrote:
>> On Thu, Apr 26, 2012 at 06:25:28PM +0200, Stefan Bader wrote:
>>> On 26.04.2012 17:50, Konrad Rzeszutek Wilk wrote:
>>>> On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
>>>>> Since there have been requests about that driver to get backported into 3.2, I
>>>>> was interested to find out what or how much would be gained by that.
>>>>>
>>>>> The first system I tried was an AMD based one (8 core Opteron 6128 [at] 2GH).
>>>>> Which
>>>>> was not very successful as the drivers bail out of the init function
>>>>> because the
>>>>> first call to acpi_processor_register_performance() returns -ENODEV. There is
>>>>> some frequency scaling when running without Xen, so I need to do some more
>>>>> debugging there.
>
> I believe this is caused by the somewhat under-enlightened xen_apic_read():
>
> static u32 xen_apic_read(u32 reg)
> {
> return 0;
> }
>
> This results in some data, most importantly boot_cpu_physical_apicid, not being
> set correctly and, in turn, causes x86_cpu_to_apicid to be broken.

Ah ok. I check what my box say and try the change below and gathering more data
as suggested in the follow-ups (including to turn on the acpi debugging and
debugging in the xen acpi processor driver). The latter I had done but that only
would print "max acpi id: 16" (or so) before the failure. No wonder missing the
acpi debugging.

>
> On larger AMD systems boot processor is typically APICID=0x20 (I don't have
> Intel system handy to see how it looks there).
>
> As a quick and dirty test you can try:
>
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index edc2448..1f78998 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -1781,6 +1781,7 @@ void __init register_lapic_address(unsigned long address)
> }
> if (boot_cpu_physical_apicid == -1U) {
> boot_cpu_physical_apicid = read_apic_id();
> + boot_cpu_physical_apicid = 32;
> apic_version[boot_cpu_physical_apicid] =
> GET_APIC_VERSION(apic_read(APIC_LVR));
> }
>
>
> (Set it to whatever APICID on core0 is, I suspect it won't be zero).
>
> -boris
>
>
>>>>
>>>> Did you back-port the other components - the ones that turn off the native
>>>> frequency scalling?
>>>>
>>>> provide disable_cpufreq() function to disable the API.
>>>> xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
>>>> xen/cpufreq: Disable the cpu frequency scaling drivers from loading
>>>>>
>>>
>>> Yes, here is the full set for reference:
>>>
>>> * xen/cpufreq: Disable the cpu frequency scaling drivers from loading.
>>> * xen/acpi: Remove the WARN's as they just create noise.
>>> * xen/acpi: Fix Kconfig dependency on CPU_FREQ
>>> * xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
>>> * xen/acpi-processor: C and P-state driver that uploads said data to hyper
>>> * provide disable_cpufreq() function to disable the API.
>>
>> And (Linus just pulled it), you also need this one:
>> df88b2d96e36d9a9e325bfcd12eb45671cbbc937 (xen/enlighten: Disable MWAIT_LEAF
>> so that acpi-pad won't be loaded.)
>>
>>>
>>>>> The second system was an Intel one (4 core i7 920 [at] 2) which was
>>>>> successfully loading the driver. Via xenpm I can see the various
>>>>> frequencies and
>>>>> also see them being changed. However the cpuidle data out of xenpm looks a
>>>>> bit odd:
>>>>>
>>>>> #> xenpm get-cpuidle-states 0
>>>>> Max C-state: C7
>>>>>
>>>>> cpu id : 0
>>>>> total C-states : 2
>>>>> idle time(ms) : 10819311
>>>>> C0 : transition [00000000000000000001]
>>>>> residency [00000000000000005398 ms]
>>>>> C1 : transition [00000000000000000001]
>>>>> residency [00000000000010819311 ms]
>>>>> pc3 : [00000000000000000000 ms]
>>>>> pc6 : [00000000000000000000 ms]
>>>>> pc7 : [00000000000000000000 ms]
>>>>> cc3 : [00000000000000000000 ms]
>>>>> cc6 : [00000000000000000000 ms]
>>>>>
>>>>> Also gathering samples over 30s does look like only C0 and C1 are used. This
>>>>
>>>> Yes.
>>>>> might be because C1E support is enabled in BIOS but when looking at the
>>>>> intel_idle data in sysfs when running without a hypervisor will show C3 and C6
>>>>> for the cores. That could have been just a wrong output, so I plugged in a
>>>>> power
>>>>> meter and compared a kernel running natively and running as dom0 (with and
>>>>> without the acpi-processor driver).
>>>>>
>>>>> Native: 175W
>>>>> dom0: 183W (with only marginal difference between with or without the
>>>>> processor driver)
>>>>> [.yes, the system has a somewhat high base consumption which I attribute to a
>>>>> ridiculously dimensioned graphics subsystem to be running a text console]
>>>>>
>>>>> This I would take as C3 and C6 really not being used and the frequency scaling
>>
>> So the other thing I forgot to note is that C3->C6 have a detrimental
>> effect on some Intel boxes with Xen. We haven't figured out exactly which ones
>> and the bug is definitly in the hypervisor. The bug is that when the CPU goes in
>> those states the NIC ends up being unresponsive. Its like the interrupts stopped
>> being ACKed. If I run 'xenpm set-max-cstate 2' the issue disappears.
>>
>>>>
>>>> To go in deeper modes there is also a need to backport a Xen unstable
>>>> hypercall which will allow the kernel to detect the other states besides
>>>> C0-C2.
>>>>
>>>> "XEN_SET_PDC query was implemented in c/s 23783:
>>>> "ACPI: add _PDC input override mechanism".
>>>>
>>>
>>> I see. There is a kernel patch about enabling MWAIT that refers to that...
>>
>> Were there any special things you ran when checking the output? Just plugging
>> and looking at the results?
>>>
>>>>
>>>>> having no impact on the idle system is not that much surprising. But if
>>>>> that was
>>>>> true it would also limit the usefulness of the turbo mode which I understand
>>>>> would also be limited by the c-state of the other cores.
>>>>
>>>> Hm, I should double-check that - but somehow I thought that Xen independetly
>>>> checks for TurboMode and if the P-states are in, then they are activated.
>>
>> I did a bit of checking around and it does seem that is the case. From what
>> I have gathered the TurboMode kicks in when the CPU is C0 mode (which should
>> be obvious), and when the other cores are in anything but C0 mode. And sure
>> enough that seems to be the case. But I can't get the concrete details whether
>> the "but C0 mode" means that TurboMode will work better if the C mode is legacy
>> C1, C2, C3 or the CPU C-states (so MWAIT enabled). Trying to find out from
>> Len Brown more details..
>>>>
>>> Turbo mode should be enabled. I had been only looking at a generic overview
>>> about it on Intel site which sounded like it would make more of a difference on
>>> how much one core could get overclocked related to how many cores are active
>>> (and I translated active or not into deeper c-states or not).
>>> Looking at the verbose output of turbostat it seems not to make that much
>>> difference whether 2-4 cores are running. A single core alone could get one more
>>> increment in clock stepping. That does not immediately sound a lot. And of
>>> course how much or long the higher clock is used depends on other factors as
>>> well and is not under OS control.
>>>
>>> In the end it is probably quite dynamic and hard to come up with hard facts to
>>> prove its value. Though if I can lower the idle power usage by reaching a bit
>>> further, that would greatly help to justify the effort and potential risk of
>>> backporting...
>>
>> I understand. I wish I could give you the exact percentage points by which
>> the power usage will drop. But I think the more substantial reason benefit of
>> these patches is performance gains. The ones that Ian Campbell ran and were
>> posted on Phorenix site paint that they are beneficial.
>>
>>>
>>>>>
>>>>> Do I misread the data I see? Or maybe its a known limitation? In case it is
>>>>> worth doing more research I'll gladly try things and gather more data.
>>>>
>>>> Just missing some patches.
>>>>
>>>> Oh, and this one:
>>>> xen/acpi: Fix Kconfig dependency on CPU_FREQ
>>>>
>>>> Hmm.. I think a patch disappeared somewhere.
>>
>> That was the one I referenced at the beginning of this email.
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel [at] lists
>> http://lists.xen.org/xen-devel
>>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel [at] lists
> http://lists.xen.org/xen-devel
Attachments: signature.asc (0.88 KB)


JBeulich at suse

May 2, 2012, 2:19 AM

Post #12 of 33 (1242 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

>>> On 02.05.12 at 03:11, Boris Ostrovsky <boris.ostrovsky [at] amd> wrote:
> On 05/01/2012 08:47 PM, Konrad Rzeszutek Wilk wrote:
>> But it is curious that it has been working for me on AMD and Intel machines.
>> Granted the only server boxes I've are Intel - don't have AMD server boxes at
> all.
>
> I am also surprised that aside from some power inefficiencies and "BIOS
> bug" warnings the system appeared reasonably OK. I'd think that with
> APIC IDs messed up it would not run.

That's not suprising at all to me, given that the entire LAPIC and
interrupt management happens in Xen and not in Dom0. Hence
P- and C-state management code is likely the only consumer of
this (always going to be bogus under Xen) data.

(In our kernels, I specifically removed all traces of these, as any
use of them can only have bad effects. In pv-ops this not being
an option, the only alternative appears to be to find and adjust
each one individually, now and forever.)

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


stefan.bader at canonical

May 2, 2012, 7:56 AM

Post #13 of 33 (1254 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On 02.05.2012 00:54, Konrad Rzeszutek Wilk wrote:

> What is the involvment of x86_cpu_to_apicid to acpi_processor_register_performance?
> Or is this more of a stab in the dark?
>
> Stefan, one way to debug this is to make the driver be a module and then
> configure the /sys/../acpi/debug_level and debug_layer to be 0xffffffff
> and try loading the module. It should print out tons of data (And the reason
> it returned -Exxx).

For completeness here the run without the modification Boris had suggested.
Attachments: dmesg.xen.2 (224 KB)
  signature.asc (0.88 KB)


stefan.bader at canonical

May 2, 2012, 8:01 AM

Post #14 of 33 (1242 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On 02.05.2012 00:35, Boris Ostrovsky wrote:
> On 05/01/2012 04:02 PM, Konrad Rzeszutek Wilk wrote:
>> On Thu, Apr 26, 2012 at 06:25:28PM +0200, Stefan Bader wrote:
>>> On 26.04.2012 17:50, Konrad Rzeszutek Wilk wrote:
>>>> On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
>>>>> Since there have been requests about that driver to get backported into 3.2, I
>>>>> was interested to find out what or how much would be gained by that.
>>>>>
>>>>> The first system I tried was an AMD based one (8 core Opteron 6128 [at] 2GH).
>>>>> Which
>>>>> was not very successful as the drivers bail out of the init function
>>>>> because the
>>>>> first call to acpi_processor_register_performance() returns -ENODEV. There is
>>>>> some frequency scaling when running without Xen, so I need to do some more
>>>>> debugging there.
>
> I believe this is caused by the somewhat under-enlightened xen_apic_read():
>
> static u32 xen_apic_read(u32 reg)
> {
> return 0;
> }
>
> This results in some data, most importantly boot_cpu_physical_apicid, not being
> set correctly and, in turn, causes x86_cpu_to_apicid to be broken.
>
> On larger AMD systems boot processor is typically APICID=0x20 (I don't have
> Intel system handy to see how it looks there).
>
> As a quick and dirty test you can try:
>
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index edc2448..1f78998 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -1781,6 +1781,7 @@ void __init register_lapic_address(unsigned long address)
> }
> if (boot_cpu_physical_apicid == -1U) {
> boot_cpu_physical_apicid = read_apic_id();
> + boot_cpu_physical_apicid = 32;
> apic_version[boot_cpu_physical_apicid] =
> GET_APIC_VERSION(apic_read(APIC_LVR));
> }
>
>
> (Set it to whatever APICID on core0 is, I suspect it won't be zero).
>

Indeed, when I hack the above id to be 0x10 (as my dmesg tells) most of the BIOS
bug messages are gone and the xen-acpi-processor driver successfully loads and
seems to be switching frequencies ok (just a quick tight loop which made one cpu
go from P4 to P0).

-Stefan
Attachments: dmesg.xen.3 (64.0 KB)
  signature.asc (0.88 KB)


konrad.wilk at oracle

May 2, 2012, 9:08 AM

Post #15 of 33 (1241 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On Wed, May 02, 2012 at 05:01:00PM +0200, Stefan Bader wrote:
> On 02.05.2012 00:35, Boris Ostrovsky wrote:
> > On 05/01/2012 04:02 PM, Konrad Rzeszutek Wilk wrote:
> >> On Thu, Apr 26, 2012 at 06:25:28PM +0200, Stefan Bader wrote:
> >>> On 26.04.2012 17:50, Konrad Rzeszutek Wilk wrote:
> >>>> On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
> >>>>> Since there have been requests about that driver to get backported into 3.2, I
> >>>>> was interested to find out what or how much would be gained by that.
> >>>>>
> >>>>> The first system I tried was an AMD based one (8 core Opteron 6128 [at] 2GH).
> >>>>> Which
> >>>>> was not very successful as the drivers bail out of the init function
> >>>>> because the
> >>>>> first call to acpi_processor_register_performance() returns -ENODEV. There is
> >>>>> some frequency scaling when running without Xen, so I need to do some more
> >>>>> debugging there.
> >
> > I believe this is caused by the somewhat under-enlightened xen_apic_read():
> >
> > static u32 xen_apic_read(u32 reg)
> > {
> > return 0;
> > }
> >
> > This results in some data, most importantly boot_cpu_physical_apicid, not being
> > set correctly and, in turn, causes x86_cpu_to_apicid to be broken.
> >
> > On larger AMD systems boot processor is typically APICID=0x20 (I don't have
> > Intel system handy to see how it looks there).
> >
> > As a quick and dirty test you can try:
> >
> > diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> > index edc2448..1f78998 100644
> > --- a/arch/x86/kernel/apic/apic.c
> > +++ b/arch/x86/kernel/apic/apic.c
> > @@ -1781,6 +1781,7 @@ void __init register_lapic_address(unsigned long address)
> > }
> > if (boot_cpu_physical_apicid == -1U) {
> > boot_cpu_physical_apicid = read_apic_id();
> > + boot_cpu_physical_apicid = 32;
> > apic_version[boot_cpu_physical_apicid] =
> > GET_APIC_VERSION(apic_read(APIC_LVR));
> > }
> >
> >
> > (Set it to whatever APICID on core0 is, I suspect it won't be zero).
> >
>
> Indeed, when I hack the above id to be 0x10 (as my dmesg tells) most of the BIOS
> bug messages are gone and the xen-acpi-processor driver successfully loads and
> seems to be switching frequencies ok (just a quick tight loop which made one cpu
> go from P4 to P0).

OK. Can you try the attached patch pls? It should do the same thing
as what the debug patch does - get the _real_ APIC ID from the hypervisor.
(And as bonus it also removes the annoying:

BIOS bug: APIC version is 0 for CPU 0/0x0, fixing up to 0x10


diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index a8f8844..d816448 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -811,7 +811,29 @@ static void xen_io_delay(void)
#ifdef CONFIG_X86_LOCAL_APIC
static u32 xen_apic_read(u32 reg)
{
- return 0;
+ struct xen_platform_op op = {
+ .cmd = XENPF_get_cpuinfo,
+ .interface_version = XENPF_INTERFACE_VERSION,
+ .u.pcpu_info.xen_cpuid = 0,
+ };
+ int ret = 0;
+
+ /* Shouldn't need this as APIC is turned off for PV, and we only
+ * get called on the bootup processor. But just in case. */
+ if (!xen_initial_domain() || smp_processor_id())
+ return 0;
+
+ if (reg == APIC_LVR)
+ return 0x10;
+
+ if (reg != APIC_ID)
+ return 0;
+
+ ret = HYPERVISOR_dom0_op(&op);
+ if (ret)
+ return 0;
+
+ return op.u.pcpu_info.apic_id;
}

static void xen_apic_write(u32 reg, u32 val)

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


boris.ostrovsky at amd

May 2, 2012, 10:06 AM

Post #16 of 33 (1245 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On 05/02/2012 12:08 PM, Konrad Rzeszutek Wilk wrote:
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index a8f8844..d816448 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -811,7 +811,29 @@ static void xen_io_delay(void)
> #ifdef CONFIG_X86_LOCAL_APIC
> static u32 xen_apic_read(u32 reg)
> {
> - return 0;
> + struct xen_platform_op op = {
> + .cmd = XENPF_get_cpuinfo,
> + .interface_version = XENPF_INTERFACE_VERSION,
> + .u.pcpu_info.xen_cpuid = 0,


Is this always zero? This will probably solve the current problem but I
am wondering whether in the future we might hit another bug because this
routine will return the same APICID for all VCPUs.

-boris


> + };
> + int ret = 0;
> +
> + /* Shouldn't need this as APIC is turned off for PV, and we only
> + * get called on the bootup processor. But just in case. */
> + if (!xen_initial_domain() || smp_processor_id())
> + return 0;
> +
> + if (reg == APIC_LVR)
> + return 0x10;
> +
> + if (reg != APIC_ID)
> + return 0;
> +
> + ret = HYPERVISOR_dom0_op(&op);
> + if (ret)
> + return 0;
> +
> + return op.u.pcpu_info.apic_id;
> }
>
> static void xen_apic_write(u32 reg, u32 val)
>



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


konrad.wilk at oracle

May 2, 2012, 10:14 AM

Post #17 of 33 (1241 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On Wed, May 02, 2012 at 01:06:34PM -0400, Boris Ostrovsky wrote:
> On 05/02/2012 12:08 PM, Konrad Rzeszutek Wilk wrote:
> >diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> >index a8f8844..d816448 100644
> >--- a/arch/x86/xen/enlighten.c
> >+++ b/arch/x86/xen/enlighten.c
> >@@ -811,7 +811,29 @@ static void xen_io_delay(void)
> > #ifdef CONFIG_X86_LOCAL_APIC
> > static u32 xen_apic_read(u32 reg)
> > {
> >- return 0;
> >+ struct xen_platform_op op = {
> >+ .cmd = XENPF_get_cpuinfo,
> >+ .interface_version = XENPF_INTERFACE_VERSION,
> >+ .u.pcpu_info.xen_cpuid = 0,
>
>
> Is this always zero? This will probably solve the current problem

Its a CPU number (not tied in to APIC or ACPI IDs).

> but I am wondering whether in the future we might hit another bug
> because this routine will return the same APICID for all VCPUs.

Later on it does a check for 'smp_processor_id()' - and if
that is anything but zero it will bail out.

So this shoudl solve the problem for the bootup processor.
>
> -boris
>
>
> >+ };
> >+ int ret = 0;
> >+
> >+ /* Shouldn't need this as APIC is turned off for PV, and we only
> >+ * get called on the bootup processor. But just in case. */
> >+ if (!xen_initial_domain() || smp_processor_id())
> >+ return 0;
> >+
> >+ if (reg == APIC_LVR)
> >+ return 0x10;
> >+
> >+ if (reg != APIC_ID)
> >+ return 0;
> >+
> >+ ret = HYPERVISOR_dom0_op(&op);
> >+ if (ret)
> >+ return 0;
> >+
> >+ return op.u.pcpu_info.apic_id;
> > }
> >
> > static void xen_apic_write(u32 reg, u32 val)
> >
>

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


stefan.bader at canonical

May 2, 2012, 2:29 PM

Post #18 of 33 (1239 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On 02.05.2012 18:08, Konrad Rzeszutek Wilk wrote:
> On Wed, May 02, 2012 at 05:01:00PM +0200, Stefan Bader wrote:
>> On 02.05.2012 00:35, Boris Ostrovsky wrote:
>>> On 05/01/2012 04:02 PM, Konrad Rzeszutek Wilk wrote:
>>>> On Thu, Apr 26, 2012 at 06:25:28PM +0200, Stefan Bader wrote:
>>>>> On 26.04.2012 17:50, Konrad Rzeszutek Wilk wrote:
>>>>>> On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
>>>>>>> Since there have been requests about that driver to get backported into 3.2, I
>>>>>>> was interested to find out what or how much would be gained by that.
>>>>>>>
>>>>>>> The first system I tried was an AMD based one (8 core Opteron 6128 [at] 2GH).
>>>>>>> Which
>>>>>>> was not very successful as the drivers bail out of the init function
>>>>>>> because the
>>>>>>> first call to acpi_processor_register_performance() returns -ENODEV. There is
>>>>>>> some frequency scaling when running without Xen, so I need to do some more
>>>>>>> debugging there.
>>>
>>> I believe this is caused by the somewhat under-enlightened xen_apic_read():
>>>
>>> static u32 xen_apic_read(u32 reg)
>>> {
>>> return 0;
>>> }
>>>
>>> This results in some data, most importantly boot_cpu_physical_apicid, not being
>>> set correctly and, in turn, causes x86_cpu_to_apicid to be broken.
>>>
>>> On larger AMD systems boot processor is typically APICID=0x20 (I don't have
>>> Intel system handy to see how it looks there).
>>>
>>> As a quick and dirty test you can try:
>>>
>>> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
>>> index edc2448..1f78998 100644
>>> --- a/arch/x86/kernel/apic/apic.c
>>> +++ b/arch/x86/kernel/apic/apic.c
>>> @@ -1781,6 +1781,7 @@ void __init register_lapic_address(unsigned long address)
>>> }
>>> if (boot_cpu_physical_apicid == -1U) {
>>> boot_cpu_physical_apicid = read_apic_id();
>>> + boot_cpu_physical_apicid = 32;
>>> apic_version[boot_cpu_physical_apicid] =
>>> GET_APIC_VERSION(apic_read(APIC_LVR));
>>> }
>>>
>>>
>>> (Set it to whatever APICID on core0 is, I suspect it won't be zero).
>>>
>>
>> Indeed, when I hack the above id to be 0x10 (as my dmesg tells) most of the BIOS
>> bug messages are gone and the xen-acpi-processor driver successfully loads and
>> seems to be switching frequencies ok (just a quick tight loop which made one cpu
>> go from P4 to P0).
>
> OK. Can you try the attached patch pls? It should do the same thing
> as what the debug patch does - get the _real_ APIC ID from the hypervisor.
> (And as bonus it also removes the annoying:
>
> BIOS bug: APIC version is 0 for CPU 0/0x0, fixing up to 0x10

All the BIOS bug messages are gone from the Intel and the AMD box. Just the AMD
box again fails to load the xen-acpi-processor driver. Though it must be a
different exit. I had enabled acpi debugging for the modprobe call but this time
there is no trace of any acpi messages.

>
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index a8f8844..d816448 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -811,7 +811,29 @@ static void xen_io_delay(void)
> #ifdef CONFIG_X86_LOCAL_APIC
> static u32 xen_apic_read(u32 reg)
> {
> - return 0;
> + struct xen_platform_op op = {
> + .cmd = XENPF_get_cpuinfo,
> + .interface_version = XENPF_INTERFACE_VERSION,
> + .u.pcpu_info.xen_cpuid = 0,
> + };
> + int ret = 0;
> +
> + /* Shouldn't need this as APIC is turned off for PV, and we only
> + * get called on the bootup processor. But just in case. */
> + if (!xen_initial_domain() || smp_processor_id())
> + return 0;
> +
> + if (reg == APIC_LVR)
> + return 0x10;
> +
> + if (reg != APIC_ID)
> + return 0;
> +
> + ret = HYPERVISOR_dom0_op(&op);
> + if (ret)
> + return 0;
> +
> + return op.u.pcpu_info.apic_id;
> }
>
> static void xen_apic_write(u32 reg, u32 val)
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel [at] lists
> http://lists.xen.org/xen-devel
Attachments: dmesg.xen.4 (61.2 KB)
  signature.asc (0.88 KB)


boris.ostrovsky at amd

May 2, 2012, 2:31 PM

Post #19 of 33 (1243 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On 05/02/2012 01:14 PM, Konrad Rzeszutek Wilk wrote:
> On Wed, May 02, 2012 at 01:06:34PM -0400, Boris Ostrovsky wrote:
>> On 05/02/2012 12:08 PM, Konrad Rzeszutek Wilk wrote:
>>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>>> index a8f8844..d816448 100644
>>> --- a/arch/x86/xen/enlighten.c
>>> +++ b/arch/x86/xen/enlighten.c
>>> @@ -811,7 +811,29 @@ static void xen_io_delay(void)
>>> #ifdef CONFIG_X86_LOCAL_APIC
>>> static u32 xen_apic_read(u32 reg)
>>> {
>>> - return 0;
>>> + struct xen_platform_op op = {
>>> + .cmd = XENPF_get_cpuinfo,
>>> + .interface_version = XENPF_INTERFACE_VERSION,
>>> + .u.pcpu_info.xen_cpuid = 0,
>>
>>
>> Is this always zero? This will probably solve the current problem
>
> Its a CPU number (not tied in to APIC or ACPI IDs).

Why not use CPU number instead of zero here?

>
>> but I am wondering whether in the future we might hit another bug
>> because this routine will return the same APICID for all VCPUs.
>
> Later on it does a check for 'smp_processor_id()' - and if
> that is anything but zero it will bail out.

Can you point me to the check you are referring to?

-boris


>
> So this shoudl solve the problem for the bootup processor.
>>
>> -boris
>>
>>
>>> + };
>>> + int ret = 0;
>>> +
>>> + /* Shouldn't need this as APIC is turned off for PV, and we only
>>> + * get called on the bootup processor. But just in case. */
>>> + if (!xen_initial_domain() || smp_processor_id())
>>> + return 0;
>>> +
>>> + if (reg == APIC_LVR)
>>> + return 0x10;
>>> +
>>> + if (reg != APIC_ID)
>>> + return 0;
>>> +
>>> + ret = HYPERVISOR_dom0_op(&op);
>>> + if (ret)
>>> + return 0;
>>> +
>>> + return op.u.pcpu_info.apic_id;
>>> }
>>>
>>> static void xen_apic_write(u32 reg, u32 val)
>>>
>>
>



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


konrad.wilk at oracle

May 2, 2012, 2:41 PM

Post #20 of 33 (1245 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On Wed, May 02, 2012 at 02:31:07PM -0700, Boris Ostrovsky wrote:
> On 05/02/2012 01:14 PM, Konrad Rzeszutek Wilk wrote:
> >On Wed, May 02, 2012 at 01:06:34PM -0400, Boris Ostrovsky wrote:
> >>On 05/02/2012 12:08 PM, Konrad Rzeszutek Wilk wrote:
> >>>diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> >>>index a8f8844..d816448 100644
> >>>--- a/arch/x86/xen/enlighten.c
> >>>+++ b/arch/x86/xen/enlighten.c
> >>>@@ -811,7 +811,29 @@ static void xen_io_delay(void)
> >>> #ifdef CONFIG_X86_LOCAL_APIC
> >>> static u32 xen_apic_read(u32 reg)
> >>> {
> >>>- return 0;
> >>>+ struct xen_platform_op op = {
> >>>+ .cmd = XENPF_get_cpuinfo,
> >>>+ .interface_version = XENPF_INTERFACE_VERSION,
> >>>+ .u.pcpu_info.xen_cpuid = 0,
> >>
> >>
> >>Is this always zero? This will probably solve the current problem
> >
> >Its a CPU number (not tied in to APIC or ACPI IDs).
>
> Why not use CPU number instead of zero here?

The issue was only with the bootup CPU - so was using the Xen's
bootup CPU number - which is zero (as is Linux's).

>
> >
> >>but I am wondering whether in the future we might hit another bug
> >>because this routine will return the same APICID for all VCPUs.
> >
> > Later on it does a check for 'smp_processor_id()' - and if
> >that is anything but zero it will bail out.
>
> Can you point me to the check you are referring to?

if (!xen_initial_domain() || smp_processor_id())


>
> -boris
>
>
> >
> >So this shoudl solve the problem for the bootup processor.
> >>
> >>-boris
> >>
> >>
> >>>+ };
> >>>+ int ret = 0;
> >>>+
> >>>+ /* Shouldn't need this as APIC is turned off for PV, and we only
> >>>+ * get called on the bootup processor. But just in case. */
> >>>+ if (!xen_initial_domain() || smp_processor_id())
> >>>+ return 0;
> >>>+
> >>>+ if (reg == APIC_LVR)
> >>>+ return 0x10;
> >>>+
> >>>+ if (reg != APIC_ID)
> >>>+ return 0;
> >>>+
> >>>+ ret = HYPERVISOR_dom0_op(&op);
> >>>+ if (ret)
> >>>+ return 0;
> >>>+
> >>>+ return op.u.pcpu_info.apic_id;
> >>> }
> >>>
> >>> static void xen_apic_write(u32 reg, u32 val)
> >>>
> >>
> >
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel [at] lists
> http://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


boris.ostrovsky at amd

May 2, 2012, 3:09 PM

Post #21 of 33 (1242 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On 05/02/2012 05:41 PM, Konrad Rzeszutek Wilk wrote:
> On Wed, May 02, 2012 at 02:31:07PM -0700, Boris Ostrovsky wrote:
>> On 05/02/2012 01:14 PM, Konrad Rzeszutek Wilk wrote:
>>> On Wed, May 02, 2012 at 01:06:34PM -0400, Boris Ostrovsky wrote:
>>>> On 05/02/2012 12:08 PM, Konrad Rzeszutek Wilk wrote:
>>>>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>>>>> index a8f8844..d816448 100644
>>>>> --- a/arch/x86/xen/enlighten.c
>>>>> +++ b/arch/x86/xen/enlighten.c
>>>>> @@ -811,7 +811,29 @@ static void xen_io_delay(void)
>>>>> #ifdef CONFIG_X86_LOCAL_APIC
>>>>> static u32 xen_apic_read(u32 reg)
>>>>> {
>>>>> - return 0;
>>>>> + struct xen_platform_op op = {
>>>>> + .cmd = XENPF_get_cpuinfo,
>>>>> + .interface_version = XENPF_INTERFACE_VERSION,
>>>>> + .u.pcpu_info.xen_cpuid = 0,
>>>>
>>>>
>>>> Is this always zero? This will probably solve the current problem
>>>
>>> Its a CPU number (not tied in to APIC or ACPI IDs).
>>
>> Why not use CPU number instead of zero here?
>
> The issue was only with the bootup CPU - so was using the Xen's
> bootup CPU number - which is zero (as is Linux's).

I agree that for this particular problem this may be sufficient.

My concern is that in the future someone may decide to use
apic_read(APIC_ID) or read_apic_id() for some other purpose and they
won't get expected result (i.e. on all CPUs they will get the same answer).

>
>>
>>>
>>>> but I am wondering whether in the future we might hit another bug
>>>> because this routine will return the same APICID for all VCPUs.
>>>
>>> Later on it does a check for 'smp_processor_id()' - and if
>>> that is anything but zero it will bail out.
>>
>> Can you point me to the check you are referring to?
>
> if (!xen_initial_domain() || smp_processor_id())

I don't see this line --- neither in the mainline nor in your kernel.
Which kernel and which routine is this in?

BTW, this patch doesn't quite work, xen-acpi-processor driver fails to
load with the same error as before. I'll look at this tomorrow more
carefully.


-boris

>
>
>>
>> -boris
>>
>>
>>>
>>> So this shoudl solve the problem for the bootup processor.
>>>>
>>>> -boris
>>>>
>>>>
>>>>> + };
>>>>> + int ret = 0;
>>>>> +
>>>>> + /* Shouldn't need this as APIC is turned off for PV, and we only
>>>>> + * get called on the bootup processor. But just in case. */
>>>>> + if (!xen_initial_domain() || smp_processor_id())
>>>>> + return 0;
>>>>> +
>>>>> + if (reg == APIC_LVR)
>>>>> + return 0x10;
>>>>> +
>>>>> + if (reg != APIC_ID)
>>>>> + return 0;
>>>>> +
>>>>> + ret = HYPERVISOR_dom0_op(&op);
>>>>> + if (ret)
>>>>> + return 0;
>>>>> +
>>>>> + return op.u.pcpu_info.apic_id;
>>>>> }
>>>>>
>>>>> static void xen_apic_write(u32 reg, u32 val)
>>>>>
>>>>
>>>
>>
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel [at] lists
>> http://lists.xen.org/xen-devel
>



_______________________________________________
Xen-devel mailing list
Xen-devel [at] lists
http://lists.xen.org/xen-devel


stefan.bader at canonical

May 2, 2012, 11:55 PM

Post #22 of 33 (1242 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On 03.05.2012 00:09, Boris Ostrovsky wrote:
> On 05/02/2012 05:41 PM, Konrad Rzeszutek Wilk wrote:
>> On Wed, May 02, 2012 at 02:31:07PM -0700, Boris Ostrovsky wrote:
>>> On 05/02/2012 01:14 PM, Konrad Rzeszutek Wilk wrote:
>>>> On Wed, May 02, 2012 at 01:06:34PM -0400, Boris Ostrovsky wrote:
>>>>> On 05/02/2012 12:08 PM, Konrad Rzeszutek Wilk wrote:
>>>>>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>>>>>> index a8f8844..d816448 100644
>>>>>> --- a/arch/x86/xen/enlighten.c
>>>>>> +++ b/arch/x86/xen/enlighten.c
>>>>>> @@ -811,7 +811,29 @@ static void xen_io_delay(void)
>>>>>> #ifdef CONFIG_X86_LOCAL_APIC
>>>>>> static u32 xen_apic_read(u32 reg)
>>>>>> {
>>>>>> - return 0;
>>>>>> + struct xen_platform_op op = {
>>>>>> + .cmd = XENPF_get_cpuinfo,
>>>>>> + .interface_version = XENPF_INTERFACE_VERSION,
>>>>>> + .u.pcpu_info.xen_cpuid = 0,
>>>>>
>>>>>
>>>>> Is this always zero? This will probably solve the current problem
>>>>
>>>> Its a CPU number (not tied in to APIC or ACPI IDs).
>>>
>>> Why not use CPU number instead of zero here?
>>
>> The issue was only with the bootup CPU - so was using the Xen's
>> bootup CPU number - which is zero (as is Linux's).
>
> I agree that for this particular problem this may be sufficient.
>
> My concern is that in the future someone may decide to use apic_read(APIC_ID) or
> read_apic_id() for some other purpose and they won't get expected result (i.e.
> on all CPUs they will get the same answer).
>
>>
>>>
>>>>
>>>>> but I am wondering whether in the future we might hit another bug
>>>>> because this routine will return the same APICID for all VCPUs.
>>>>
>>>> Later on it does a check for 'smp_processor_id()' - and if
>>>> that is anything but zero it will bail out.
>>>
>>> Can you point me to the check you are referring to?
>>
>> if (!xen_initial_domain() || smp_processor_id())
>
> I don't see this line --- neither in the mainline nor in your kernel. Which
> kernel and which routine is this in?

This is in the inline patch Konrad asked me to check.
>
> BTW, this patch doesn't quite work, xen-acpi-processor driver fails to load with
> the same error as before. I'll look at this tomorrow more carefully.

It does fail but at least for me it seems slightly different. I do the modprobe
after turning on all acpi debugging levels and layers. And without any change
there are queries visible. With that patch the driver just fails but there are
no queries. I plan to have another go with messages sprinkled to all exit paths
today, too (was just a bit too late and two pints later yesterday).


-Stefan
>
>
> -boris
>
>>
>>
>>>
>>> -boris
>>>
>>>
>>>>
>>>> So this shoudl solve the problem for the bootup processor.
>>>>>
>>>>> -boris
>>>>>
>>>>>
>>>>>> + };
>>>>>> + int ret = 0;
>>>>>> +
>>>>>> + /* Shouldn't need this as APIC is turned off for PV, and we only
>>>>>> + * get called on the bootup processor. But just in case. */
>>>>>> + if (!xen_initial_domain() || smp_processor_id())
>>>>>> + return 0;
>>>>>> +
>>>>>> + if (reg == APIC_LVR)
>>>>>> + return 0x10;
>>>>>> +
>>>>>> + if (reg != APIC_ID)
>>>>>> + return 0;
>>>>>> +
>>>>>> + ret = HYPERVISOR_dom0_op(&op);
>>>>>> + if (ret)
>>>>>> + return 0;
>>>>>> +
>>>>>> + return op.u.pcpu_info.apic_id;
>>>>>> }
>>>>>>
>>>>>> static void xen_apic_write(u32 reg, u32 val)
>>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel [at] lists
>>> http://lists.xen.org/xen-devel
>>
>
>
Attachments: signature.asc (0.88 KB)


stefan.bader at canonical

May 3, 2012, 3:00 AM

Post #23 of 33 (1245 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On 03.05.2012 08:55, Stefan Bader wrote:
> On 03.05.2012 00:09, Boris Ostrovsky wrote:
>> On 05/02/2012 05:41 PM, Konrad Rzeszutek Wilk wrote:
>>> On Wed, May 02, 2012 at 02:31:07PM -0700, Boris Ostrovsky wrote:
>>>> On 05/02/2012 01:14 PM, Konrad Rzeszutek Wilk wrote:
>>>>> On Wed, May 02, 2012 at 01:06:34PM -0400, Boris Ostrovsky wrote:
>>>>>> On 05/02/2012 12:08 PM, Konrad Rzeszutek Wilk wrote:
>>>>>>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>>>>>>> index a8f8844..d816448 100644
>>>>>>> --- a/arch/x86/xen/enlighten.c
>>>>>>> +++ b/arch/x86/xen/enlighten.c
>>>>>>> @@ -811,7 +811,29 @@ static void xen_io_delay(void)
>>>>>>> #ifdef CONFIG_X86_LOCAL_APIC
>>>>>>> static u32 xen_apic_read(u32 reg)
>>>>>>> {
>>>>>>> - return 0;
>>>>>>> + struct xen_platform_op op = {
>>>>>>> + .cmd = XENPF_get_cpuinfo,
>>>>>>> + .interface_version = XENPF_INTERFACE_VERSION,
>>>>>>> + .u.pcpu_info.xen_cpuid = 0,
>>>>>>
>>>>>>
>>>>>> Is this always zero? This will probably solve the current problem
>>>>>
>>>>> Its a CPU number (not tied in to APIC or ACPI IDs).
>>>>
>>>> Why not use CPU number instead of zero here?
>>>
>>> The issue was only with the bootup CPU - so was using the Xen's
>>> bootup CPU number - which is zero (as is Linux's).
>>
>> I agree that for this particular problem this may be sufficient.
>>
>> My concern is that in the future someone may decide to use apic_read(APIC_ID) or
>> read_apic_id() for some other purpose and they won't get expected result (i.e.
>> on all CPUs they will get the same answer).
>>
>>>
>>>>
>>>>>
>>>>>> but I am wondering whether in the future we might hit another bug
>>>>>> because this routine will return the same APICID for all VCPUs.
>>>>>
>>>>> Later on it does a check for 'smp_processor_id()' - and if
>>>>> that is anything but zero it will bail out.
>>>>
>>>> Can you point me to the check you are referring to?
>>>
>>> if (!xen_initial_domain() || smp_processor_id())
>>
>> I don't see this line --- neither in the mainline nor in your kernel. Which
>> kernel and which routine is this in?
>
> This is in the inline patch Konrad asked me to check.
>>
>> BTW, this patch doesn't quite work, xen-acpi-processor driver fails to load with
>> the same error as before. I'll look at this tomorrow more carefully.
>
> It does fail but at least for me it seems slightly different. I do the modprobe
> after turning on all acpi debugging levels and layers. And without any change
> there are queries visible. With that patch the driver just fails but there are
> no queries. I plan to have another go with messages sprinkled to all exit paths
> today, too (was just a bit too late and two pints later yesterday).

Gah! Once you do it right... It *does* fail the exactly same way and there are
some acpi debug messages...

>
>
> -Stefan
>>
>>
>> -boris
>>
>>>
>>>
>>>>
>>>> -boris
>>>>
>>>>
>>>>>
>>>>> So this shoudl solve the problem for the bootup processor.
>>>>>>
>>>>>> -boris
>>>>>>
>>>>>>
>>>>>>> + };
>>>>>>> + int ret = 0;
>>>>>>> +
>>>>>>> + /* Shouldn't need this as APIC is turned off for PV, and we only
>>>>>>> + * get called on the bootup processor. But just in case. */
>>>>>>> + if (!xen_initial_domain() || smp_processor_id())
>>>>>>> + return 0;
>>>>>>> +
>>>>>>> + if (reg == APIC_LVR)
>>>>>>> + return 0x10;
>>>>>>> +
>>>>>>> + if (reg != APIC_ID)
>>>>>>> + return 0;
>>>>>>> +
>>>>>>> + ret = HYPERVISOR_dom0_op(&op);
>>>>>>> + if (ret)
>>>>>>> + return 0;
>>>>>>> +
>>>>>>> + return op.u.pcpu_info.apic_id;
>>>>>>> }
>>>>>>>
>>>>>>> static void xen_apic_write(u32 reg, u32 val)
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel [at] lists
>>>> http://lists.xen.org/xen-devel
>>>
>>
>>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel [at] lists
> http://lists.xen.org/xen-devel
Attachments: dmesg.xen.4 (223 KB)
  signature.asc (0.88 KB)


stefan.bader at canonical

May 3, 2012, 5:58 AM

Post #24 of 33 (1237 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On 03.05.2012 00:09, Boris Ostrovsky wrote:
> On 05/02/2012 05:41 PM, Konrad Rzeszutek Wilk wrote:
>> On Wed, May 02, 2012 at 02:31:07PM -0700, Boris Ostrovsky wrote:
>>> On 05/02/2012 01:14 PM, Konrad Rzeszutek Wilk wrote:
>>>> On Wed, May 02, 2012 at 01:06:34PM -0400, Boris Ostrovsky wrote:
>>>>> On 05/02/2012 12:08 PM, Konrad Rzeszutek Wilk wrote:
>>>>>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>>>>>> index a8f8844..d816448 100644
>>>>>> --- a/arch/x86/xen/enlighten.c
>>>>>> +++ b/arch/x86/xen/enlighten.c
>>>>>> @@ -811,7 +811,29 @@ static void xen_io_delay(void)
>>>>>> #ifdef CONFIG_X86_LOCAL_APIC
>>>>>> static u32 xen_apic_read(u32 reg)
>>>>>> {
>>>>>> - return 0;
>>>>>> + struct xen_platform_op op = {
>>>>>> + .cmd = XENPF_get_cpuinfo,
>>>>>> + .interface_version = XENPF_INTERFACE_VERSION,
>>>>>> + .u.pcpu_info.xen_cpuid = 0,
>>>>>
>>>>>
>>>>> Is this always zero? This will probably solve the current problem
>>>>
>>>> Its a CPU number (not tied in to APIC or ACPI IDs).
>>>
>>> Why not use CPU number instead of zero here?
>>
>> The issue was only with the bootup CPU - so was using the Xen's
>> bootup CPU number - which is zero (as is Linux's).
>
> I agree that for this particular problem this may be sufficient.
>
> My concern is that in the future someone may decide to use apic_read(APIC_ID) or
> read_apic_id() for some other purpose and they won't get expected result (i.e.
> on all CPUs they will get the same answer).
>
>>
>>>
>>>>
>>>>> but I am wondering whether in the future we might hit another bug
>>>>> because this routine will return the same APICID for all VCPUs.
>>>>
>>>> Later on it does a check for 'smp_processor_id()' - and if
>>>> that is anything but zero it will bail out.
>>>
>>> Can you point me to the check you are referring to?
>>
>> if (!xen_initial_domain() || smp_processor_id())
>
> I don't see this line --- neither in the mainline nor in your kernel. Which
> kernel and which routine is this in?
>
> BTW, this patch doesn't quite work, xen-acpi-processor driver fails to load with
> the same error as before. I'll look at this tomorrow more carefully.
>
>
> -boris
>
>>
>>
>>>
>>> -boris
>>>
>>>
>>>>
>>>> So this shoudl solve the problem for the bootup processor.
>>>>>
>>>>> -boris
>>>>>
>>>>>
>>>>>> + };
>>>>>> + int ret = 0;
>>>>>> +
>>>>>> + /* Shouldn't need this as APIC is turned off for PV, and we only
>>>>>> + * get called on the bootup processor. But just in case. */
>>>>>> + if (!xen_initial_domain() || smp_processor_id())
>>>>>> + return 0;
>>>>>> +
>>>>>> + if (reg == APIC_LVR)
>>>>>> + return 0x10;
>>>>>> +
>>>>>> + if (reg != APIC_ID)
>>>>>> + return 0;
>>>>>> +
>>>>>> + ret = HYPERVISOR_dom0_op(&op);
>>>>>> + if (ret)
>>>>>> + return 0;
>>>>>> +
>>>>>> + return op.u.pcpu_info.apic_id;
>>>>>> }
>>>>>>
>>>>>> static void xen_apic_write(u32 reg, u32 val)

I added debugging to all exit paths that could return 0 (which is what the
boot_cpu_physical_apicid is set to with that patch. Which would only leave the
case of the HV call returning the wrong value somehow...

>>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel [at] lists
>>> http://lists.xen.org/xen-devel
>>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel [at] lists
> http://lists.xen.org/xen-devel
Attachments: signature.asc (0.88 KB)


stefan.bader at canonical

May 3, 2012, 7:47 AM

Post #25 of 33 (1240 views)
Permalink
Re: Workings/effectiveness of the xen-acpi-processor driver [In reply to]

On 03.05.2012 14:58, Stefan Bader wrote:

>>>>> So this shoudl solve the problem for the bootup processor.
>>>>>>
>>>>>> -boris
>>>>>>
>>>>>>
>>>>>>> + };
>>>>>>> + int ret = 0;
>>>>>>> +
>>>>>>> + /* Shouldn't need this as APIC is turned off for PV, and we only
>>>>>>> + * get called on the bootup processor. But just in case. */
>>>>>>> + if (!xen_initial_domain() || smp_processor_id())
>>>>>>> + return 0;
>>>>>>> +
>>>>>>> + if (reg == APIC_LVR)
>>>>>>> + return 0x10;
>>>>>>> +
>>>>>>> + if (reg != APIC_ID)
>>>>>>> + return 0;
>>>>>>> +
>>>>>>> + ret = HYPERVISOR_dom0_op(&op);
>>>>>>> + if (ret)
>>>>>>> + return 0;
>>>>>>> +
>>>>>>> + return op.u.pcpu_info.apic_id;
>>>>>>> }
>>>>>>>
>>>>>>> static void xen_apic_write(u32 reg, u32 val)
>
> I added debugging to all exit paths that could return 0 (which is what the
> boot_cpu_physical_apicid is set to with that patch. Which would only leave the
> case of the HV call returning the wrong value somehow...
>
Hmmm, so xen_apic_read is still correct...

[ 0.000000] ACPI: Local APIC address 0xfee00000
[ 0.000000] xxx xen_apic_read(20)
[ 0.000000] xxx xen_apic_read -> 10
[ 0.000000] boot_cpu_physical_apicid = 0
[ 0.000000] xxx xen_apic_read(30)
[ 0.000000] +- apic version = 10

there seems to be a slightly strange tweak (at least for me) in read_apic_id...

static inline unsigned int read_apic_id(void)
{
unsigned int reg;

reg = apic_read(APIC_ID); // calls apic->read(reg)

return apic->get_apic_id(reg);
}
Attachments: signature.asc (0.88 KB)

First page Previous page 1 2 Next page Last page  View All Xen devel RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.