[PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h

 

 


robert.richter at amd

Jun 19, 2012, 11:10 AM

Post #1 of 14 (217 views)
[PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h

This patch set adds northbridge (NB) counter support for AMD family 15h
CPUs.

NB counters are implemented and used the same way as on family 10h, so
an NB event can now be selected like any other performance counter
event. As on family 10h, the kernel selects only one NB PMC per node by
using the NB constraint handler.

The main part of this patch set reworks the current code so that bit
masks can be used for counters. Also, Intel's fixed counters have been
moved to Intel-only code. This is necessary because AMD NB counters
start at index 32, which leads to holes in the counter mask and causes
conflicts with the fixed counters.
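
To illustrate the point about holes in the counter mask, here is a
minimal user-space sketch; it is not code from the patches. It assumes
six core counters at indexes 0-5 (an assumption, not stated in this
mail) and four NB counters starting at index 32. The names
CORE_CNTR_MASK, NB_CNTR_MASK, next_counter() and count_counters() are
hypothetical, and __builtin_ctzll() is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: 6 core counters at bits 0-5 (assumed),
 * 4 NB counters at bits 32-35 (NB counters start at index 32). */
#define CORE_CNTR_MASK  0x3fULL
#define NB_CNTR_MASK    (0xfULL << 32)
#define CNTR_MASK       (CORE_CNTR_MASK | NB_CNTR_MASK)

/*
 * Return the index of the lowest counter set in 'mask' at or above
 * 'start', or -1 if there is none. A plain "for (i = 0; i < n; i++)"
 * loop would wrongly visit the unused indexes 6-31.
 */
static int next_counter(uint64_t mask, int start)
{
	assert(start >= 0);
	if (start >= 64)
		return -1;
	mask &= ~0ULL << start;
	return mask ? __builtin_ctzll(mask) : -1;
}

/* Count the usable counters by walking only the set bits. */
static int count_counters(uint64_t mask)
{
	int n = 0;

	for (int i = next_counter(mask, 0); i >= 0; i = next_counter(mask, i + 1))
		n++;
	return n;
}
```

With a mask like this, a counter count alone no longer describes which
indexes are valid, which is why iteration has to follow the set bits.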

Another major change is the unification of the AMD PMUs and, where
possible, a family-independent feature check based on CPUID.

Note also that NB perfctrs do not support all bits in the config
value; see patch #10.

-Robert



Robert Richter (10):
perf, amd: Rework northbridge event constraints handler
perf, x86: Rework counter reservation code
perf, x86: Use bitmasks for generic counters
perf, x86: Rename Intel specific macros
perf, x86: Move Intel specific code to intel_pmu_init()
perf, amd: Unify AMD's generic and family 15h pmus
perf, amd: Generalize northbridge constraints code for family 15h
perf, amd: Enable northbridge counters on family 15h
perf, x86: Improve debug output in check_hw_exists()
perf, amd: Check northbridge event config value

arch/x86/include/asm/cpufeature.h | 2 +
arch/x86/include/asm/kvm_host.h | 4 +-
arch/x86/include/asm/perf_event.h | 26 ++-
arch/x86/kernel/cpu/perf_event.c | 129 +++++------
arch/x86/kernel/cpu/perf_event.h | 7 +
arch/x86/kernel/cpu/perf_event_amd.c | 368 +++++++++++++++++------------
arch/x86/kernel/cpu/perf_event_intel.c | 65 +++++-
arch/x86/kernel/cpu/perf_event_intel_ds.c | 4 +-
arch/x86/kernel/cpu/perf_event_p4.c | 8 +-
arch/x86/kvm/pmu.c | 22 +-
arch/x86/oprofile/op_model_amd.c | 4 +-
11 files changed, 374 insertions(+), 265 deletions(-)

--
1.7.8.4


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


eranian at google

Jun 20, 2012, 1:36 AM

Post #2 of 14 (217 views)
Re: [PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h [In reply to]

On Tue, Jun 19, 2012 at 8:10 PM, Robert Richter <robert.richter [at] amd> wrote:
>
> This patch adds northbridge counter support for AMD family 15h cpus.
>
> The NB counter implementation and usage is in the same way as for
> family 10h. Thus a nb event can now be selected as any other
> performance counter event. As for family 10h the kernel selects only
> one NB PMC per node by using the nb constraint handler.
>
> Main part of this patch set is to rework current code in a way that
> bit masks for counters can be used. Also, Intel's fixed counters have
> been moved to Intel only code. This is since AMD nb counters start at
> index 32 which leads to holes in the counter mask and causes conflicts
> with fixed counters.
>

I don't quite understand the design choice here. Fam15h has a clean
design for the uncore PMU: it has its own distinct set of 4 counters,
unlike Fam10h, where you program core counters to access the NB
counters. So why not, as with Intel uncore, create a separate NB PMU
that advertises its own characteristics? That does not preclude reusing
the existing AMD-specific routines wherever possible. I think the
advantage is that, for instance, muxing or starting/stopping the core
PMU would not affect the uncore and vice versa. Wouldn't this also
alleviate the problems with assigning indexes to uncore PMU counters?

> Another major change is the unification of AMD pmus and, where
> possible, a family independent feature check based on cpuid.
>
> It should also be mentioned that nb perfctrs do not support all bits
> in the config value, see patch #10.
>
> -Robert
>
>
>
> Robert Richter (10):
>  perf, amd: Rework northbridge event constraints handler
>  perf, x86: Rework counter reservation code
>  perf, x86: Use bitmasks for generic counters
>  perf, x86: Rename Intel specific macros
>  perf, x86: Move Intel specific code to intel_pmu_init()
>  perf, amd: Unify AMD's generic and family 15h pmus
>  perf, amd: Generalize northbridge constraints code for family 15h
>  perf, amd: Enable northbridge counters on family 15h
>  perf, x86: Improve debug output in check_hw_exists()
>  perf, amd: Check northbridge event config value
>
>  arch/x86/include/asm/cpufeature.h         |    2 +
>  arch/x86/include/asm/kvm_host.h           |    4 +-
>  arch/x86/include/asm/perf_event.h         |   26 ++-
>  arch/x86/kernel/cpu/perf_event.c          |  129 +++++------
>  arch/x86/kernel/cpu/perf_event.h          |    7 +
>  arch/x86/kernel/cpu/perf_event_amd.c      |  368 +++++++++++++++++------------
>  arch/x86/kernel/cpu/perf_event_intel.c    |   65 +++++-
>  arch/x86/kernel/cpu/perf_event_intel_ds.c |    4 +-
>  arch/x86/kernel/cpu/perf_event_p4.c       |    8 +-
>  arch/x86/kvm/pmu.c                        |   22 +-
>  arch/x86/oprofile/op_model_amd.c          |    4 +-
>  11 files changed, 374 insertions(+), 265 deletions(-)
>
> --
> 1.7.8.4
>
>


peterz at infradead

Jun 20, 2012, 1:54 AM

Post #3 of 14 (215 views)
Re: [PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h [In reply to]

On Wed, 2012-06-20 at 10:36 +0200, Stephane Eranian wrote:
>
> I don't quite understand the design choice here. In Fam15h, there is a clean
> design for the uncore PMU. It has its own distinct set of 4 counters. Unlike
> Fam10h, where you program core counters to access the NB counters. So
> why not like with Intel uncore, create a separate NB PMU which would
> advertise its characteristics? That does not preclude re-using the existing
> AMD-specific routines wherever possible. I think the advantage is that
> muxing or starting/stopping of the core PMU would not affect uncore and
> vice-versa for instance. Wouldn't this also alleviate the problems with
> assigning indexes to uncore PMU counters?

Quite agreed, it also avoids making a trainwreck of the counter rotation
on overload.




robert.richter at amd

Jun 20, 2012, 2:29 AM

Post #4 of 14 (211 views)
Re: [PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h [In reply to]

Stephane,

On 20.06.12 10:23:48, Stephane Eranian wrote:
> I don't quite understand the design choice here. In Fam15h, there is
> a clean design for the uncore PMU. It has its own distinct set of 4
> counters. Unlike Fam10h, where you program core counters to access
> the NB counters. So why not like with Intel uncore, create a
> separate NB PMU which would advertise its characteristics? That does
> not preclude re-using the existing AMD-specific routines wherever
> possible. I think the advantage is that muxing or starting/stopping
> of the core PMU would not affect uncore and vice-versa for
> instance. Wouldn't this also alleviate the problems with assigning
> indexes to uncore PMU counters?

I was thinking of creating a separate pmu too. There were two
fundamental problems with it. First, the implementation would have
differed from the family 10h implementation, even though, apart from
the new counter msr range and the per-node counter msrs, there is not
much difference from the perfctrs of family 10h or family 15h. NB
events would then have been programmed differently on the two families.

Second, since nb perfctrs are implemented the same way as core
counters, the same code would have been used. Thus multiple (two) x86
pmus (struct x86_pmu) would reside in parallel in the kernel. The
current implementation is not designed for this. The main reason
against that approach, and for choosing this design, was that it would
require a complete rework of the x86 perf implementation, touching
more code than this implementation does.

The main problem with my approach was the introduction of counter
masks and the conflicts with Intel's fixed counters. I think my patches
address this in a clean way, which also led to a bit of code cleanup.
Another advantage I see is the unification of the AMD pmus and
straightforward AMD setup code that is not spread over multiple pmus.
The setup code uses cpuid and is family independent.

In general I could imagine switching the AMD NB implementation to
uncore-like counter support. But then I would prefer a homogeneous
implementation across all AMD families. This would be a separate patch
set, independent of the family 15h nb counter implementation.

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center



peterz at infradead

Jun 20, 2012, 2:38 AM

Post #5 of 14 (212 views)
Re: [PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h [In reply to]

On Wed, 2012-06-20 at 11:29 +0200, Robert Richter wrote:
> Second, since nb perfctr are implemented the same way as core
> counters, the same code would have been used. Thus multiple (two) x86
> pmus (struct x86_pmu) would reside in parallel in the kernel.

Well, no. I take it the uncore counters are nb wide, thus you need
special goo to make counter rotation work properly; x86_pmu is unsuited
for that.




peterz at infradead

Jun 20, 2012, 2:41 AM

Post #6 of 14 (223 views)
Re: [PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h [In reply to]

On Wed, 2012-06-20 at 11:29 +0200, Robert Richter wrote:
> But then I would prefer a homogeneous
> implementation over all AMD families. This would be a separate patch
> set that is independent from family 15h nb counter implementation.

I realize you would prefer that, but I would really like fam15h to do
the right thing and maybe see if it's possible to rework the fam10h code
to approach that, instead of the other way around.

Fam10h really is somewhat ugly; it would be a shame to carry that
ugliness into fam15h, which actually did the right thing here.


robert.richter at amd

Jun 20, 2012, 3:00 AM

Post #7 of 14 (211 views)
Re: [PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h [In reply to]

On 20.06.12 11:38:04, Peter Zijlstra wrote:
> On Wed, 2012-06-20 at 11:29 +0200, Robert Richter wrote:
> > Second, since nb perfctr are implemented the same way as core
> > counters, the same code would have been used. Thus multiple (two) x86
> > pmus (struct x86_pmu) would reside in parallel in the kernel.
>
> Well, no. I take it the uncore counters are nb wide, thus you need
> special goo to make counter rotation work properly, x86_pmu is unsuited
> for that.

The code for nb and core counters is identical. There would be the
same nmi handler, same code to set up the event, same code to
start/stop cpus. The only difference is the per-node msrs; even the msr
offset calculation is the same as for core counters on family 15h. It
would not make sense to duplicate all this code. And, as said, the
current design is not suited to using x86_pmus in parallel or to easily
reusing x86 functions. Separating nb counters would make as much sense
as implementing a separate pmu for fixed counters.

And wrt counter rotation, this only affects the code that assigns
counters. You don't need a separate pmu for this.

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center



peterz at infradead

Jun 20, 2012, 3:16 AM

Post #8 of 14 (217 views)
Re: [PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h [In reply to]

On Wed, 2012-06-20 at 12:00 +0200, Robert Richter wrote:
>
> And wrt counter rotation, this only affects code to assign counters.
> You don't need a separate pmu for this.

It makes it easier though; on unplug you would otherwise need to filter
which events are uncore and migrate those to another online cpu of the
same nb, instead of just migrating all events.

Also, which cpu on the NB receives the PMI?

Sure it can be done, just not pretty. Combine that with all the other
special casing like patches 3 and 10 and one really starts to wonder if
it's all worth it.




eranian at google

Jun 20, 2012, 3:46 AM

Post #9 of 14 (213 views)
Re: [PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h [In reply to]

On Wed, Jun 20, 2012 at 12:00 PM, Robert Richter <robert.richter [at] amd> wrote:
> On 20.06.12 11:38:04, Peter Zijlstra wrote:
>> On Wed, 2012-06-20 at 11:29 +0200, Robert Richter wrote:
>> > Second, since nb perfctr are implemented the same way as core
>> > counters, the same code would have been used. Thus multiple (two) x86
>> > pmus (struct x86_pmu) would reside in parallel in the kernel.
>>
>> Well, no. I take it the uncore counters are nb wide, thus you need
>> special goo to make counter rotation work properly, x86_pmu is unsuited
>> for that.
>
> The code for nb and core counters is identical. There would be the
> same nmi handler, same code to setup the event, same code to
> start/stop cpus. The only difference are per-node msrs, even the msr
> offset calculation is the same as for core counters on family 15h. It
> would not make sense to duplicate all this code. And, as said, current
> design does not fit to use x86_pmus in parallel or to easily reuse x86
> functions. Separating nb counters would make the same sense as
> implementing a separate pmu for fixed counters.
>
Being identical does not necessarily mean you have to copy the code,
you can also simply call it.

I don't see the explanation for the non-contiguous counter indexes.
What's that about? With a separate PMU, would you have that problem?
I see the uncore CTL base is MSRC001_0240, the next is 0242, and so on.
But that's already the case with core counters on Fam15h.

As Peter said, having your own PMU would alleviate the need for
Patch 10. Those filters would simply not be visible to tools via
sysfs.

> And wrt counter rotation, this only affects code to assign counters.
> You don't need a separate pmu for this.
>
> -Robert
>
> --
> Advanced Micro Devices, Inc.
> Operating System Research Center
>


robert.richter at amd

Jun 20, 2012, 5:29 AM

Post #10 of 14 (214 views)
Re: [PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h [In reply to]

On 20.06.12 12:16:13, Peter Zijlstra wrote:
> Sure it can be done, just not pretty. Combine that with all the other
> special casing like patches 3 and 10 and one really starts to wonder if
> its all worth it.

I actually started writing the code by implementing a different pmu.
It turned out to be the wrong direction. The pmus would be almost
identical, with just some different config values and a bit of
nb-related special code. But you can't really reuse the functions for a
second running pmu; there are hard-wired functions in the x86 pmu code,
and the x86_pmu ops do not fit such a split. It would mean a complete
rework of the x86 perf code. Really, I tried that already. And all this
effort just to implement nb counters? If someone is willing to help
here this would be ok, but I guess I would have to do all this on my
own. And to be fair, this effort was also not made for fixed counters,
pebs, bts, etc. Maybe the uncore implementation is different here, but
today is the first day the uncore patches are in tip.

I also do not see the advantage of a separate pmu. Just to have a
different msr base to avoid the use of counter masks and some
optimized pmu ops? Masks are widely used in the kernel, and on x86
the bsf instruction takes no more time than an increment. And switches
in the code paths to special nb code are no more expensive than
switches for other special code.

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center



robert.richter at amd

Jun 20, 2012, 5:41 AM

Post #11 of 14 (221 views)
Re: [PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h [In reply to]

On 20.06.12 12:46:21, Stephane Eranian wrote:
> On Wed, Jun 20, 2012 at 12:00 PM, Robert Richter <robert.richter [at] amd> wrote:
> > On 20.06.12 11:38:04, Peter Zijlstra wrote:
> >> On Wed, 2012-06-20 at 11:29 +0200, Robert Richter wrote:
> >> > Second, since nb perfctr are implemented the same way as core
> >> > counters, the same code would have been used. Thus multiple (two) x86
> >> > pmus (struct x86_pmu) would reside in parallel in the kernel.
> >>
> >> Well, no. I take it the uncore counters are nb wide, thus you need
> >> special goo to make counter rotation work properly, x86_pmu is unsuited
> >> for that.
> >
> > The code for nb and core counters is identical. There would be the
> > same nmi handler, same code to setup the event, same code to
> > start/stop cpus. The only difference are per-node msrs, even the msr
> > offset calculation is the same as for core counters on family 15h. It
> > would not make sense to duplicate all this code. And, as said, current
> > design does not fit to use x86_pmus in parallel or to easily reuse x86
> > functions. Separating nb counters would make the same sense as
> > implementing a separate pmu for fixed counters.
> >
> Being identical does not necessarily mean you have to copy the code,
> you can also simply call it.

You can't use two x86_pmu in parallel in the kernel. Code is not
designed for this. The effort of changing the code to support this is
very high.

> I don't see the explanation for the non-contiguous counter indexes.
> What's that about? With a separate PMU, would you have that problem.
> I see uncore CTL base MSRC001_0240, next is 0242, and so on. But
> that's already the case with core counters on Fam15h.

The counters reside in msrs MSRC001_0200 to MSRC001_027f with two msrs
per counter. This is room for 64 counters. NB counters start at index
32 which is MSRC001_0240.
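
The arithmetic above can be written out as a small user-space sketch.
It only restates the numbers given in this mail (range MSRC001_0200 to
MSRC001_027f, two msrs per counter, NB counters from index 32). The
names MSR_F15H_PERF_BASE, f15h_ctl_msr() and f15h_ctr_msr() are
hypothetical, and placing the count msr directly after its control msr
is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_F15H_PERF_BASE  0xc0010200u  /* start of the range given above */
#define F15H_NUM_SLOTS      64           /* 0x200..0x27f, two msrs per counter */
#define F15H_NB_FIRST_IDX   32           /* NB counters start at index 32 */

/* Control (event select) msr for counter 'idx'. */
static uint32_t f15h_ctl_msr(int idx)
{
	assert(idx >= 0 && idx < F15H_NUM_SLOTS);
	return MSR_F15H_PERF_BASE + 2u * (uint32_t)idx;
}

/* Count msr for counter 'idx'; assumed to directly follow its control msr. */
static uint32_t f15h_ctr_msr(int idx)
{
	return f15h_ctl_msr(idx) + 1;
}
```

For index 32 the control msr works out to 0xc0010240, matching the NB
base mentioned above, and the last slot (index 63) ends at 0xc001027f.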

> As Peter said, having your own PMU would alleviate the need for
> Patch 10. Those filters would simply not be visible to tools via
> sysfs.

That's what I explained in an earlier thread about pmu descriptions in
sysfs. It is not possible to describe a complex pmu in sysfs. My
preference at that time was to use pmu ops in userland rather than a
single generic pmu that is configured via sysfs.

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center



peterz at infradead

Jun 20, 2012, 8:54 AM

Post #12 of 14 (204 views)
Re: [PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h [In reply to]

On Wed, 2012-06-20 at 14:29 +0200, Robert Richter wrote:
> On 20.06.12 12:16:13, Peter Zijlstra wrote:
> > Sure it can be done, just not pretty. Combine that with all the other
> > special casing like patches 3 and 10 and one really starts to wonder if
> > its all worth it.
>
> I actually started writing the code by implementing a different pmu.
> It turned out to be the wrong direction. The pmus would be almost
> identical, just some different config values and a bit nb related
> special code. But you can't really reuse the functions on a 2nd
> running pmu, there are hard wired functions in the x86 pmu code and
> x86_pmu ops do not fit for such a split. It would mean a complete
> rework of x86 perf code. Really, I tried that already. And all this
> effort just to implement nb counters? If someone is willing to help
> here this would be ok, but I guess I would have to do all this on my
> own. And to be fair, this effort was also not made for fixed counters,
> pebs, bts, etc. Maybe the uncore implementation is different here, but
> today is the first day the uncore patches are in tip.

Yeah, the Intel uncore implements an entire new pmu. The code is a
little over the top because Intel went there and decided it was a good
thing to have numerous uncore pmus instead of 1, some in PCI space some
in MSR space.

Still their programming is similar to the core ones -- just like for
AMD.

Yeah, there's a little bit of 'duplicated' code, but that's unavoidable.

> I also do not see the advantage of a separate pmu. Just to have a
> different msr base to avoid the use of counter masks and some
> optimized pmu ops? Masks are widely used in the kernel and on x86
> the bsf instruction takes not more than an increment. And switches in
> the code paths to special nb code are not more expensive than other
> switches for other special code.

Well, as it stands this thing is almost certainly doing things wrong. An
uncore pmu wants to put all events for the same NB on the same cpu, not
on whatever cpu they are registered on; otherwise event rotation doesn't
work right.

It also wants to migrate events to another cpu if the designated cpu
gets unplugged but there's still active cpus on the NB.
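
The two requirements above (one designated cpu per NB, and migration
when that cpu goes away) can be modelled in a few lines of user-space
C. This is only a sketch under assumed conditions: the topology of 8
cpus on 2 NBs is made up, and model_cpu_to_node(), nb_owner[],
uncore_event_cpu() and cpu_unplug() are hypothetical names, not kernel
interfaces.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS   8
#define NR_NODES  2

/* Hypothetical topology: cpus 0-3 share NB 0, cpus 4-7 share NB 1. */
static int model_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu / (NR_CPUS / NR_NODES);
}

static bool cpu_online[NR_CPUS] = { true, true, true, true, true, true, true, true };
static int  nb_owner[NR_NODES]  = { 0, 4 };  /* designated cpu per NB, -1 if none */

/* All uncore events for a cpu's NB are placed on that NB's owner cpu. */
static int uncore_event_cpu(int cpu)
{
	return nb_owner[model_cpu_to_node(cpu)];
}

/*
 * Take 'cpu' offline. If it was the owner of its NB, hand ownership
 * (and with it the NB's events) to another online cpu on the same NB.
 */
static void cpu_unplug(int cpu)
{
	int node = model_cpu_to_node(cpu);

	cpu_online[cpu] = false;
	if (nb_owner[node] != cpu)
		return;

	nb_owner[node] = -1;
	for (int i = 0; i < NR_CPUS; i++) {
		if (cpu_online[i] && model_cpu_to_node(i) == node) {
			nb_owner[node] = i;
			break;
		}
	}
}
```

Unplugging a cpu that is not the owner changes nothing for the NB's
events; only losing the owner triggers a hand-over.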

Furthermore, if the uncore does PMI, you want PMI steering, if it
doesn't do PMIs you want to poll the thing to avoid overflowing the
counter.

/me rummages on the interwebs to find the BKDG for Fam15h..

OK, it looks like it does do PMI and it broadcasts interrupts to the
entire NB. So that wants special magic too -- you might even want to
disallow sampling on the thing until someone has a good use-case for
that -- but you still need the PMI to deal with the counter overflow
stuff.





peterz at infradead

Jun 20, 2012, 9:08 AM

Post #13 of 14 (211 views)
Re: [PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h [In reply to]

On Wed, 2012-06-20 at 17:54 +0200, Peter Zijlstra wrote:
> Yeah, there's a little bit of 'duplicated' code, but that's
> unavoidable.

Also, you don't need to replicate the entire x86_pmu thing; most of that
is trying to share stuff between the various x86 core pmu things, like
p6, p4, intel, amd etc. If you only support the one amd uncore, things
become a lot easier.


eranian at google

Jun 20, 2012, 9:21 AM

Post #14 of 14 (209 views)
Re: [PATCH 00/10] perf, x86: Add northbridge counter support for AMD family 15h [In reply to]

On Wed, Jun 20, 2012 at 5:54 PM, Peter Zijlstra <peterz [at] infradead> wrote:
> On Wed, 2012-06-20 at 14:29 +0200, Robert Richter wrote:
>> On 20.06.12 12:16:13, Peter Zijlstra wrote:
>> > Sure it can be done, just not pretty. Combine that with all the other
>> > special casing like patches 3 and 10 and one really starts to wonder if
>> > its all worth it.
>>
>> I actually started writing the code by implementing a different pmu.
>> It turned out to be the wrong direction. The pmus would be almost
>> identical, just some different config values and a bit nb related
>> special code. But you can't really reuse the functions on a 2nd
>> running pmu, there are hard wired functions in the x86 pmu code and
>> x86_pmu ops do not fit for such a split. It would mean a complete
>> rework of x86 perf code. Really, I tried that already. And all this
>> effort just to implement nb counters? If someone is willing to help
>> here this would be ok, but I guess I would have to do all this on my
>> own. And to be fair, this effort was also not made for fixed counters,
>> pebs, bts, etc. Maybe the uncore implementation is different here, but
>> today is the first day the uncore patches are in tip.
>
> Yeah, the Intel uncore implements an entire new pmu. The code is a
> little over the top because Intel went there and decided it was a good
> thing to have numerous uncore pmus instead of 1, some in PCI space some
> in MSR space.
>
> Still their programming is similar to the core ones -- just like for
> AMD.
>
> Yeah, there's a little bit of 'duplicated' code, but that's unavoidable.
>
>> I also do not see the advantage of a separate pmu. Just to have a
>> different msr base to avoid the use of counter masks and some
>> optimized pmu ops? Masks are widely used in the kernel and on x86
>> the bsf instruction takes not more than an increment. And switches in
>> the code paths to special nb code are not more expensive than other
>> switches for other special code.
>
> Well, as it stands this thing is almost certainly doing things wrong. An
> uncore pmu wants to put all events for the same NB on the same cpu, not
> on whatever cpu they are registered, otherwise event rotation doesn't
> work right.
>
> It also wants to migrate events to another cpu if the designated cpu
> gets unplugged but there's still active cpus on the NB.
>
> Furthermore, if the uncore does PMI, you want PMI steering, if it
> doesn't do PMIs you want to poll the thing to avoid overflowing the
> counter.
>
> /me rummages on the interwebs to find the BKDG for Fam15h..
>
> OK, it looks like it does do PMI and it broadcast interrupts to the
> entire NB.. ok so that wants special magic too -- you might even want to
> disallow sampling on the thing until someone has a good use-case for
> that -- but you still need the PMI to deal with the counter overflow
> stuff.
>
I do have a good use-case for the broadcast interrupt, especially
if the uncore is capable of counting some form of cycles. That
interrupt can be used to provide a unique vantage point across
all the CPUs. We could relatively easily discover what each CPU
is doing at any one time, with very good synchronization.
Of course, a lot of plumbing would be needed to gather the IPs
from all the CPUs into a single sampling buffer, or maybe one
per CPU. If a CPU knows it is a possible target of an uncore PMI,
it should not discard the interrupt; it should process it.

The other thing I don't know about the AMD uncore is whether
or not it does deliver the PMI in case the core(s) are halted.
Robert, any info on this in particular?