Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Linux: Kernel

[PATCH] perf_events: fix read() bogus counts when in error state

 

 

Linux kernel RSS feed   Index | Next | Previous | View Threaded


eranian at google

Nov 26, 2009, 9:24 AM

Post #1 of 3 (124 views)
Permalink
[PATCH] perf_events: fix read() bogus counts when in error state

When a pinned group cannot be scheduled it goes into error state.
Normally a group cannot go out of error state without being explicitly
re-enabled or disabled. There was a bug in per-thread mode, whereby
upon termination of the thread, the group would transition from error
to off leading to bogus counts and timing information returned by
read().

It is important to realize that the current perf_events implementation
assigns higher priority to system-wide events over per-thread events
and that regardless of the fact that per-thread events may be pinned.
It is not clear to me whether this is per design of the API or just a
side effect of the implementation. I believe it is desirable that a
system-wide tool gets priority access to the PMU but then this causes
issues with per-thread events and especially when they request pinning.

A per-thread pinned event can be evicted until there is enough PMU
resource freed by system-wide events. Although, with this patch it is
now possible to detect this when counting, it remains unclear how this
situation could be detected when sampling, as it incurs potientially
large blind spots and thus bias degrading the quality of the data
collected.

The API is missing a clear definition of what it means to be pinned
for a per-thread event vs. system-wide event. Just like it does not
clearly state that system-wide event have higher priority than per
thread events.

Signed-off-by: Stephane Eranian <eranian [at] google>

---
perf_event.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 0b0d5f7..7a8bb5b 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -333,7 +333,16 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
event->group_leader->nr_siblings--;

update_event_times(event);
- event->state = PERF_EVENT_STATE_OFF;
+
+ /*
+ * If event was in error state, then keep it
+ * that way, otherwise bogus counts will be
+ * returned on read(). The only way to get out
+ * of error state is by explicit re-enabling
+ * of the event
+ */
+ if (event->state > PERF_EVENT_STATE_OFF)
+ event->state = PERF_EVENT_STATE_OFF;

/*
* If this was a group event with sibling events then
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


peterz at infradead

Nov 26, 2009, 9:36 AM

Post #2 of 3 (129 views)
Permalink
Re: [PATCH] perf_events: fix read() bogus counts when in error state [In reply to]

On Thu, 2009-11-26 at 09:24 -0800, Stephane Eranian wrote:
> When a pinned group cannot be scheduled it goes into error state.
> Normally a group cannot go out of error state without being explicitly
> re-enabled or disabled. There was a bug in per-thread mode, whereby
> upon termination of the thread, the group would transition from error
> to off leading to bogus counts and timing information returned by
> read().

> Signed-off-by: Stephane Eranian <eranian [at] google>

Right, good catch, totally forgot about error state :/

Acked-by: Peter Zijlstra <a.p.zijlstra [at] chello>

> ---
> perf_event.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index 0b0d5f7..7a8bb5b 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -333,7 +333,16 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
> event->group_leader->nr_siblings--;
>
> update_event_times(event);
> - event->state = PERF_EVENT_STATE_OFF;
> +
> + /*
> + * If event was in error state, then keep it
> + * that way, otherwise bogus counts will be
> + * returned on read(). The only way to get out
> + * of error state is by explicit re-enabling
> + * of the event
> + */
> + if (event->state > PERF_EVENT_STATE_OFF)
> + event->state = PERF_EVENT_STATE_OFF;
>
> /*
> * If this was a group event with sibling events then


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


eranian at google

Nov 26, 2009, 9:48 AM

Post #3 of 3 (129 views)
Permalink
Re: [PATCH] perf_events: fix read() bogus counts when in error state [In reply to]

On Thu, Nov 26, 2009 at 6:36 PM, Peter Zijlstra <peterz [at] infradead> wrote:
> On Thu, 2009-11-26 at 09:24 -0800, Stephane Eranian wrote:
>>       When a pinned group cannot be scheduled it goes into error state.
>>       Normally a group cannot go out of error state without being explicitly
>>       re-enabled or disabled. There was a bug in per-thread mode, whereby
>>       upon termination of the thread, the group would transition from error
>>       to off leading to bogus counts and timing information returned by
>>       read().
>
>>       Signed-off-by: Stephane Eranian <eranian [at] google>
>
> Right, good catch, totally forgot about error state :/
>
You need to clarify what pin actually means for per-thread
and I'd like to see an explanation for what happens when
sampling.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Linux kernel RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.