Mailing List Archive: Linux: Kernel

[PATCH v2 10/12] Maintain preemptability count even for !CONFIG_PREEMPT kernels


gleb at redhat

Nov 23, 2009, 6:06 AM

Post #1 of 12 (340 views)
[PATCH v2 10/12] Maintain preemptability count even for !CONFIG_PREEMPT kernels

Do not preempt the kernel. Just maintain the counter so that we know
whether a task can be rescheduled. An asynchronous page fault may be
delivered while a spinlock is held or while the current process can't be
preempted for other reasons. KVM uses preempt_count() to check whether
preemption is allowed and schedules another process if possible. This works
with preemptable kernels since they maintain accurate information about
preemptability in preempt_count. This patch makes non-preemptable kernels
maintain accurate information in preempt_count too.

Signed-off-by: Gleb Natapov <gleb [at] redhat>
---
include/linux/hardirq.h | 14 +++-----------
include/linux/preempt.h | 22 ++++++++++++++++------
include/linux/sched.h | 4 ----
kernel/sched.c | 6 ------
lib/kernel_lock.c | 1 +
5 files changed, 20 insertions(+), 27 deletions(-)

diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 6d527ee..484ba38 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -2,9 +2,7 @@
#define LINUX_HARDIRQ_H

#include <linux/preempt.h>
-#ifdef CONFIG_PREEMPT
#include <linux/smp_lock.h>
-#endif
#include <linux/lockdep.h>
#include <linux/ftrace_irq.h>
#include <asm/hardirq.h>
@@ -92,13 +90,8 @@
*/
#define in_nmi() (preempt_count() & NMI_MASK)

-#if defined(CONFIG_PREEMPT)
-# define PREEMPT_INATOMIC_BASE kernel_locked()
-# define PREEMPT_CHECK_OFFSET 1
-#else
-# define PREEMPT_INATOMIC_BASE 0
-# define PREEMPT_CHECK_OFFSET 0
-#endif
+#define PREEMPT_CHECK_OFFSET 1
+#define PREEMPT_INATOMIC_BASE kernel_locked()

/*
* Are we running in atomic context? WARNING: this macro cannot
@@ -116,12 +109,11 @@
#define in_atomic_preempt_off() \
((preempt_count() & ~PREEMPT_ACTIVE) != PREEMPT_CHECK_OFFSET)

+#define IRQ_EXIT_OFFSET (HARDIRQ_OFFSET-1)
#ifdef CONFIG_PREEMPT
# define preemptible() (preempt_count() == 0 && !irqs_disabled())
-# define IRQ_EXIT_OFFSET (HARDIRQ_OFFSET-1)
#else
# define preemptible() 0
-# define IRQ_EXIT_OFFSET HARDIRQ_OFFSET
#endif

#if defined(CONFIG_SMP) || defined(CONFIG_GENERIC_HARDIRQS)
diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index 72b1a10..7d039ca 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -82,14 +82,24 @@ do { \

#else

-#define preempt_disable() do { } while (0)
-#define preempt_enable_no_resched() do { } while (0)
-#define preempt_enable() do { } while (0)
+#define preempt_disable() \
+do { \
+ inc_preempt_count(); \
+ barrier(); \
+} while (0)
+
+#define preempt_enable() \
+do { \
+ barrier(); \
+ dec_preempt_count(); \
+} while (0)
+
+#define preempt_enable_no_resched() preempt_enable()
#define preempt_check_resched() do { } while (0)

-#define preempt_disable_notrace() do { } while (0)
-#define preempt_enable_no_resched_notrace() do { } while (0)
-#define preempt_enable_notrace() do { } while (0)
+#define preempt_disable_notrace() preempt_disable()
+#define preempt_enable_no_resched_notrace() preempt_enable()
+#define preempt_enable_notrace() preempt_enable()

#endif

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 75e6e60..1895486 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2379,11 +2379,7 @@ extern int _cond_resched(void);

extern int __cond_resched_lock(spinlock_t *lock);

-#ifdef CONFIG_PREEMPT
#define PREEMPT_LOCK_OFFSET PREEMPT_OFFSET
-#else
-#define PREEMPT_LOCK_OFFSET 0
-#endif

#define cond_resched_lock(lock) ({ \
__might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \
diff --git a/kernel/sched.c b/kernel/sched.c
index 3c11ae0..92ce282 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2590,10 +2590,8 @@ void sched_fork(struct task_struct *p, int clone_flags)
#if defined(CONFIG_SMP) && defined(__ARCH_WANT_UNLOCKED_CTXSW)
p->oncpu = 0;
#endif
-#ifdef CONFIG_PREEMPT
/* Want to start with kernel preemption disabled. */
task_thread_info(p)->preempt_count = 1;
-#endif
plist_node_init(&p->pushable_tasks, MAX_PRIO);

put_cpu();
@@ -6973,11 +6971,7 @@ void __cpuinit init_idle(struct task_struct *idle, int cpu)
spin_unlock_irqrestore(&rq->lock, flags);

/* Set the preempt count _outside_ the spinlocks! */
-#if defined(CONFIG_PREEMPT)
task_thread_info(idle)->preempt_count = (idle->lock_depth >= 0);
-#else
- task_thread_info(idle)->preempt_count = 0;
-#endif
/*
* The idle tasks have their own, simple scheduling class:
*/
diff --git a/lib/kernel_lock.c b/lib/kernel_lock.c
index 39f1029..6e2659d 100644
--- a/lib/kernel_lock.c
+++ b/lib/kernel_lock.c
@@ -93,6 +93,7 @@ static inline void __lock_kernel(void)
*/
static inline void __lock_kernel(void)
{
+ preempt_disable();
_raw_spin_lock(&kernel_flag);
}
#endif
--
1.6.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
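
The counting scheme described in the patch can be modeled in user space.
The following is only a sketch, not kernel code: all model_* names are
hypothetical, the counter is a plain global rather than a per-task field,
and the barrier uses GCC/Clang inline-asm syntax as an assumption about the
compiler. It shows why the count must be maintained even when nothing is
ever preempted: a fault handler can only tell that sleeping is unsafe if
the critical section has bumped the counter.

```c
/* User-space model of the counting scheme; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task preempt_count field. */
static int model_preempt_count;

/* Compiler barrier, similar in spirit to the kernel's barrier().
 * Assumes a GCC- or Clang-compatible compiler. */
#define model_barrier() __asm__ __volatile__("" ::: "memory")

/* Enter a non-preemptible section: bump the count, then fence. */
static inline void model_preempt_disable(void)
{
	model_preempt_count++;
	model_barrier();
}

/* Leave a non-preemptible section: fence, then drop the count.
 * Unlike the PREEMPT=y variant, no reschedule is attempted here. */
static inline void model_preempt_enable(void)
{
	model_barrier();
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* Hypothetical check a fault handler could make before sleeping. */
static inline bool model_can_reschedule(void)
{
	return model_preempt_count == 0;
}

/* Simulates a fault arriving inside a critical section (for example
 * while a lock is held) and reports whether sleeping would be allowed. */
static inline bool model_fault_inside_critical_section(void)
{
	bool allowed;

	model_preempt_disable();
	allowed = model_can_reschedule();
	model_preempt_enable();
	return allowed;
}
```

With the counter maintained, model_fault_inside_critical_section() returns
false. If model_preempt_disable() were an empty macro, as it is on
!CONFIG_PREEMPT kernels before this patch, the same check would wrongly
report that rescheduling is safe.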


peterz at infradead

Nov 23, 2009, 7:34 AM

Post #2 of 12 (322 views)
Re: [PATCH v2 10/12] Maintain preemptability count even for !CONFIG_PREEMPT kernels [In reply to]

On Mon, 2009-11-23 at 16:06 +0200, Gleb Natapov wrote:
> Do not preempt kernel. Just maintain counter to know if task can be rescheduled.
> Asynchronous page fault may be delivered while spinlock is held or current
> process can't be preempted for other reasons. KVM uses preempt_count() to check if preemptions is allowed and schedule other process if possible. This works
> with preemptable kernels since they maintain accurate information about
> preemptability in preempt_count. This patch make non-preemptable kernel
> maintain accurate information in preempt_count too.

I'm thinking you're going to have to convince some people this won't
slow them down for no good.

Personally I always have PREEMPT=y, but other people seem to feel
strongly about not doing so.



gleb at redhat

Nov 23, 2009, 7:58 AM

Post #3 of 12 (323 views)
Re: [PATCH v2 10/12] Maintain preemptability count even for !CONFIG_PREEMPT kernels [In reply to]

On Mon, Nov 23, 2009 at 04:34:15PM +0100, Peter Zijlstra wrote:
> On Mon, 2009-11-23 at 16:06 +0200, Gleb Natapov wrote:
> > Do not preempt kernel. Just maintain counter to know if task can be rescheduled.
> > Asynchronous page fault may be delivered while spinlock is held or current
> > process can't be preempted for other reasons. KVM uses preempt_count() to check if preemptions is allowed and schedule other process if possible. This works
> > with preemptable kernels since they maintain accurate information about
> > preemptability in preempt_count. This patch make non-preemptable kernel
> > maintain accurate information in preempt_count too.
>
> I'm thinking you're going to have to convince some people this won't
> slow them down for no good.
>
I saw old discussions about this in the mailing list archives. Usually
someone wanted to use in_atomic() in driver code and this, of course,
met resistance. In this case, I think, the use is legitimate.

> Personally I always have PREEMPT=y, but other people seem to feel
> strongly about not doing so.
>
It is possible to add one more config option that enables a reliable
preempt_count() without enabling preemption, or to make async pf
depend on PREEMPT=y. I don't like either of these options, especially the
first one. There are more than enough options already.

--
Gleb.


cl at linux-foundation

Nov 23, 2009, 9:30 AM

Post #4 of 12 (319 views)
Re: [PATCH v2 10/12] Maintain preemptability count even for !CONFIG_PREEMPT kernels [In reply to]

This adds significant overhead for the !PREEMPT case, adding lots of code
in critical paths all over the place.




gleb at redhat

Nov 23, 2009, 11:12 PM

Post #5 of 12 (323 views)
Re: [PATCH v2 10/12] Maintain preemptability count even for !CONFIG_PREEMPT kernels [In reply to]

On Mon, Nov 23, 2009 at 11:30:02AM -0600, Christoph Lameter wrote:
> This adds significant overhead for the !PREEMPT case adding lots of code
> in critical paths all over the place.
>
>
I want to measure it. Can you suggest benchmarks to try?

--
Gleb.


cl at linux-foundation

Nov 24, 2009, 7:14 AM

Post #6 of 12 (322 views)
Re: [PATCH v2 10/12] Maintain preemptability count even for !CONFIG_PREEMPT kernels [In reply to]

On Tue, 24 Nov 2009, Gleb Natapov wrote:

> On Mon, Nov 23, 2009 at 11:30:02AM -0600, Christoph Lameter wrote:
> > This adds significant overhead for the !PREEMPT case adding lots of code
> > in critical paths all over the place.
> I want to measure it. Can you suggest benchmarks to try?

AIM9 (reaim9)?

Any test suite that tests OS performance will do.

Latency will also be negatively impacted. There are already significant
regressions in recent kernel releases, so many of us who are sensitive
to these issues just stick with old kernels (2.6.22, for example) and hope
that the upstream issues are worked out at some point.

There is also the lldiag package in my directory. See

http://www.kernel.org/pub/linux/kernel/people/christoph/lldiag

Try the latency test and the mcast test. Localhost multicast is typically
a good test for kernel performance.

There is also the page fault test that Kamezawa-san posted recently in the
thread where we tried to deal with the long term mmap_sem issues.


gleb at redhat

Nov 30, 2009, 2:56 AM

Post #7 of 12 (305 views)
Re: [PATCH v2 10/12] Maintain preemptability count even for !CONFIG_PREEMPT kernels [In reply to]

On Tue, Nov 24, 2009 at 09:14:03AM -0600, Christoph Lameter wrote:
> On Tue, 24 Nov 2009, Gleb Natapov wrote:
>
> > On Mon, Nov 23, 2009 at 11:30:02AM -0600, Christoph Lameter wrote:
> > > This adds significant overhead for the !PREEMPT case adding lots of code
> > > in critical paths all over the place.
> > I want to measure it. Can you suggest benchmarks to try?
>
> AIM9 (reaim9)?
Below are results for kernel 2.6.32-rc8 with and without the patch (only
this single patch is applied).

test name with patch (stddev) without patch (stddev) change
===========================================================================
jmp_test 57853.762 ( 1086.51) 55664.287 ( 5152.14) 3.93%
stream_pipe 10286.967 ( 132.01) 11396.327 ( 306.01) -9.73%
new_raph 12573.395 ( 2.64) 12535.764 ( 85.14) 0.30%
sync_disk_rw 0.100 ( 0.00) 0.100 ( 0.00) -0.44%
udp_test 4008.058 ( 37.57) 3774.514 ( 22.03) 6.19%
add_long 68.542 ( 0.00) 68.530 ( 0.01) 0.02%
exec_test 181.615 ( 0.46) 184.503 ( 0.42) -1.57%
div_double 114.209 ( 0.02) 114.230 ( 0.03) -0.02%
mem_rtns_1 283.733 ( 3.27) 285.936 ( 2.24) -0.77%
sync_disk_cp 0.043 ( 0.00) 0.043 ( 0.00) 0.03%
fun_cal2 780.701 ( 0.16) 780.867 ( 0.07) -0.02%
matrix_rtns 70160.568 ( 28.58) 70181.900 ( 16.46) -0.03%
fun_cal1 780.701 ( 0.16) 780.763 ( 0.13) -0.01%
div_int 219.216 ( 0.03) 219.264 ( 0.04) -0.02%
pipe_cpy 16239.120 ( 468.99) 16727.067 ( 280.27) -2.92%
fifo_test 12864.276 ( 242.82) 13383.616 ( 199.31) -3.88%
sync_disk_wrt 0.043 ( 0.00) 0.043 ( 0.00) -0.11%
mul_long 4276.703 ( 0.79) 4277.528 ( 0.65) -0.02%
num_rtns_1 4308.165 ( 5.99) 4306.133 ( 5.84) 0.05%
disk_src 1507.993 ( 8.04) 1586.100 ( 5.44) -4.92%
mul_short 3422.840 ( 0.31) 3423.280 ( 0.24) -0.01%
series_1 121706.708 ( 266.62) 121356.355 ( 982.04) 0.29%
mul_int 4277.353 ( 0.45) 4277.953 ( 0.34) -0.01%
mul_float 99.947 ( 0.02) 99.947 ( 0.02) -0.00%
link_test 2319.090 ( 12.51) 2466.564 ( 1.52) -5.98%
fun_cal15 380.836 ( 0.06) 380.876 ( 0.10) -0.01%
trig_rtns 163.416 ( 0.13) 163.185 ( 0.51) 0.14%
fun_cal 915.226 ( 4.56) 902.033 ( 1.44) 1.46%
misc_rtns_1 4285.322 ( 18.72) 4282.907 ( 27.07) 0.06%
brk_test 221.167 ( 8.98) 230.345 ( 7.98) -3.98%
add_float 133.242 ( 0.02) 133.249 ( 0.02) -0.01%
page_test 284.488 ( 3.71) 284.180 ( 13.91) 0.11%
div_long 85.364 ( 0.27) 85.222 ( 0.02) 0.17%
dir_rtns_1 207.953 ( 2.56) 212.532 ( 0.59) -2.15%
disk_cp 66.449 ( 0.43) 65.754 ( 0.61) 1.06%
sieve 23.538 ( 0.01) 23.599 ( 0.11) -0.26%
tcp_test 2085.428 ( 18.43) 2059.062 ( 5.52) 1.28%
disk_wrt 81.839 ( 0.16) 82.652 ( 0.41) -0.98%
mul_double 79.951 ( 0.01) 79.961 ( 0.02) -0.01%
fork_test 57.408 ( 0.43) 57.835 ( 0.27) -0.74%
add_short 171.326 ( 0.03) 171.314 ( 0.01) 0.01%
creat-clo 395.995 ( 3.63) 403.918 ( 2.74) -1.96%
sort_rtns_1 276.833 ( 31.80) 290.855 ( 0.46) -4.82%
add_int 79.961 ( 0.02) 79.967 ( 0.00) -0.01%
disk_rr 67.635 ( 0.23) 68.282 ( 0.59) -0.95%
div_short 210.318 ( 0.04) 210.365 ( 0.05) -0.02%
disk_rw 57.041 ( 0.26) 57.470 ( 0.31) -0.75%
dgram_pipe 10088.191 ( 86.81) 9848.119 ( 406.33) 2.44%
shell_rtns_3 681.882 ( 3.30) 693.734 ( 2.67) -1.71%
shell_rtns_2 681.721 ( 3.24) 693.307 ( 2.90) -1.67%
shell_rtns_1 681.116 ( 3.46) 692.302 ( 3.16) -1.62%
div_float 114.224 ( 0.02) 114.230 ( 0.00) -0.01%
ram_copy 217812.436 ( 615.62) 218160.548 ( 135.66) -0.16%
shared_memory 11022.611 ( 20.75) 10870.031 ( 61.44) 1.40%
signal_test 700.907 ( 1.42) 711.253 ( 0.49) -1.46%
add_double 88.836 ( 0.00) 88.837 ( 0.00) -0.00%
array_rtns 119.369 ( 0.06) 119.182 ( 0.36) 0.16%
string_rtns 97.107 ( 0.21) 97.160 ( 0.22) -0.05%
disk_rd 626.890 ( 18.25) 586.034 ( 5.58) 6.97%
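
The last column of the table appears to be the relative difference of the
"with" value against the "without" value; that reading is an inference
from the numbers, not something stated in the post. A minimal sketch of
that calculation, with percent_change() as a hypothetical helper name:

```c
#include <assert.h>

/* Relative change of the patched result against the unpatched baseline,
 * in percent. Negative means the patched kernel scored lower. */
static inline double percent_change(double with_patch, double without_patch)
{
	/* The baseline is the divisor, so it must not be zero. */
	assert(without_patch != 0.0);
	return (with_patch - without_patch) / without_patch * 100.0;
}
```

For example, percent_change(10286.967, 11396.327) gives roughly -9.73,
matching the stream_pipe row, and percent_change(57853.762, 55664.287)
gives roughly 3.93, matching jmp_test.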

--
Gleb.


gleb at redhat

Nov 30, 2009, 2:58 AM

Post #8 of 12 (304 views)
Re: [PATCH v2 10/12] Maintain preemptability count even for !CONFIG_PREEMPT kernels [In reply to]

On Mon, Nov 30, 2009 at 12:56:12PM +0200, Gleb Natapov wrote:
> On Tue, Nov 24, 2009 at 09:14:03AM -0600, Christoph Lameter wrote:
> > On Tue, 24 Nov 2009, Gleb Natapov wrote:
> >
> > > On Mon, Nov 23, 2009 at 11:30:02AM -0600, Christoph Lameter wrote:
> > > > This adds significant overhead for the !PREEMPT case adding lots of code
> > > > in critical paths all over the place.
> > > I want to measure it. Can you suggest benchmarks to try?
> >
> > AIM9 (reaim9)?
> Below are results for kernel 2.6.32-rc8 with and without the patch (only
> this single patch is applied).
>
Forgot to mention: the results are averages over 5 different runs.

--
Gleb.


peterz at infradead

Nov 30, 2009, 2:59 AM

Post #9 of 12 (301 views)
Re: [PATCH v2 10/12] Maintain preemptability count even for !CONFIG_PREEMPT kernels [In reply to]

On Mon, 2009-11-30 at 12:58 +0200, Gleb Natapov wrote:
> On Mon, Nov 30, 2009 at 12:56:12PM +0200, Gleb Natapov wrote:
> > On Tue, Nov 24, 2009 at 09:14:03AM -0600, Christoph Lameter wrote:
> > > On Tue, 24 Nov 2009, Gleb Natapov wrote:
> > >
> > > > On Mon, Nov 23, 2009 at 11:30:02AM -0600, Christoph Lameter wrote:
> > > > > This adds significant overhead for the !PREEMPT case adding lots of code
> > > > > in critical paths all over the place.
> > > > I want to measure it. Can you suggest benchmarks to try?
> > >
> > > AIM9 (reaim9)?
> > Below are results for kernel 2.6.32-rc8 with and without the patch (only
> > this single patch is applied).
> >
> Forgot to tell. The results are average between 5 different runs.

Would be good to also report the variance over those 5 runs; that allows us
to see if the difference is within the noise.



avi at redhat

Nov 30, 2009, 3:01 AM

Post #10 of 12 (305 views)
Re: [PATCH v2 10/12] Maintain preemptability count even for !CONFIG_PREEMPT kernels [In reply to]

On 11/30/2009 12:59 PM, Peter Zijlstra wrote:
>> Forgot to tell. The results are average between 5 different runs.
>>
> Would be good to also report the variance over those 5 runs, allows us
> to see if the difference is within the noise.
>

That's the stddev column.

--
error compiling committee.c: too many arguments to function



peterz at infradead

Nov 30, 2009, 3:05 AM

Post #11 of 12 (305 views)
Re: [PATCH v2 10/12] Maintain preemptability count even for !CONFIG_PREEMPT kernels [In reply to]

On Mon, 2009-11-30 at 11:59 +0100, Peter Zijlstra wrote:
> On Mon, 2009-11-30 at 12:58 +0200, Gleb Natapov wrote:
> > On Mon, Nov 30, 2009 at 12:56:12PM +0200, Gleb Natapov wrote:
> > > On Tue, Nov 24, 2009 at 09:14:03AM -0600, Christoph Lameter wrote:
> > > > On Tue, 24 Nov 2009, Gleb Natapov wrote:
> > > >
> > > > > On Mon, Nov 23, 2009 at 11:30:02AM -0600, Christoph Lameter wrote:
> > > > > > This adds significant overhead for the !PREEMPT case adding lots of code
> > > > > > in critical paths all over the place.
> > > > > I want to measure it. Can you suggest benchmarks to try?
> > > >
> > > > AIM9 (reaim9)?
> > > Below are results for kernel 2.6.32-rc8 with and without the patch (only
> > > this single patch is applied).
> > >
> > Forgot to tell. The results are average between 5 different runs.
>
> Would be good to also report the variance over those 5 runs, allows us
> to see if the difference is within the noise.

Got pointed to the fact that there is a stddev column right there.

Must be Monday or something ;-)



cl at linux-foundation

Nov 30, 2009, 8:23 AM

Post #12 of 12 (301 views)
Re: [PATCH v2 10/12] Maintain preemptability count even for !CONFIG_PREEMPT kernels [In reply to]

Ok so there is some variance in tests as usual due to cacheline placement.
But it seems that overall we are looking at a 1-2% regression.
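
One rough way to separate noise from real differences in the AIM9 table
posted earlier in the thread is to compare the gap between the two means
with the sum of the two stddev values. This is only a crude heuristic, not
a formal significance test, and differs_beyond_noise() is a hypothetical
helper name:

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when the two means are further apart than the sum of their
 * standard deviations. Crude screen only; a proper comparison would use a
 * statistical test that accounts for the number of runs. */
static inline bool differs_beyond_noise(double with_patch, double sd_with,
					double without_patch, double sd_without)
{
	double gap = with_patch - without_patch;

	/* Standard deviations are never negative. */
	assert(sd_with >= 0.0 && sd_without >= 0.0);
	if (gap < 0.0)
		gap = -gap;
	return gap > sd_with + sd_without;
}
```

By this screen, jmp_test (3.93%) and sort_rtns_1 (-4.82%) stay within the
noise, while stream_pipe (-9.73%) and link_test (-5.98%) do not.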

