Mailing List Archive: Linux: Kernel

[PATCH] cgroup: memory.force_empty can make system slowdown

 

 

m-ikeda at ds

Aug 15, 2008, 7:26 PM

Post #1 of 3 (289 views)
[PATCH] cgroup: memory.force_empty can make system slowdown

Cgroup's memory controller has a control file, "memory.force_empty",
that resets the usage accounting charged to a cgroup. The accounting
shouldn't be reset while one or more processes are attached to the
cgroup (at least for the memory controller, IMHO), so
mem_cgroup_force_empty() is implemented to return -EBUSY and do
nothing in that case.
However, the cgroup at the hierarchy root appears to be an unintended
exception. Even when processes are attached to the root cgroup (which
is the "default" cgroup for processes), writing anything to
memory.force_empty starts forcing-empty, and it never ends.

The following patch prevents this issue.

This patch changes the cgroup infrastructure code. The issue could
also be addressed in the memory controller code, namely by changing
mem_cgroup_force_empty() to check the CSS_ROOT bit of css->flags.
I believe the cgroup->count approach in the patch below is more
generic and reasonable. How does that sound?
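
To make the two options concrete, here is a minimal userspace sketch,
not kernel code. The struct and function names (model_cgroup, model_css,
force_empty_count_only, force_empty_root_aware) are hypothetical
stand-ins, and the assumption is that the existing check in
mem_cgroup_force_empty() consults only cgroup->count. A return of 0
means "forcing-empty would proceed"; -EBUSY means the request is refused.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct cgroup. */
struct model_cgroup {
    int count;                    /* models the atomic cgroup->count */
};

/* Hypothetical, simplified stand-in for struct cgroup_subsys_state. */
struct model_css {
    struct model_cgroup *cgroup;
    bool is_root;                 /* models the CSS_ROOT bit of css->flags */
};

/*
 * Assumed existing behaviour: refuse only when cgroup->count > 0.
 * A root cgroup whose count is 0 slips through even with tasks attached.
 */
static int force_empty_count_only(const struct model_css *css)
{
    if (css->cgroup->count > 0)
        return -EBUSY;
    return 0;                     /* would go on to drop the charges */
}

/*
 * Alternative mentioned above: also refuse when the CSS_ROOT bit is set,
 * leaving cgroup->count untouched.
 */
static int force_empty_root_aware(const struct model_css *css)
{
    if (css->is_root || css->cgroup->count > 0)
        return -EBUSY;
    return 0;
}
```

In this model the patch below corresponds to initialising the root's
count to 1, which makes force_empty_count_only() refuse the root as well.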

Paul, Balbir?



Signed-off-by: Munehiro "Muuhh" Ikeda <m-ikeda [at] ds>

diff -uNrp linux-2.6.27-rc3.orig/kernel/cgroup.c linux-2.6.27-rc3/kernel/cgroup.c
--- linux-2.6.27-rc3.orig/kernel/cgroup.c 2008-08-12 21:55:39.000000000 -0400
+++ linux-2.6.27-rc3/kernel/cgroup.c 2008-08-15 20:52:52.000000000 -0400
@@ -2264,8 +2264,10 @@ static void init_cgroup_css(struct cgrou
 	css->cgroup = cgrp;
 	atomic_set(&css->refcnt, 0);
 	css->flags = 0;
-	if (cgrp == dummytop)
+	if (cgrp == dummytop) {
 		set_bit(CSS_ROOT, &css->flags);
+		atomic_set(&css->cgroup->count, 1);
+	}
 	BUG_ON(cgrp->subsys[ss->subsys_id]);
 	cgrp->subsys[ss->subsys_id] = css;
 }



--
IKEDA, Munehiro
NEC Corporation of America
m-ikeda [at] ds

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


lizf at cn

Aug 16, 2008, 6:15 PM

Post #2 of 3 (256 views)
Re: [PATCH] cgroup: memory.force_empty can make system slowdown [In reply to]

IKEDA, Munehiro wrote:
> Cgroup's memory controller has a control file "memory.force_empty"
> to reset usage account charged to a cgroup. The account shouldn't
> be reset if one or more processes are attached to the cgroup (at
> least for memory controller, IMHO). So mem_cgroup_force_empty()
> is implemented to return -EBUSY and do nothing if so.
> However, cgroup on hierarchy root faultily might be a exception.
> Even if processes are attached to root cgroup (which is a "default"
> cgroup for processes), forcing-empty can run by writing something to
> memory.force_empty and it'll never end.
>

I found this bug last week, and I've made patches to fix it, but then
I was on vacation. I'll send the patches out soon.

> Following patch prevents this issue.
>
> This patch is for cgroup infrastructure code. The issue can be
> measured by modifying memory controller code also, namely to change
> mem_cgroup_force_empty() to see CSS_ROOT bit of css->flags.
> I believe cgroup->count approach like the patch below is rather
> generic and reasonable, how does that sound?
>

It's ok for the top cgroup's count to be 0, due to the top_cgroup hack.
With this patch, the top cgroup's count will always be >0, even if it
has no tasks in it, so writing to the top cgroup's force_empty will
always return -EBUSY.

> Paul, Balbir?
>
>
>
> Signed-off-by: Munehiro "Muuhh" Ikeda <m-ikeda [at] ds>
>
> diff -uNrp linux-2.6.27-rc3.orig/kernel/cgroup.c linux-2.6.27-rc3/kernel/cgroup.c
> --- linux-2.6.27-rc3.orig/kernel/cgroup.c 2008-08-12 21:55:39.000000000 -0400
> +++ linux-2.6.27-rc3/kernel/cgroup.c 2008-08-15 20:52:52.000000000 -0400
> @@ -2264,8 +2264,10 @@ static void init_cgroup_css(struct cgrou
> css->cgroup = cgrp;
> atomic_set(&css->refcnt, 0);
> css->flags = 0;
> - if (cgrp == dummytop)
> + if (cgrp == dummytop) {
> set_bit(CSS_ROOT, &css->flags);
> + atomic_set(&css->cgroup->count, 1);
> + }
> BUG_ON(cgrp->subsys[ss->subsys_id]);
> cgrp->subsys[ss->subsys_id] = css;
> }
>
>
>


lizf at cn

Aug 16, 2008, 8:10 PM

Post #3 of 3 (266 views)
Re: [PATCH] cgroup: memory.force_empty can make system slowdown [In reply to]

Li Zefan wrote:
> IKEDA, Munehiro wrote:
>> Cgroup's memory controller has a control file "memory.force_empty"
>> to reset usage account charged to a cgroup. The account shouldn't
>> be reset if one or more processes are attached to the cgroup (at
>> least for memory controller, IMHO). So mem_cgroup_force_empty()
>> is implemented to return -EBUSY and do nothing if so.
>> However, cgroup on hierarchy root faultily might be a exception.
>> Even if processes are attached to root cgroup (which is a "default"
>> cgroup for processes), forcing-empty can run by writing something to
>> memory.force_empty and it'll never end.
>>
>
> I found this bug last week, and I've made patches to fix it, but then
> I was on vacation. I'll send the patches out soon.
>
>> Following patch prevents this issue.
>>
>> This patch is for cgroup infrastructure code. The issue can be
>> measured by modifying memory controller code also, namely to change
>> mem_cgroup_force_empty() to see CSS_ROOT bit of css->flags.
>> I believe cgroup->count approach like the patch below is rather
>> generic and reasonable, how does that sound?
>>
>
> It's ok for the top_group's count to be 0 due to the top_cgroup hack.
> With this patch, the top cgroup's count will be always >0, even if it
> has no tasks in it, so writing to top_cgroup's force_empty will always
> return -EBUSY.
>

I thought cgrp->css_sets would be empty when there are no tasks in the
top cgroup, but I was wrong: init_css_set's refcount is always >0,
so cgroup_task_count() won't return 0 for the top cgroup:

# mount -t cgroup -o debug xxx /mnt
# mkdir /mnt/sub
# for pid in `cat /mnt/tasks`; do echo $pid > /mnt/sub/tasks; done
# cat /mnt/tasks
# cat /mnt/debug.taskcount
3
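
The nonzero count above can be illustrated with a small userspace model.
This is only a sketch under an assumption not spelled out in this thread:
that cgroup_task_count() derives its result by summing the reference
counts of the css_sets linked to the cgroup. The names model_css_set and
model_task_count are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct css_set; only the refcount matters. */
struct model_css_set {
    int refcount;
};

/*
 * Assumed counting rule: add up the refcounts of every css_set linked
 * to the cgroup. A css_set that is still referenced contributes to the
 * total even when no task in the cgroup is using it.
 */
static int model_task_count(const struct model_css_set *const *sets, size_t n)
{
    int count = 0;

    for (size_t i = 0; i < n; i++)
        count += sets[i]->refcount;
    return count;
}
```

Under that assumption, a top cgroup whose only linked css_set is
init_css_set, holding three references, reports 3 even though its tasks
file is empty.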


