Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Linux: Kernel

[PATCH 3/3] vmevent: Implement special low-memory attribute

 

 

First page Previous page 1 2 Next page Last page  View All Linux kernel RSS feed   Index | Next | Previous | View Threaded


anton.vorontsov at linaro

May 1, 2012, 6:26 AM

Post #1 of 26 (295 views)
Permalink
[PATCH 3/3] vmevent: Implement special low-memory attribute

This is specially "blended" attribute, the event triggers when kernel
decides that we're close to the low memory threshold. Userspace should
not expect very precise meaning of low memory situation, mostly, it's
just a guess on the kernel's side.

Well, this is the same as userland should not know or care how exactly
kernel manages the memory, or assume that memory management behaviour
is a part of the "ABI". So, all the 'low memory' is just guessing, but
we're trying to do our best. It might be that we will end up with two
or three variations of 'low memory' thresholds, and all of them would
be useful for different use cases.

For this implementation, we assume that there's a low memory situation
for the N pages threshold when we have neither N pages of completely
free pages, nor we have N reclaimable pages in the cache. This
effectively means, that if userland expects to allocate N pages, it
would consume all the free pages, and any further allocations (above
N) would start draining caches.

In the worst case, prior to hitting the threshold, we might have only
N pages in cache, and nearly no memory as free pages.

The same 'low memory' meaning is used in the current Android Low
Memory Killer driver.

Signed-off-by: Anton Vorontsov <anton.vorontsov [at] linaro>
---
include/linux/vmevent.h | 7 ++++++
mm/vmevent.c | 40 ++++++++++++++++++++++++++++++++++
tools/testing/vmevent/vmevent-test.c | 12 +++++++++-
3 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/include/linux/vmevent.h b/include/linux/vmevent.h
index aae0d24..9bfa244 100644
--- a/include/linux/vmevent.h
+++ b/include/linux/vmevent.h
@@ -10,6 +10,13 @@ enum {
VMEVENT_ATTR_NR_AVAIL_PAGES = 1UL,
VMEVENT_ATTR_NR_FREE_PAGES = 2UL,
VMEVENT_ATTR_NR_SWAP_PAGES = 3UL,
+ /*
+ * This is specially blended attribute, the event triggers
+ * when kernel decides that we're close to the low memory threshold.
+ * Don't expect very precise meaning of low memory situation, mostly,
+ * it's just a guess on the kernel's side.
+ */
+ VMEVENT_ATTR_LOWMEM_PAGES = 4UL,

VMEVENT_ATTR_MAX /* non-ABI */
};
diff --git a/mm/vmevent.c b/mm/vmevent.c
index b312236..d278a25 100644
--- a/mm/vmevent.c
+++ b/mm/vmevent.c
@@ -68,10 +68,50 @@ static u64 vmevent_attr_avail_pages(struct vmevent_watch *watch,
return totalram_pages;
}

+/*
+ * Here's some implementation details for the "low memory" meaning.
+ *
+ * (The explanation is not in the header file as userland should not
+ * know these details, nor it should assume that the meaning will
+ * always be the same. As well as it should not know how exactly kernel
+ * manages the memory, or assume that memory management behaviour is a
+ * part of the "ABI". So, all the 'low memory' is just guessing, but
+ * we're trying to do our best.)
+ *
+ * For this implementation, we assume that there's a low memory situation
+ * for the N pages threshold when we have neither N pages of completely
+ * free pages, nor we have N reclaimable pages in the cache. This
+ * effectively means, that if userland expects to allocate N pages, it
+ * would consume all the free pages, and any further allocations (above
+ * N) would start draining caches.
+ *
+ * In the worst case, prior hitting the threshold, we might have only
+ * N pages in cache, and nearly no memory as free pages.
+ */
+static u64 vmevent_attr_lowmem_pages(struct vmevent_watch *watch,
+ struct vmevent_attr *attr)
+{
+ int free = global_page_state(NR_FREE_PAGES);
+ int file = global_page_state(NR_FILE_PAGES) -
+ global_page_state(NR_SHMEM); /* TODO: account locked pages */
+ int val = attr->value;
+
+ /*
+ * For convenience we return 0 or attr value (instead of 0/1), it
+ * makes it easier for vmevent_match() to cope with blended
+ * attributes, plus userland might use the value to find out which
+ * threshold triggered.
+ */
+ if (free < val && file < val)
+ return val;
+ return 0;
+}
+
static vmevent_attr_sample_fn attr_samplers[] = {
[VMEVENT_ATTR_NR_AVAIL_PAGES] = vmevent_attr_avail_pages,
[VMEVENT_ATTR_NR_FREE_PAGES] = vmevent_attr_free_pages,
[VMEVENT_ATTR_NR_SWAP_PAGES] = vmevent_attr_swap_pages,
+ [VMEVENT_ATTR_LOWMEM_PAGES] = vmevent_attr_lowmem_pages,
};

static u64 vmevent_sample_attr(struct vmevent_watch *watch, struct vmevent_attr *attr)
diff --git a/tools/testing/vmevent/vmevent-test.c b/tools/testing/vmevent/vmevent-test.c
index fd9a174..c61aed7 100644
--- a/tools/testing/vmevent/vmevent-test.c
+++ b/tools/testing/vmevent/vmevent-test.c
@@ -33,7 +33,7 @@ int main(int argc, char *argv[])

config = (struct vmevent_config) {
.sample_period_ns = 1000000000L,
- .counter = 6,
+ .counter = 7,
.attrs = {
{
.type = VMEVENT_ATTR_NR_FREE_PAGES,
@@ -59,6 +59,13 @@ int main(int argc, char *argv[])
.type = VMEVENT_ATTR_NR_SWAP_PAGES,
},
{
+ .type = VMEVENT_ATTR_LOWMEM_PAGES,
+ .state = VMEVENT_ATTR_STATE_VALUE_LT |
+ VMEVENT_ATTR_STATE_VALUE_EQ |
+ VMEVENT_ATTR_STATE_ONE_SHOT,
+ .value = phys_pages / 2,
+ },
+ {
.type = 0xffff, /* invalid */
},
},
@@ -108,6 +115,9 @@ int main(int argc, char *argv[])
case VMEVENT_ATTR_NR_SWAP_PAGES:
printf(" VMEVENT_ATTR_NR_SWAP_PAGES: %Lu\n", attr->value);
break;
+ case VMEVENT_ATTR_LOWMEM_PAGES:
+ printf(" VMEVENT_ATTR_LOWMEM_PAGES: %Lu\n", attr->value);
+ break;
default:
printf(" Unknown attribute: %Lu\n", attr->value);
}
--
1.7.9.2
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


penberg at kernel

May 3, 2012, 3:33 AM

Post #2 of 26 (293 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Tue, May 1, 2012 at 4:26 PM, Anton Vorontsov
<anton.vorontsov [at] linaro> wrote:
> This is specially "blended" attribute, the event triggers when kernel
> decides that we're close to the low memory threshold. Userspace should
> not expect very precise meaning of low memory situation, mostly, it's
> just a guess on the kernel's side.
>
> Well, this is the same as userland should not know or care how exactly
> kernel manages the memory, or assume that memory management behaviour
> is a part of the "ABI". So, all the 'low memory' is just guessing, but
> we're trying to do our best. It might be that we will end up with two
> or three variations of 'low memory' thresholds, and all of them would
> be useful for different use cases.
>
> For this implementation, we assume that there's a low memory situation
> for the N pages threshold when we have neither N pages of completely
> free pages, nor we have N reclaimable pages in the cache. This
> effectively means, that if userland expects to allocate N pages, it
> would consume all the free pages, and any further allocations (above
> N) would start draining caches.
>
> In the worst case, prior to hitting the threshold, we might have only
> N pages in cache, and nearly no memory as free pages.
>
> The same 'low memory' meaning is used in the current Android Low
> Memory Killer driver.
>
> Signed-off-by: Anton Vorontsov <anton.vorontsov [at] linaro>

I don't see why we couldn't add this. Minchan, thoughts?

> ---
>  include/linux/vmevent.h              |    7 ++++++
>  mm/vmevent.c                         |   40 ++++++++++++++++++++++++++++++++++
>  tools/testing/vmevent/vmevent-test.c |   12 +++++++++-
>  3 files changed, 58 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/vmevent.h b/include/linux/vmevent.h
> index aae0d24..9bfa244 100644
> --- a/include/linux/vmevent.h
> +++ b/include/linux/vmevent.h
> @@ -10,6 +10,13 @@ enum {
>        VMEVENT_ATTR_NR_AVAIL_PAGES     = 1UL,
>        VMEVENT_ATTR_NR_FREE_PAGES      = 2UL,
>        VMEVENT_ATTR_NR_SWAP_PAGES      = 3UL,
> +       /*
> +        * This is specially blended attribute, the event triggers
> +        * when kernel decides that we're close to the low memory threshold.
> +        * Don't expect very precise meaning of low memory situation, mostly,
> +        * it's just a guess on the kernel's side.
> +        */
> +       VMEVENT_ATTR_LOWMEM_PAGES       = 4UL,
>
>        VMEVENT_ATTR_MAX                /* non-ABI */
>  };
> diff --git a/mm/vmevent.c b/mm/vmevent.c
> index b312236..d278a25 100644
> --- a/mm/vmevent.c
> +++ b/mm/vmevent.c
> @@ -68,10 +68,50 @@ static u64 vmevent_attr_avail_pages(struct vmevent_watch *watch,
>        return totalram_pages;
>  }
>
> +/*
> + * Here's some implementation details for the "low memory" meaning.
> + *
> + * (The explanation is not in the header file as userland should not
> + * know these details, nor it should assume that the meaning will
> + * always be the same. As well as it should not know how exactly kernel
> + * manages the memory, or assume that memory management behaviour is a
> + * part of the "ABI". So, all the 'low memory' is just guessing, but
> + * we're trying to do our best.)
> + *
> + * For this implementation, we assume that there's a low memory situation
> + * for the N pages threshold when we have neither N pages of completely
> + * free pages, nor we have N reclaimable pages in the cache. This
> + * effectively means, that if userland expects to allocate N pages, it
> + * would consume all the free pages, and any further allocations (above
> + * N) would start draining caches.
> + *
> + * In the worst case, prior hitting the threshold, we might have only
> + * N pages in cache, and nearly no memory as free pages.
> + */
> +static u64 vmevent_attr_lowmem_pages(struct vmevent_watch *watch,
> +                                    struct vmevent_attr *attr)
> +{
> +       int free = global_page_state(NR_FREE_PAGES);
> +       int file = global_page_state(NR_FILE_PAGES) -
> +                  global_page_state(NR_SHMEM); /* TODO: account locked pages */
> +       int val = attr->value;
> +
> +       /*
> +        * For convenience we return 0 or attr value (instead of 0/1), it
> +        * makes it easier for vmevent_match() to cope with blended
> +        * attributes, plus userland might use the value to find out which
> +        * threshold triggered.
> +        */
> +       if (free < val && file < val)
> +               return val;
> +       return 0;
> +}
> +
>  static vmevent_attr_sample_fn attr_samplers[] = {
>        [VMEVENT_ATTR_NR_AVAIL_PAGES]   = vmevent_attr_avail_pages,
>        [VMEVENT_ATTR_NR_FREE_PAGES]    = vmevent_attr_free_pages,
>        [VMEVENT_ATTR_NR_SWAP_PAGES]    = vmevent_attr_swap_pages,
> +       [VMEVENT_ATTR_LOWMEM_PAGES]     = vmevent_attr_lowmem_pages,
>  };
>
>  static u64 vmevent_sample_attr(struct vmevent_watch *watch, struct vmevent_attr *attr)
> diff --git a/tools/testing/vmevent/vmevent-test.c b/tools/testing/vmevent/vmevent-test.c
> index fd9a174..c61aed7 100644
> --- a/tools/testing/vmevent/vmevent-test.c
> +++ b/tools/testing/vmevent/vmevent-test.c
> @@ -33,7 +33,7 @@ int main(int argc, char *argv[])
>
>        config = (struct vmevent_config) {
>                .sample_period_ns       = 1000000000L,
> -               .counter                = 6,
> +               .counter                = 7,
>                .attrs                  = {
>                        {
>                                .type   = VMEVENT_ATTR_NR_FREE_PAGES,
> @@ -59,6 +59,13 @@ int main(int argc, char *argv[])
>                                .type   = VMEVENT_ATTR_NR_SWAP_PAGES,
>                        },
>                        {
> +                               .type   = VMEVENT_ATTR_LOWMEM_PAGES,
> +                               .state  = VMEVENT_ATTR_STATE_VALUE_LT |
> +                                         VMEVENT_ATTR_STATE_VALUE_EQ |
> +                                         VMEVENT_ATTR_STATE_ONE_SHOT,
> +                               .value  = phys_pages / 2,
> +                       },
> +                       {
>                                .type   = 0xffff, /* invalid */
>                        },
>                },
> @@ -108,6 +115,9 @@ int main(int argc, char *argv[])
>                        case VMEVENT_ATTR_NR_SWAP_PAGES:
>                                printf("  VMEVENT_ATTR_NR_SWAP_PAGES: %Lu\n", attr->value);
>                                break;
> +                       case VMEVENT_ATTR_LOWMEM_PAGES:
> +                               printf("  VMEVENT_ATTR_LOWMEM_PAGES: %Lu\n", attr->value);
> +                               break;
>                        default:
>                                printf("  Unknown attribute: %Lu\n", attr->value);
>                        }
> --
> 1.7.9.2
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo [at] kvack  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
> Don't email: <a href=mailto:"dont [at] kvack"> email [at] kvack </a>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


minchan at kernel

May 3, 2012, 9:26 PM

Post #3 of 26 (290 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On 05/01/2012 10:26 PM, Anton Vorontsov wrote:

> This is specially "blended" attribute, the event triggers when kernel
> decides that we're close to the low memory threshold. Userspace should
> not expect very precise meaning of low memory situation, mostly, it's
> just a guess on the kernel's side.
>
> Well, this is the same as userland should not know or care how exactly
> kernel manages the memory, or assume that memory management behaviour
> is a part of the "ABI". So, all the 'low memory' is just guessing, but
> we're trying to do our best. It might be that we will end up with two
> or three variations of 'low memory' thresholds, and all of them would


First of all, your calculation for available pages is very simple and
it's very specific of recent mobile phone.
But recent systems have various devices.
For example,

SSD : very fast server SSD which has lots of internal ram so that write is very fast.
thumb usb : very slow whihc has small ram

1) We can consider anon pages and dirty pages as available pages.

SSD thumb usb
rootfs 0
swap 0

2) We can consider anon pages as available pages but dirty page doesn't

SSD thumb usb
rootfs O
swap 0

3) We can consider dirty pages as available pages but anon doesn't

SSD thumb usb
rootfs O
swap O

4) We can't consider dirty pages and anon pages as available pages

SSD thumb usb
rootfs 0
swap 0

5) If we use zram as swap?
6) Another idea. If we use both zram and swap device(eMMC), then when zram is full,
we writes zram pages into swap device with align cluster size?

I mean we can select various option to define low memory state.

> be useful for different use cases.


Why should we do it in kernel side?
If vmevent will have VMEVENT_ATTR_[FILE|MOCK|DIRTY|WRITEBACK|SHMEM|ANON|SWAP]_PAGES
and so on which is needed by calculation, we can calculate it in userspace without
forking /proc/vmstat to see it. So I think there is no problem to do it in userspace.

And even though we can solve above problem, it is possible to show up another new "blended" attribute
in future and it will suffer same problem, again. So IMHO, let's leave vmevent as it is which is
very raw attribute and let's do blended attribute in user space.

>
> For this implementation, we assume that there's a low memory situation
> for the N pages threshold when we have neither N pages of completely
> free pages, nor we have N reclaimable pages in the cache. This
> effectively means, that if userland expects to allocate N pages, it
> would consume all the free pages, and any further allocations (above
> N) would start draining caches.
>
> In the worst case, prior to hitting the threshold, we might have only
> N pages in cache, and nearly no memory as free pages.
>
> The same 'low memory' meaning is used in the current Android Low
> Memory Killer driver.
>
> Signed-off-by: Anton Vorontsov <anton.vorontsov [at] linaro>
> ---
> include/linux/vmevent.h | 7 ++++++
> mm/vmevent.c | 40 ++++++++++++++++++++++++++++++++++
> tools/testing/vmevent/vmevent-test.c | 12 +++++++++-
> 3 files changed, 58 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/vmevent.h b/include/linux/vmevent.h
> index aae0d24..9bfa244 100644
> --- a/include/linux/vmevent.h
> +++ b/include/linux/vmevent.h
> @@ -10,6 +10,13 @@ enum {
> VMEVENT_ATTR_NR_AVAIL_PAGES = 1UL,
> VMEVENT_ATTR_NR_FREE_PAGES = 2UL,
> VMEVENT_ATTR_NR_SWAP_PAGES = 3UL,
> + /*
> + * This is specially blended attribute, the event triggers
> + * when kernel decides that we're close to the low memory threshold.
> + * Don't expect very precise meaning of low memory situation, mostly,
> + * it's just a guess on the kernel's side.
> + */
> + VMEVENT_ATTR_LOWMEM_PAGES = 4UL,
>
> VMEVENT_ATTR_MAX /* non-ABI */
> };
> diff --git a/mm/vmevent.c b/mm/vmevent.c
> index b312236..d278a25 100644
> --- a/mm/vmevent.c
> +++ b/mm/vmevent.c
> @@ -68,10 +68,50 @@ static u64 vmevent_attr_avail_pages(struct vmevent_watch *watch,
> return totalram_pages;
> }
>
> +/*
> + * Here's some implementation details for the "low memory" meaning.
> + *
> + * (The explanation is not in the header file as userland should not
> + * know these details, nor it should assume that the meaning will
> + * always be the same. As well as it should not know how exactly kernel
> + * manages the memory, or assume that memory management behaviour is a
> + * part of the "ABI". So, all the 'low memory' is just guessing, but
> + * we're trying to do our best.)
> + *
> + * For this implementation, we assume that there's a low memory situation
> + * for the N pages threshold when we have neither N pages of completely
> + * free pages, nor we have N reclaimable pages in the cache. This
> + * effectively means, that if userland expects to allocate N pages, it
> + * would consume all the free pages, and any further allocations (above
> + * N) would start draining caches.
> + *
> + * In the worst case, prior hitting the threshold, we might have only
> + * N pages in cache, and nearly no memory as free pages.
> + */
> +static u64 vmevent_attr_lowmem_pages(struct vmevent_watch *watch,
> + struct vmevent_attr *attr)
> +{
> + int free = global_page_state(NR_FREE_PAGES);
> + int file = global_page_state(NR_FILE_PAGES) -
> + global_page_state(NR_SHMEM); /* TODO: account locked pages */
> + int val = attr->value;
> +
> + /*
> + * For convenience we return 0 or attr value (instead of 0/1), it
> + * makes it easier for vmevent_match() to cope with blended
> + * attributes, plus userland might use the value to find out which
> + * threshold triggered.
> + */
> + if (free < val && file < val)
> + return val;
> + return 0;
> +}
> +
> static vmevent_attr_sample_fn attr_samplers[] = {
> [VMEVENT_ATTR_NR_AVAIL_PAGES] = vmevent_attr_avail_pages,
> [VMEVENT_ATTR_NR_FREE_PAGES] = vmevent_attr_free_pages,
> [VMEVENT_ATTR_NR_SWAP_PAGES] = vmevent_attr_swap_pages,
> + [VMEVENT_ATTR_LOWMEM_PAGES] = vmevent_attr_lowmem_pages,
> };
>
> static u64 vmevent_sample_attr(struct vmevent_watch *watch, struct vmevent_attr *attr)
> diff --git a/tools/testing/vmevent/vmevent-test.c b/tools/testing/vmevent/vmevent-test.c
> index fd9a174..c61aed7 100644
> --- a/tools/testing/vmevent/vmevent-test.c
> +++ b/tools/testing/vmevent/vmevent-test.c
> @@ -33,7 +33,7 @@ int main(int argc, char *argv[])
>
> config = (struct vmevent_config) {
> .sample_period_ns = 1000000000L,
> - .counter = 6,
> + .counter = 7,
> .attrs = {
> {
> .type = VMEVENT_ATTR_NR_FREE_PAGES,
> @@ -59,6 +59,13 @@ int main(int argc, char *argv[])
> .type = VMEVENT_ATTR_NR_SWAP_PAGES,
> },
> {
> + .type = VMEVENT_ATTR_LOWMEM_PAGES,
> + .state = VMEVENT_ATTR_STATE_VALUE_LT |
> + VMEVENT_ATTR_STATE_VALUE_EQ |
> + VMEVENT_ATTR_STATE_ONE_SHOT,
> + .value = phys_pages / 2,
> + },
> + {
> .type = 0xffff, /* invalid */
> },
> },
> @@ -108,6 +115,9 @@ int main(int argc, char *argv[])
> case VMEVENT_ATTR_NR_SWAP_PAGES:
> printf(" VMEVENT_ATTR_NR_SWAP_PAGES: %Lu\n", attr->value);
> break;
> + case VMEVENT_ATTR_LOWMEM_PAGES:
> + printf(" VMEVENT_ATTR_LOWMEM_PAGES: %Lu\n", attr->value);
> + break;
> default:
> printf(" Unknown attribute: %Lu\n", attr->value);
> }



--
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


anton.vorontsov at linaro

May 4, 2012, 12:38 AM

Post #4 of 26 (286 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Fri, May 04, 2012 at 01:26:45PM +0900, Minchan Kim wrote:
[...]
> > be useful for different use cases.
>
> Why should we do it in kernel side?

Because currently you can't do this in userland, see below. Today
this would be effectively the same as constantly reading /proc/vmstat,
which is surely not friendly performance/context switches/battery
wise.

> If vmevent will have VMEVENT_ATTR_[FILE|MOCK|DIRTY|WRITEBACK|SHMEM|ANON|SWAP]_PAGES
> and so on which is needed by calculation, we can calculate it in userspace without
> forking /proc/vmstat to see it. So I think there is no problem to do it in userspace.

There are two problems.

1. Originally, the idea behind vmevent was that we should not expose all
these mm details in vmevent, because it ties ABI with Linux internal
memory representation;

2. If you have say a boolean '(A + B + C + ...) > X' attribute (which is
exactly what blended attributes are), you can't just set up independent
thresholds on A, B, C, ... and have the same effect.

(What we can do, though, is... introduce arithmetic operators in
vmevent. :-D But then, at the end, we'll probably implement in-kernel
forth-like stack machine, with vmevent_config array serving as a
sequence of op-codes. ;-)

If we'll give up on "1." (Pekka, ping), then we need to solve "2."
in a sane way: we'll have to add a 'NR_FILE_PAGES - NR_SHMEM -
<todo-locked-file-pages>' attribute, and give it a name.

RECLAIMABLE_CACHE_PAGES maybe?

Thanks!

--
Anton Vorontsov
Email: cbouatmailru [at] gmail
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


penberg at kernel

May 7, 2012, 12:14 AM

Post #5 of 26 (280 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Fri, May 4, 2012 at 10:38 AM, Anton Vorontsov
<anton.vorontsov [at] linaro> wrote:
> There are two problems.
>
> 1. Originally, the idea behind vmevent was that we should not expose all
>   these mm details in vmevent, because it ties ABI with Linux internal
>   memory representation;
>
> 2. If you have say a boolean '(A + B + C + ...) > X' attribute (which is
>   exactly what blended attributes are), you can't just set up independent
>   thresholds on A, B, C, ... and have the same effect.
>
>   (What we can do, though, is... introduce arithmetic operators in
>   vmevent. :-D But then, at the end, we'll probably implement in-kernel
>   forth-like stack machine, with vmevent_config array serving as a
>   sequence of op-codes. ;-)
>
> If we'll give up on "1." (Pekka, ping), then we need to solve "2."
> in a sane way: we'll have to add a 'NR_FILE_PAGES - NR_SHMEM -
> <todo-locked-file-pages>' attribute, and give it a name.

Well, no, we can't give up on (1) completely. That'd mean that
eventually we'd need to change the ABI and break userspace. The
difference between exposing internal details and reasonable
abstractions is by no means black and white.

AFAICT, RECLAIMABLE_CACHE_PAGES is a reasonable thing to support. Can
anyone come up with a reason why we couldn't do that in the future?

Pekka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


kosaki.motohiro at gmail

May 7, 2012, 1:26 AM

Post #6 of 26 (280 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

>> If we'll give up on "1." (Pekka, ping), then we need to solve "2."
>> in a sane way: we'll have to add a 'NR_FILE_PAGES - NR_SHMEM -
>> <todo-locked-file-pages>' attribute, and give it a name.
>
> Well, no, we can't give up on (1) completely. That'd mean that
> eventually we'd need to change the ABI and break userspace. The
> difference between exposing internal details and reasonable
> abstractions is by no means black and white.
>
> AFAICT, RECLAIMABLE_CACHE_PAGES is a reasonable thing to support. Can
> anyone come up with a reason why we couldn't do that in the future?

It can. but the problem is, that is completely useless. Because of, 1)
dirty pages writing-out
is sometimes very slow and 2) libc and some important library's pages
are critical important
for running a system even though it is clean and reclaimable. In other
word, kernel don't have
an info then can't expose it.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


anton.vorontsov at linaro

May 7, 2012, 5:15 AM

Post #7 of 26 (279 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Mon, May 07, 2012 at 04:26:00AM -0400, KOSAKI Motohiro wrote:
> >> If we'll give up on "1." (Pekka, ping), then we need to solve "2."
> >> in a sane way: we'll have to add a 'NR_FILE_PAGES - NR_SHMEM -
> >> <todo-locked-file-pages>' attribute, and give it a name.
> >
> > Well, no, we can't give up on (1) completely. That'd mean that
> > eventually we'd need to change the ABI and break userspace. The
> > difference between exposing internal details and reasonable
> > abstractions is by no means black and white.
> >
> > AFAICT, RECLAIMABLE_CACHE_PAGES is a reasonable thing to support. Can
> > anyone come up with a reason why we couldn't do that in the future?
>
> It can. but the problem is, that is completely useless.

Surely it is useful. Could be not ideal, but you can't say that
it is completely useless.

> Because of, 1) dirty pages writing-out is sometimes very slow and

I don't see it as a unresolvable problem: we can exclude dirty pages,
that's a nice idea actually.

Easily reclaimable cache pages = file_pages - shmem - locked pages
- dirty pages.

The amount of dirty pages is configurable, which is also great.

Even more, we may introduce two attributes:

RECLAIMABLE_CACHE_PAGES and
RECLAIMABLE_CACHE_PAGES_NOIO (which excludes dirty pages).

This makes ABI detached from the mm internals and still keeps a
defined meaning of the attributes.

> 2) libc and some important library's pages are critical important
> for running a system even though it is clean and reclaimable. In other
> word, kernel don't have an info then can't expose it.

First off, I guess LRU would try to keep important/most used pages in
the cache, as we try to never fully drain page cache to the zero mark.

Secondly, if we're really low on memory (which low memory notifications
help to prevent) and kernel decided to throw libc's pages out of the
cache, you'll get cache miss and kernel will have to read it back. Well,
sometimes cache misses do happen, that's life. And if somebody really
don't want this for the essential parts of the system, one have to
mlock it (which eliminates your "kernel don't have an info" argument).


Btw, if you have any better strategy on helping userspace to define
'low memory' conditions, I'll readily try to implement it.

Thanks!

--
Anton Vorontsov
Email: cbouatmailru [at] gmail
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


kosaki.motohiro at gmail

May 7, 2012, 12:19 PM

Post #8 of 26 (278 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

(5/7/12 8:15 AM), Anton Vorontsov wrote:
> On Mon, May 07, 2012 at 04:26:00AM -0400, KOSAKI Motohiro wrote:
>>>> If we'll give up on "1." (Pekka, ping), then we need to solve "2."
>>>> in a sane way: we'll have to add a 'NR_FILE_PAGES - NR_SHMEM -
>>>> <todo-locked-file-pages>' attribute, and give it a name.
>>>
>>> Well, no, we can't give up on (1) completely. That'd mean that
>>> eventually we'd need to change the ABI and break userspace. The
>>> difference between exposing internal details and reasonable
>>> abstractions is by no means black and white.
>>>
>>> AFAICT, RECLAIMABLE_CACHE_PAGES is a reasonable thing to support. Can
>>> anyone come up with a reason why we couldn't do that in the future?
>>
>> It can. but the problem is, that is completely useless.
>
> Surely it is useful. Could be not ideal, but you can't say that
> it is completely useless.

Why? It doesn't work.



>> Because of, 1) dirty pages writing-out is sometimes very slow and
>
> I don't see it as a unresolvable problem: we can exclude dirty pages,
> that's a nice idea actually.
>
> Easily reclaimable cache pages = file_pages - shmem - locked pages
> - dirty pages.
>
> The amount of dirty pages is configurable, which is also great.

You don't understand the issue. The point is NOT a formula. The problem
is, dirty and non-dirty pages aren't isolated in our kernel. Then, kernel
start to get stuck far before non-dirty pages become empty. Lie notification
always useless.


> Even more, we may introduce two attributes:
>
> RECLAIMABLE_CACHE_PAGES and
> RECLAIMABLE_CACHE_PAGES_NOIO (which excludes dirty pages).
>
> This makes ABI detached from the mm internals and still keeps a
> defined meaning of the attributes.

Collection of craps are also crap. If you want to improve userland
notification, you should join VM improvement activity. You shouldn't
think nobody except you haven't think userland notification feature.

The problem is, Any current kernel vm statistics were not created for
such purpose and don't fit.

Even though, some inaccurate and incorrect statistics fit _your_ usecase,
they definitely don't fit other. And their people think it is bug.


>> 2) libc and some important library's pages are critical important
>> for running a system even though it is clean and reclaimable. In other
>> word, kernel don't have an info then can't expose it.
>
> First off, I guess LRU would try to keep important/most used pages in
> the cache, as we try to never fully drain page cache to the zero mark.

Yes, what do you want say?


> Secondly, if we're really low on memory (which low memory notifications
> help to prevent) and kernel decided to throw libc's pages out of the
> cache, you'll get cache miss and kernel will have to read it back. Well,
> sometimes cache misses do happen, that's life. And if somebody really
> don't want this for the essential parts of the system, one have to
> mlock it (which eliminates your "kernel don't have an info" argument).

First off, "low memory" is very poor definition and we must not use it.
It is multiple meanings. 1) System free memory is low. Some embedded have userland
oom killer and they want to know _system_ status. 2) available memory is low.
This is different from (1) when using NUMA, memcg or cpusets. And in nowadays,
almost all x86 box have numa. This is userful for swap avoidance activity if
we can implement correctly.

Secondly, we can't assume someone mlock to libc. Because of, Linux is generic
purpose kernel. As far as you continue to talk about only user usecase, we can't
agree you. "Users may have a workaround" don't make excuse to accept broken patch.




> Btw, if you have any better strategy on helping userspace to define
> 'low memory' conditions, I'll readily try to implement it.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


anton.vorontsov at linaro

May 7, 2012, 5:31 PM

Post #9 of 26 (278 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Mon, May 07, 2012 at 03:19:50PM -0400, KOSAKI Motohiro wrote:
[...]
> You don't understand the issue.

Apparently.

> The point is NOT a formula. The problem
> is, dirty and non-dirty pages aren't isolated in our kernel. Then, kernel
> start to get stuck far before non-dirty pages become empty. Lie notification
> always useless.

Ugh. I don't get it (yeah, see above :-), in what sense they're not
isolated? In sense of isolate_lru_page and friends? Yes, they're not
isolated, but how that makes the notifications untrustworthy?

I'm confused. Can you elaborate a bit?

> >Even more, we may introduce two attributes:
> >
> >RECLAIMABLE_CACHE_PAGES and
> >RECLAIMABLE_CACHE_PAGES_NOIO (which excludes dirty pages).
> >
> >This makes ABI detached from the mm internals and still keeps a
> >defined meaning of the attributes.
>
> Collection of craps are also crap. If you want to improve userland
> notification, you should join VM improvement activity.

I'm all for improving VM, but please, be specific. I'm assuming
there is currently some efforts on VM improvements, which I'm
not aware of. Or there are some plans or thoughts on improvements --
please tell what are they.

> You shouldn't
> think nobody except you haven't think userland notification feature.

That was never my assumption; surely many people have worked on userland
notifications, and still, today we have none that would fully suite
Android's or Nokia's (or "embedded people's") needs, right? ;-)

So, let's try solve things.

Memcg is currently not usable for us, and I explained why (the slab
accounting for root cgroup thing: http://lkml.org/lkml/2012/4/30/115 ),
any comments?

> The problem is, Any current kernel vm statistics were not created for
> such purpose and don't fit.

OK, presuming current statistics don't fit, which ones should we
implement? How do you see it?

> Even though, some inaccurate and incorrect statistics fit _your_ usecase,
> they definitely don't fit other. And their people think it is bug.

I'm all for making a single solution for your and ours use cases, but
you don't say anything specific.

(Btw, what are your use cases?)

> >>2) libc and some important library's pages are critical important
> >>for running a system even though it is clean and reclaimable. In other
> >>word, kernel don't have an info then can't expose it.
> >
> >First off, I guess LRU would try to keep important/most used pages in
> >the cache, as we try to never fully drain page cache to the zero mark.

*1

> Yes, what do you want say?

> >Secondly, if we're really low on memory (which low memory notifications
> >help to prevent) and kernel decided to throw libc's pages out of the
> >cache, you'll get cache miss and kernel will have to read it back. Well,
> >sometimes cache misses do happen, that's life. And if somebody really
> >don't want this for the essential parts of the system, one have to
> >mlock it (which eliminates your "kernel don't have an info" argument).
>
> First off, "low memory" is very poor definition and we must not use it.

OK.

> It is multiple meanings.
> 1) System free memory is low. Some embedded have userland

The 'free' has multiple meanings as well. For us, it is
'not-used-at-all-pages + the-pages-that-we-can-get-in-a-
jiffy-and-not-disturb-things-much'.

The 'not-disturb-things-much' has a moot meaning as well, so all this
should probably be tunable. Cool, so let's give the userspace all the
needed statistics to decide on these meanings.

> oom killer and they want to know _system_ status. 2) available memory is low.
> This is different from (1) when using NUMA, memcg or cpusets. And in nowadays,
> almost all x86 box have numa. This is userful for swap avoidance activity if
> we can implement correctly.

I don't get it: you don't see '1)' as a use case? You're saying
that the meanings are different when using NUMA/memcg. If we don't
use memcg, what statistics should we use?

OK, if you are hinting that memcg should be mandatory for proper
statistics accounting, then please comment on the current memcg
issues, which don't let us do '1)' via '2)'.

> Secondly, we can't assume someone mlock to libc. Because of, Linux is generic
> purpose kernel.

You said that libc pages are important, implying that ideally they should
never leave the page cache (i.e. we should not count the pages as 'easily
reclaimable').

I answered that if are OK with "not guaranteed, but we'll do our best"
strategy, then just don't let fully drain the caches, and then LRU will
try keep "most important" pages (apparently libc) in the cache. *1 It
is surely userland's task to maintain the needed amount of memory, and
to do this efficiently we need..... notifications, that's right.

But if you want a guarantee, I guess mlock() is the only option -- it is
the only way to tell the kernel that the pages are really not to be
reclaimed.

So, in the light of 'easily reclaimable pages' statistics, what for was
your libc point again? How would you solve "the libc problem"?

Thanks!

--
Anton Vorontsov
Email: cbouatmailru [at] gmail
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


penberg at kernel

May 7, 2012, 10:20 PM

Post #10 of 26 (281 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Mon, May 7, 2012 at 10:19 PM, KOSAKI Motohiro
<kosaki.motohiro [at] gmail> wrote:
>> Even more, we may introduce two attributes:
>>
>> RECLAIMABLE_CACHE_PAGES and
>> RECLAIMABLE_CACHE_PAGES_NOIO (which excludes dirty pages).
>>
>> This makes ABI detached from the mm internals and still keeps a
>> defined meaning of the attributes.
>
> Collection of craps are also crap. If you want to improve userland
> notification, you should join VM improvement activity. You shouldn't
> think nobody except you haven't think userland notification feature.
>
> The problem is, Any current kernel vm statistics were not created for
> such purpose and don't fit.
>
> Even though, some inaccurate and incorrect statistics fit _your_ usecase,
> they definitely don't fit other. And their people think it is bug.

Well, yeah, if we are to report _number of pages_, the numbers better
be meaningful.

That said, I think you are being unfair to Anton who's one of the few
that's actually taking the time to implement this properly instead of
settling for an out-of-tree hack.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


kosaki.motohiro at gmail

May 7, 2012, 10:42 PM

Post #11 of 26 (279 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Tue, May 8, 2012 at 1:20 AM, Pekka Enberg <penberg [at] kernel> wrote:
> On Mon, May 7, 2012 at 10:19 PM, KOSAKI Motohiro
> <kosaki.motohiro [at] gmail> wrote:
>>> Even more, we may introduce two attributes:
>>>
>>> RECLAIMABLE_CACHE_PAGES and
>>> RECLAIMABLE_CACHE_PAGES_NOIO (which excludes dirty pages).
>>>
>>> This makes ABI detached from the mm internals and still keeps a
>>> defined meaning of the attributes.
>>
>> Collection of craps are also crap. If you want to improve userland
>> notification, you should join VM improvement activity. You shouldn't
>> think nobody except you haven't think userland notification feature.
>>
>> The problem is, Any current kernel vm statistics were not created for
>> such purpose and don't fit.
>>
>> Even though, some inaccurate and incorrect statistics fit _your_ usecase,
>> they definitely don't fit other. And their people think it is bug.
>
> Well, yeah, if we are to report _number of pages_, the numbers better
> be meaningful.
>
> That said, I think you are being unfair to Anton who's one of the few
> that's actually taking the time to implement this properly instead of
> settling for an out-of-tree hack.

Unfair? But only I can talk about technical comment. To be honest, I
really dislike
I need say the same explanation again and again. A lot of people don't read
past discussion. And as far as the patches take the same mistake, I must say
the same thing. It is just PITA.

I don't disagree vmevent notification itself, but I must disagree lie
notification.
And also, To make just idea statistics doesn't make sense at all. How do an
application choose the right events? If that depend on hardware configuration,
userland developers can't write proper applications.

That's the reason why I often disagree at random new features.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


penberg at kernel

May 7, 2012, 10:53 PM

Post #12 of 26 (280 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Tue, May 8, 2012 at 8:42 AM, KOSAKI Motohiro
<kosaki.motohiro [at] gmail> wrote:
>> That said, I think you are being unfair to Anton who's one of the few
>> that's actually taking the time to implement this properly instead of
>> settling for an out-of-tree hack.
>
> Unfair? But only I can talk about technical comment. To be honest, I
> really dislike
> I need say the same explanation again and again. A lot of people don't read
> past discussion. And as far as the patches take the same mistake, I must say
> the same thing. It is just PITA.

Unfair because you are trying to make it look as if Anton is only
concerned with his specific use case. That's simply not true.

On Tue, May 8, 2012 at 8:42 AM, KOSAKI Motohiro
<kosaki.motohiro [at] gmail> wrote:
> I don't disagree vmevent notification itself, but I must disagree lie
> notification.
> And also, To make just idea statistics doesn't make sense at all. How do an
> application choose the right events? If that depend on hardware configuration,
> userland developers can't write proper applications.

That's exactly the problem we're trying to tackle here! We _want_ the
ABI to provide sane, well-defined events that solve real world
problems.

Pekka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


anton.vorontsov at linaro

May 7, 2012, 11:58 PM

Post #13 of 26 (280 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Tue, May 08, 2012 at 01:42:05AM -0400, KOSAKI Motohiro wrote:
[...]
> > Well, yeah, if we are to report _number of pages_, the numbers better
> > be meaningful.
> >
> > That said, I think you are being unfair to Anton who's one of the few
> > that's actually taking the time to implement this properly instead of
> > settling for an out-of-tree hack.
>
> Unfair? But only I can talk about technical comment. To be honest, I
> really dislike
> I need say the same explanation again and again. A lot of people don't read
> past discussion. And as far as the patches take the same mistake, I must say
> the same thing. It is just PITA.

Note that just telling people that something is PITA doesn't help solve
things (so people will come back to you with stupid questions over and
over again). You can call people morons, idiots and dumbasses (that's
all fine) but still finding a way to be productive. :-)

You could just give a link to a previous discussion, in which you think
you explained all your concerns regarding cache handling issues, or
memory notifications/statistics in general.

So, feel free to call me an idiot, but please expand your points a
little bit or give a link to the discussion you're referring to?

Thanks,

--
Anton Vorontsov
Email: cbouatmailru [at] gmail
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


kosaki.motohiro at gmail

May 8, 2012, 12:11 AM

Post #14 of 26 (280 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Tue, May 8, 2012 at 1:53 AM, Pekka Enberg <penberg [at] kernel> wrote:
> On Tue, May 8, 2012 at 8:42 AM, KOSAKI Motohiro
> <kosaki.motohiro [at] gmail> wrote:
>>> That said, I think you are being unfair to Anton who's one of the few
>>> that's actually taking the time to implement this properly instead of
>>> settling for an out-of-tree hack.
>>
>> Unfair? But only I can talk about technical comment. To be honest, I
>> really dislike
>> I need say the same explanation again and again. A lot of people don't read
>> past discussion. And as far as the patches take the same mistake, I must say
>> the same thing. It is just PITA.
>
> Unfair because you are trying to make it look as if Anton is only
> concerned with his specific use case. That's simply not true.

However current proposal certainly don't refer past discuss and don't work
many environment.


> On Tue, May 8, 2012 at 8:42 AM, KOSAKI Motohiro
> <kosaki.motohiro [at] gmail> wrote:
>> I don't disagree vmevent notification itself, but I must disagree lie
>> notification.
>> And also, To make just idea statistics doesn't make sense at all. How do an
>> application choose the right events? If that depend on hardware configuration,
>> userland developers can't write proper applications.
>
> That's exactly the problem we're trying to tackle here! We _want_ the
> ABI to provide sane, well-defined events that solve real world
> problems.

Ok, sane. Then I take my time a little and review current vmevent code briefly.
(I read vmevent/core branch in pekka's tree. please let me know if
there is newer
repositry)

I think following thing should be fixed.

1) sample_period is brain damaged idea. If people ONLY need to
sampling stastics, they
only need to read /proc/vmstat periodically. just remove it and
implement push notification.
_IF_ someone need unfrequent level trigger, just use
"usleep(timeout); read(vmevent_fd)"
on userland code.
2) VMEVENT_ATTR_STATE_ONE_SHOT is misleading name. That is effect as
edge trigger
shot. not only once.
3) vmevent_fd() seems sane interface. but it has name space unaware.
maybe we discuss how
to harmonize name space feature. No hurry. but we have to think
that issue since at beginning.
4) Currently, vmstat have per-cpu batch and vmstat updating makes 3
second delay at maximum.
This is fine for usual case because almost userland watcher only
read /proc/vmstat per second.
But, for vmevent_fd() case, 3 seconds may be unacceptable delay. At
worst, 128 batch x 4096
x 4k pagesize = 2G bytes inaccurate is there.
5) __VMEVENT_ATTR_STATE_VALUE_WAS_LT should be removed from userland
exporting files.
When exporing kenrel internal, always silly gus used them and made unhappy.
6) Also vmevent_event must hide from userland.
7) vmevent_config::size must be removed. In 20th century, M$ API
prefer to use this technique. But
They dropped the way because a lot of application don't initialize
size member and they can't use
it for keeping upper compitibility.
8) memcg unaware
9) numa unaware
10) zone unaware

And, we may need vm internal change if we really need lowmem
notification. current kernel don't
have such info. _And_ there is one more big problem. Currently the
kernel maintain memory per
zone. But almost all userland application aren't aware zone nor node.
Thus raw notification aren't
useful for userland. In the other hands, total memory and total free
memory is useful? Definitely No!
Even though total free memory are lots, system may start swap out and
oom invokation. If we can't
oom invocation, this feature has serious raison d'etre issue. (i.e.
(4), (8), (9) and (19) are not ignorable
issue. I think)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


kosaki.motohiro at gmail

May 8, 2012, 12:16 AM

Post #15 of 26 (280 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

(5/8/12 2:58 AM), Anton Vorontsov wrote:
> On Tue, May 08, 2012 at 01:42:05AM -0400, KOSAKI Motohiro wrote:
> [...]
>>> Well, yeah, if we are to report _number of pages_, the numbers better
>>> be meaningful.
>>>
>>> That said, I think you are being unfair to Anton who's one of the few
>>> that's actually taking the time to implement this properly instead of
>>> settling for an out-of-tree hack.
>>
>> Unfair? But only I can talk about technical comment. To be honest, I
>> really dislike
>> I need say the same explanation again and again. A lot of people don't read
>> past discussion. And as far as the patches take the same mistake, I must say
>> the same thing. It is just PITA.
>
> Note that just telling people that something is PITA doesn't help solve
> things (so people will come back to you with stupid questions over and
> over again). You can call people morons, idiots and dumbasses (that's
> all fine) but still finding a way to be productive. :-)
>
> You could just give a link to a previous discussion, in which you think
> you explained all your concerns regarding cache handling issues, or
> memory notifications/statistics in general.
>
> So, feel free to call me an idiot, but please expand your points a
> little bit or give a link to the discussion you're referring to?

I don't think you are idiot. But I hope you test your patch before submitting.
That just don't work especially on x86. Because of, all x86 box have multiple zone
and summarized statistics (i.e. global_page_state() thing) don't work and can't
prevent oom nor swapping.

and please see may previous mail.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


penberg at kernel

May 8, 2012, 12:36 AM

Post #16 of 26 (280 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Tue, May 8, 2012 at 10:11 AM, KOSAKI Motohiro
<kosaki.motohiro [at] gmail> wrote:
> Ok, sane. Then I take my time a little and review current vmevent code briefly.
> (I read vmevent/core branch in pekka's tree. please let me know if
> there is newer repositry)

It's the latest one.

On Tue, May 8, 2012 at 10:11 AM, KOSAKI Motohiro
<kosaki.motohiro [at] gmail> wrote:
> 1) sample_period is brain damaged idea. If people ONLY need to
> sampling stastics, they
>  only need to read /proc/vmstat periodically. just remove it and
> implement push notification.
>  _IF_ someone need unfrequent level trigger, just use
> "usleep(timeout); read(vmevent_fd)"
>  on userland code.

That comes from a real-world requirement. See Leonid's email on the topic:

https://lkml.org/lkml/2012/5/2/42

> 2) VMEVENT_ATTR_STATE_ONE_SHOT is misleading name. That is effect as
> edge trigger shot. not only once.

Would VMEVENT_ATTR_STATE_EDGE_TRIGGER be a better name?

> 3) vmevent_fd() seems sane interface. but it has name space unaware.
> maybe we discuss how to harmonize name space feature.  No hurry. but we have
> to think that issue since at beginning.

You mean VFS namespaces? Yeah, we need to take care of that.

> 4) Currently, vmstat have per-cpu batch and vmstat updating makes 3
> second delay at maximum.
>  This is fine for usual case because almost userland watcher only
> read /proc/vmstat per second.
>  But, for vmevent_fd() case, 3 seconds may be unacceptable delay. At
> worst, 128 batch x 4096
>  x 4k pagesize = 2G bytes inaccurate is there.

That's pretty awful. Anton, Leonid, comments?

> 5) __VMEVENT_ATTR_STATE_VALUE_WAS_LT should be removed from userland
> exporting files.
>  When exporing kenrel internal, always silly gus used them and made unhappy.

Agreed. Anton, care to cook up a patch to do that?

> 6) Also vmevent_event must hide from userland.

Why? That's part of the ABI.

> 7) vmevent_config::size must be removed. In 20th century, M$ API
> prefer to use this technique. But
>  They dropped the way because a lot of application don't initialize
> size member and they can't use it for keeping upper compitibility.

It's there to support forward/backward ABI compatibility like perf
does. I'm going to keep it for now but I'm open to dropping it when
the ABI is more mature.

> 8) memcg unaware
> 9) numa unaware
> 10) zone unaware

Yup.

> And, we may need vm internal change if we really need lowmem
> notification. current kernel don't have such info. _And_ there is one more
> big problem. Currently the kernel maintain memory per
> zone. But almost all userland application aren't aware zone nor node.
> Thus raw notification aren't useful for userland. In the other hands, total
> memory and total free memory is useful? Definitely No!
> Even though total free memory are lots, system may start swap out and
> oom invokation. If we can't oom invocation, this feature has serious raison
> d'etre issue. (i.e. (4), (8), (9) and (19) are not ignorable issue. I think)

I'm guessing most of the existing solutions get away with
approximations and soft limits because they're mostly used on UMA
embedded machines.

But yes, we need to do better here.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


kosaki.motohiro at gmail

May 8, 2012, 12:50 AM

Post #17 of 26 (280 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

(5/8/12 3:36 AM), Pekka Enberg wrote:
> On Tue, May 8, 2012 at 10:11 AM, KOSAKI Motohiro
> <kosaki.motohiro [at] gmail> wrote:
>> Ok, sane. Then I take my time a little and review current vmevent code briefly.
>> (I read vmevent/core branch in pekka's tree. please let me know if
>> there is newer repositry)
>
> It's the latest one.
>
> On Tue, May 8, 2012 at 10:11 AM, KOSAKI Motohiro
> <kosaki.motohiro [at] gmail> wrote:
>> 1) sample_period is brain damaged idea. If people ONLY need to
>> sampling stastics, they
>> only need to read /proc/vmstat periodically. just remove it and
>> implement push notification.
>> _IF_ someone need unfrequent level trigger, just use
>> "usleep(timeout); read(vmevent_fd)"
>> on userland code.
>
> That comes from a real-world requirement. See Leonid's email on the topic:
>
> https://lkml.org/lkml/2012/5/2/42

I know, many embedded guys prefer such timer interval. I also have an experience
similar logic when I was TV box developer. but I must disagree. Someone hope
timer housekeeping complexity into kernel. but I haven't seen any justification.


>> 2) VMEVENT_ATTR_STATE_ONE_SHOT is misleading name. That is effect as
>> edge trigger shot. not only once.
>
> Would VMEVENT_ATTR_STATE_EDGE_TRIGGER be a better name?

maybe.


>> 3) vmevent_fd() seems sane interface. but it has name space unaware.
>> maybe we discuss how to harmonize name space feature. No hurry. but we have
>> to think that issue since at beginning.
>
> You mean VFS namespaces? Yeah, we need to take care of that.

If we keep current vmevent_fd() design, we may need to create new namespace concept
likes ipc namespace. current vmevent_fd() is not VFS based.


>> 4) Currently, vmstat have per-cpu batch and vmstat updating makes 3
>> second delay at maximum.
>> This is fine for usual case because almost userland watcher only
>> read /proc/vmstat per second.
>> But, for vmevent_fd() case, 3 seconds may be unacceptable delay. At
>> worst, 128 batch x 4096
>> x 4k pagesize = 2G bytes inaccurate is there.
>
> That's pretty awful. Anton, Leonid, comments?
>
>> 5) __VMEVENT_ATTR_STATE_VALUE_WAS_LT should be removed from userland
>> exporting files.
>> When exporing kenrel internal, always silly gus used them and made unhappy.
>
> Agreed. Anton, care to cook up a patch to do that?
>
>> 6) Also vmevent_event must hide from userland.
>
> Why? That's part of the ABI.

Ahhh, if so, I missed something. as far as I look, vmevent_fd() only depend
on vmevent_config. which syscall depend on vmevent_evennt?



>> 7) vmevent_config::size must be removed. In 20th century, M$ API
>> prefer to use this technique. But
>> They dropped the way because a lot of application don't initialize
>> size member and they can't use it for keeping upper compitibility.
>
> It's there to support forward/backward ABI compatibility like perf
> does. I'm going to keep it for now but I'm open to dropping it when
> the ABI is more mature.

perf api is not intended to use from generic applications. then, I don't
think it will make abi issue. tool/perf is sane, isn't it? but vmevent_fd()
is generic api and we can't trust all userland guy have sane, unfortunately.

>> 8) memcg unaware
>> 9) numa unaware
>> 10) zone unaware
>
> Yup.
>
>> And, we may need vm internal change if we really need lowmem
>> notification. current kernel don't have such info. _And_ there is one more
>> big problem. Currently the kernel maintain memory per
>> zone. But almost all userland application aren't aware zone nor node.
>> Thus raw notification aren't useful for userland. In the other hands, total
>> memory and total free memory is useful? Definitely No!
>> Even though total free memory are lots, system may start swap out and
>> oom invokation. If we can't oom invocation, this feature has serious raison
>> d'etre issue. (i.e. (4), (8), (9) and (19) are not ignorable issue. I think)
>
> I'm guessing most of the existing solutions get away with
> approximations and soft limits because they're mostly used on UMA
> embedded machines.
>
> But yes, we need to do better here.

Hm. If you want vmevent makes depend on CONFIG_EMBEDDED, I have no reason to
complain this feature. At that world, almost all applications _know_ their
system configuration. then I don't think api misuse issue is big matter.





--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


penberg at kernel

May 8, 2012, 1:03 AM

Post #18 of 26 (278 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Tue, May 8, 2012 at 10:50 AM, KOSAKI Motohiro
<kosaki.motohiro [at] gmail> wrote:
>>> 1) sample_period is brain damaged idea. If people ONLY need to
>>> sampling stastics, they
>>>  only need to read /proc/vmstat periodically. just remove it and
>>> implement push notification.
>>>  _IF_ someone need unfrequent level trigger, just use
>>> "usleep(timeout); read(vmevent_fd)"
>>>  on userland code.
>>
>> That comes from a real-world requirement. See Leonid's email on the topic:
>>
>> https://lkml.org/lkml/2012/5/2/42
>
> I know, many embedded guys prefer such timer interval. I also have an
> experience
> similar logic when I was TV box developer. but I must disagree. Someone hope
> timer housekeeping complexity into kernel. but I haven't seen any
> justification.

Leonid?

>>> 6) Also vmevent_event must hide from userland.
>>
>> Why? That's part of the ABI.
>
> Ahhh, if so, I missed something. as far as I look, vmevent_fd() only depend
> on vmevent_config. which syscall depend on vmevent_evennt?

read(2). See tools/testing/vmevent/vmevent-test.c for an example how
the ABI is used.

>>> 7) vmevent_config::size must be removed. In 20th century, M$ API
>>> prefer to use this technique. But
>>>  They dropped the way because a lot of application don't initialize
>>> size member and they can't use it for keeping upper compitibility.
>>
>> It's there to support forward/backward ABI compatibility like perf
>> does. I'm going to keep it for now but I'm open to dropping it when
>> the ABI is more mature.
>
> perf api is not intended to use from generic applications. then, I don't
> think it will make abi issue. tool/perf is sane, isn't it? but vmevent_fd()
> is generic api and we can't trust all userland guy have sane, unfortunately.

The perf ABI is being used by other projects as well. We don't
_depend_ on ->size being sane as much as use it to detect new features
in the future.

But anyway, we can consider dropping once the ABI is more stable.

> Hm. If you want vmevent makes depend on CONFIG_EMBEDDED, I have no reason to
> complain this feature. At that world, almost all applications _know_ their
> system configuration. then I don't think api misuse issue is big matter.

No, I don't want to do that. I was just commeting on why existing
solutions are the way they are.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


anton.vorontsov at linaro

May 8, 2012, 1:13 AM

Post #19 of 26 (280 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Tue, May 08, 2012 at 03:16:59AM -0400, KOSAKI Motohiro wrote:
[...]
> >So, feel free to call me an idiot, but please expand your points a
> >little bit or give a link to the discussion you're referring to?
>
> I don't think you are idiot. But I hope you test your patch before submitting.
> That just don't work especially on x86. Because of, all x86 box have multiple zone
> and summarized statistics (i.e. global_page_state() thing) don't work and can't
> prevent oom nor swapping.

Now I think I understand you: we don't take into account that e.g. DMA
zone is not usable by the normal allocations, and so if we're basing our
calculations on summarized stats, it is indeed possible to get an OOM
in such a case. On Android kernels nobody noticed the issue as it is
used mainly on ARM targets, which might have CONFIG_ZONE_DMA=n.

So, this particular issue makes sense now.

Thanks!

--
Anton Vorontsov
Email: cbouatmailru [at] gmail
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


anton.vorontsov at linaro

May 8, 2012, 1:21 AM

Post #20 of 26 (282 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Tue, May 08, 2012 at 01:13:05AM -0700, Anton Vorontsov wrote:
> On Tue, May 08, 2012 at 03:16:59AM -0400, KOSAKI Motohiro wrote:
> [...]
> > >So, feel free to call me an idiot, but please expand your points a
> > >little bit or give a link to the discussion you're referring to?
> >
> > I don't think you are idiot. But I hope you test your patch before submitting.
> > That just don't work especially on x86. Because of, all x86 box have multiple zone
> > and summarized statistics (i.e. global_page_state() thing) don't work and can't
> > prevent oom nor swapping.
>
> Now I think I understand you: we don't take into account that e.g. DMA
> zone is not usable by the normal allocations, and so if we're basing our
> calculations on summarized stats, it is indeed possible to get an OOM
> in such a case.

Oops. Looking into it more, I think I was wrong here: kernel will surely
use pages from the DMA zone when we have no pages in normal zones.

So, I don't see how we can get OOM in that case.

Hm.

--
Anton Vorontsov
Email: cbouatmailru [at] gmail
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


minchan at kernel

May 8, 2012, 1:32 AM

Post #21 of 26 (280 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On 05/08/2012 04:11 PM, KOSAKI Motohiro wrote:

> On Tue, May 8, 2012 at 1:53 AM, Pekka Enberg <penberg [at] kernel> wrote:
>> On Tue, May 8, 2012 at 8:42 AM, KOSAKI Motohiro
>> <kosaki.motohiro [at] gmail> wrote:
>>>> That said, I think you are being unfair to Anton who's one of the few
>>>> that's actually taking the time to implement this properly instead of
>>>> settling for an out-of-tree hack.
>>>
>>> Unfair? But only I can talk about technical comment. To be honest, I
>>> really dislike
>>> I need say the same explanation again and again. A lot of people don't read
>>> past discussion. And as far as the patches take the same mistake, I must say
>>> the same thing. It is just PITA.
>>
>> Unfair because you are trying to make it look as if Anton is only
>> concerned with his specific use case. That's simply not true.
>
> However current proposal certainly don't refer past discuss and don't work
> many environment.
>
>
>> On Tue, May 8, 2012 at 8:42 AM, KOSAKI Motohiro
>> <kosaki.motohiro [at] gmail> wrote:
>>> I don't disagree vmevent notification itself, but I must disagree lie
>>> notification.
>>> And also, To make just idea statistics doesn't make sense at all. How do an
>>> application choose the right events? If that depend on hardware configuration,
>>> userland developers can't write proper applications.
>>
>> That's exactly the problem we're trying to tackle here! We _want_ the
>> ABI to provide sane, well-defined events that solve real world
>> problems.
>
> Ok, sane. Then I take my time a little and review current vmevent code briefly.
> (I read vmevent/core branch in pekka's tree. please let me know if
> there is newer
> repositry)
>
> I think following thing should be fixed.
>
> 1) sample_period is brain damaged idea. If people ONLY need to
> sampling stastics, they
> only need to read /proc/vmstat periodically. just remove it and
> implement push notification.
> _IF_ someone need unfrequent level trigger, just use
> "usleep(timeout); read(vmevent_fd)"
> on userland code.
> 2) VMEVENT_ATTR_STATE_ONE_SHOT is misleading name. That is effect as
> edge trigger
> shot. not only once.
> 3) vmevent_fd() seems sane interface. but it has name space unaware.
> maybe we discuss how
> to harmonize name space feature. No hurry. but we have to think
> that issue since at beginning.
> 4) Currently, vmstat have per-cpu batch and vmstat updating makes 3
> second delay at maximum.
> This is fine for usual case because almost userland watcher only
> read /proc/vmstat per second.
> But, for vmevent_fd() case, 3 seconds may be unacceptable delay. At
> worst, 128 batch x 4096
> x 4k pagesize = 2G bytes inaccurate is there.
> 5) __VMEVENT_ATTR_STATE_VALUE_WAS_LT should be removed from userland
> exporting files.
> When exporing kenrel internal, always silly gus used them and made unhappy.
> 6) Also vmevent_event must hide from userland.
> 7) vmevent_config::size must be removed. In 20th century, M$ API
> prefer to use this technique. But
> They dropped the way because a lot of application don't initialize
> size member and they can't use
> it for keeping upper compitibility.
> 8) memcg unaware
> 9) numa unaware
> 10) zone unaware


I would like to add a concern.

11) understand storage speed.

As I mentioned, system can have various storage type(SSD, disk, eMMC, ramfs)
In some system, user can tolerate ramfs and SSD write or swapout.
We should consdier that to make it really useful.

The problem is user can't know it in advance so it should be detected by kernel.
Unfortunately, it's not easy now.

The idea is that we can make some levels in advane and explain it to user

Level 1: It a immediate response to user when kernel decide there are not fast-reclaimable pages any more.
Level 2: It's rather slower response than level 1 but kernel will consider it as reclaimable target
Level 3: It's slowest response because kernel will consider page needed long time to reclaim as reclaimable target.

It doesn't expose any internal of kernel and can implment it in internal.
For simple example,

Level 1: non-mapped clean page
Level 2: Level 1 + mapped clean-page
Level 3: Level 2 + dirty pages

So users of vmevent_fd can select his level.
Of course, latency sensitive application with slow stoarge would select Level 1.
Some application might use Level 4(Level 3 + half of swap) because it has very fast storage.

And application receives event can make strategy folloing as.

When it receives level 1 notification, it could request to others if it can release their own buffers.
When it receives level 2 notification, it could request to suicide if it's not critical application.
When it receives level 3 notification, it could kill others.

It's a just example and my point is we should storage speed to make it general.

--
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


leonid.moiseichuk at nokia

May 8, 2012, 2:15 AM

Post #22 of 26 (282 views)
Permalink
RE: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

> -----Original Message-----
> From: penberg [at] gmail [mailto:penberg [at] gmail] On Behalf Of ext
> Pekka Enberg
> Sent: 08 May, 2012 11:03
> To: KOSAKI Motohiro
> Cc: Anton Vorontsov; Minchan Kim; Moiseichuk Leonid (Nokia-MP/Espoo); John
...
> >> That comes from a real-world requirement. See Leonid's email on the topic:
> >>
> >> https://lkml.org/lkml/2012/5/2/42
> >
> > I know, many embedded guys prefer such timer interval. I also have an
> > experience similar logic when I was TV box developer. but I must
> > disagree. Someone hope timer housekeeping complexity into kernel. but
> > I haven't seen any justification.
>
> Leonid?

The "usleep(timeout); read(vmevent_fd)" will eliminate opportunity to use vmevent API for mobile devices.
Developers already have to use heartbeat primitives to align/sync timers and update code which is not always simple to do.
But the idea is to have user-space wakeup only if we have something change in memory numbers, thus aligned timers will not help much in vmevent case due to memory situation may change a lot in short time.
Short depends from software stack but usually it below 1s. To have use-time and wakeups on good level (below 50Hz by e.g. powertop) and allow cpu switch off timers of such short period like 1s are not allowed.

Leonid
PS: Sorry, meetings prevent to do interesting things :( I am tracking conversation with quite low understanding how it will be useful for practical needs because user-space developers in 80% cases needs to track simply dirty memory changes i.e. modified pages which cannot be dropped.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


penberg at kernel

May 8, 2012, 2:19 AM

Post #23 of 26 (282 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Tue, May 8, 2012 at 12:15 PM, <leonid.moiseichuk [at] nokia> wrote:
> I am tracking conversation with quite low understanding how it will be useful for
> practical needs because user-space developers in 80% cases needs to track
> simply dirty memory changes i.e. modified pages which cannot be dropped.

The point is to support those cases in such a way that works sanely
across different architectures and configurations.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


penberg at kernel

May 8, 2012, 2:27 AM

Post #24 of 26 (259 views)
Permalink
Re: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

On Tue, May 8, 2012 at 11:32 AM, Minchan Kim <minchan [at] kernel> wrote:
> The idea is that we can make some levels in advane and explain it to user
>
> Level 1: It a immediate response to user when kernel decide there are not fast-reclaimable pages any more.
> Level 2: It's rather slower response than level 1 but kernel will consider it as reclaimable target
> Level 3: It's slowest response because kernel will consider page needed long time to reclaim as reclaimable target.
>
> It doesn't expose any internal of kernel and can implment it in internal.
> For simple example,
>
> Level 1: non-mapped clean page
> Level 2: Level 1 + mapped clean-page
> Level 3: Level 2 + dirty pages
>
> So users of vmevent_fd can select his level.

I'm totally OK with pursuing something like this if people like Leonid
and Anton think it's useful for their use-cases.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


leonid.moiseichuk at nokia

May 8, 2012, 3:38 AM

Post #25 of 26 (256 views)
Permalink
RE: [PATCH 3/3] vmevent: Implement special low-memory attribute [In reply to]

> -----Original Message-----
> From: penberg [at] gmail [mailto:penberg [at] gmail] On Behalf Of ext
> Pekka Enberg
> Sent: 08 May, 2012 12:20
...
> On Tue, May 8, 2012 at 12:15 PM, <leonid.moiseichuk [at] nokia> wrote:
> > I am tracking conversation with quite low understanding how it will be
> > useful for practical needs because user-space developers in 80% cases
> > needs to track simply dirty memory changes i.e. modified pages which cannot
> be dropped.
>
> The point is to support those cases in such a way that works sanely across
> different architectures and configurations.

Which usually means you need to increase level of abstraction from hugepages, swapped, kernel reclaimable, slab allocated, active memory to used, cache and must-have memory which are common for all platforms. Do you have visibility what do you need to show and how do you will calculate it? Does it possible to do for system, group of processes or particular process?

I implemented system-wide variant because that was a simplest one and most urgent I need. But e.g. to say how much memory consumed particular process in Linux you need to use smaps. Probably vmevent need to have some scratches (aka roadmap) into this direction as well.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

First page Previous page 1 2 Next page Last page  View All Linux kernel RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.