
Mailing List Archive: Linux: Kernel

[PATCH 0/4] promote zcache from staging

 

 



sjenning at linux

Jul 27, 2012, 11:18 AM

Post #1 of 36 (984 views)
[PATCH 0/4] promote zcache from staging

zcache is the remaining piece of code required to support in-kernel
memory compression. The other two features, cleancache and frontswap,
were promoted to mainline in 3.0 and 3.5, respectively. This patchset
promotes zcache from the staging tree to mainline.

Based on the level of activity and contributions we're seeing from a
diverse set of people and interests, I think zcache has matured to the
point where it makes sense to promote this out of staging.

Overview
========
zcache is a backend to frontswap and cleancache that accepts pages from
those mechanisms and compresses them, leading to reduced I/O caused by
swap and file re-reads. This is very valuable in shared storage situations
to reduce load on things like SANs. In the case of slow backing/swap
devices, zcache can also yield a performance gain.
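
To make the trade-off concrete, here is a minimal userspace sketch (not kernel code and not part of the patchset) of the core idea: a page is compressed and the compressed copy is kept only when it is smaller than the page itself. The 4096-byte page size, the use of zlib as the compressor, and the function names `compress_page` and `decompress_page` are all assumptions made for illustration.

```python
import os
import zlib
from typing import Optional

PAGE_SIZE = 4096  # assumed page size for this toy model


def compress_page(page: bytes) -> Optional[bytes]:
    """Return the compressed page, or None if compression saves nothing."""
    if len(page) != PAGE_SIZE:
        raise ValueError("expected exactly one page")
    compressed = zlib.compress(page)
    # Keeping the page in memory only pays off if the compressed copy is smaller.
    return compressed if len(compressed) < PAGE_SIZE else None


def decompress_page(compressed: bytes) -> bytes:
    """Recover the original page contents."""
    return zlib.decompress(compressed)


if __name__ == "__main__":
    repetitive = (b"zcache " * 600)[:PAGE_SIZE]  # highly compressible page
    random_like = os.urandom(PAGE_SIZE)           # effectively incompressible page
    for name, page in (("repetitive", repetitive), ("random", random_like)):
        stored = compress_page(page)
        size = len(stored) if stored is not None else PAGE_SIZE
        print(f"{name}: {PAGE_SIZE} -> {size} bytes")
```

Every page that compresses well and is later read back from memory is one read that never reaches the swap device or the filesystem, which is where the I/O saving comes from.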

In-Kernel Memory Compression Overview:

 swap subsystem           page cache
        +                      +
    frontswap             cleancache
        +                      +
zcache frontswap glue  zcache cleancache glue
        +                      +
        +---------+------------+
                  +
          zcache/tmem core
                  +
        +---------+------------+
        +                      +
     zsmalloc                 zbud

Everything below the frontswap/cleancache layer is currently inside the
zcache driver except for zsmalloc, which is shared between zcache and
another memory compression driver, zram.

Since zcache is dependent on zsmalloc, zsmalloc is also being promoted by
this patchset.

For information on zsmalloc and the rationale behind its design and use
cases versus already existing allocators in the kernel:

https://lkml.org/lkml/2012/1/9/386

zsmalloc is the allocator used by zcache to store persistent pages that
come from frontswap, as opposed to zbud, which is the (internal) allocator
used for ephemeral pages from cleancache.
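
The persistent/ephemeral split can be sketched with a small userspace toy model (not kernel code). It assumes the usual semantics: a cleancache-style page may be dropped at any time, so a later lookup can miss, while a frontswap-style page may be refused at store time but must remain retrievable once accepted. The class name `ToyBackend`, its methods, the fixed capacities, and the oldest-first eviction policy are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class ToyBackend:
    """Toy model: persistent pages must be kept, ephemeral pages may be dropped."""

    def __init__(self, ephemeral_capacity: int, persistent_capacity: int):
        self.ephemeral = OrderedDict()  # stands in for the zbud-backed pool
        self.persistent = {}            # stands in for the zsmalloc-backed pool
        self.ephemeral_capacity = ephemeral_capacity
        self.persistent_capacity = persistent_capacity

    def put_ephemeral(self, key, data: bytes) -> None:
        """Cleancache-style put: always accepted, oldest entry evicted if full."""
        self.ephemeral[key] = data
        self.ephemeral.move_to_end(key)
        while len(self.ephemeral) > self.ephemeral_capacity:
            self.ephemeral.popitem(last=False)

    def put_persistent(self, key, data: bytes) -> bool:
        """Frontswap-style put: may be refused, but is never dropped afterwards."""
        is_new = key not in self.persistent
        if is_new and len(self.persistent) >= self.persistent_capacity:
            return False  # caller falls back to the real swap device
        self.persistent[key] = data
        return True

    def get_ephemeral(self, key):
        """May return None: the page might have been evicted."""
        return self.ephemeral.get(key)

    def get_persistent(self, key) -> bytes:
        """Must succeed for every key whose put was accepted."""
        return self.persistent[key]
```

In this model a refused persistent put is harmless because the caller still has the page and can write it to the swap device, whereas silently losing an accepted persistent page would lose data. That asymmetry is why the two kinds of pages can reasonably be served by different allocators.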

zsmalloc uses many fields of the page struct to create its conceptual
high-order page called a zspage. Exactly which fields are used and for
what purpose is documented at the top of the zsmalloc .c file. Because
zsmalloc uses struct page extensively, Andrew advised that the
promotion location be mm/:

https://lkml.org/lkml/2012/1/20/308

Some benchmarking numbers demonstrating the I/O saving that can be had
with zcache:

https://lkml.org/lkml/2012/3/22/383

Dan's presentation at LSF/MM this year on zcache:

http://oss.oracle.com/projects/tmem/dist/documentation/presentations/LSFMM12-zcache-final.pdf

This patchset is based on next-20120727 + 3-part zsmalloc patchset below

https://lkml.org/lkml/2012/7/18/353

The zsmalloc patchset is already acked and will be integrated by Greg after
3.6-rc1 is out.

Seth Jennings (4):
zsmalloc: collapse internal .h into .c
zsmalloc: promote to mm/
drivers: add memory management driver class
zcache: promote to drivers/mm/

drivers/Kconfig | 2 +
drivers/Makefile | 1 +
drivers/mm/Kconfig | 13 ++
drivers/mm/Makefile | 1 +
drivers/{staging => mm}/zcache/Makefile | 0
drivers/{staging => mm}/zcache/tmem.c | 0
drivers/{staging => mm}/zcache/tmem.h | 0
drivers/{staging => mm}/zcache/zcache-main.c | 4 +-
drivers/staging/Kconfig | 4 -
drivers/staging/Makefile | 2 -
drivers/staging/zcache/Kconfig | 11 --
drivers/staging/zram/zram_drv.h | 3 +-
drivers/staging/zsmalloc/Kconfig | 10 --
drivers/staging/zsmalloc/Makefile | 3 -
drivers/staging/zsmalloc/zsmalloc_int.h | 149 --------------------
.../staging/zsmalloc => include/linux}/zsmalloc.h | 0
mm/Kconfig | 18 +++
mm/Makefile | 1 +
.../zsmalloc/zsmalloc-main.c => mm/zsmalloc.c | 133 ++++++++++++++++-
19 files changed, 170 insertions(+), 185 deletions(-)
create mode 100644 drivers/mm/Kconfig
create mode 100644 drivers/mm/Makefile
rename drivers/{staging => mm}/zcache/Makefile (100%)
rename drivers/{staging => mm}/zcache/tmem.c (100%)
rename drivers/{staging => mm}/zcache/tmem.h (100%)
rename drivers/{staging => mm}/zcache/zcache-main.c (99%)
delete mode 100644 drivers/staging/zcache/Kconfig
delete mode 100644 drivers/staging/zsmalloc/Kconfig
delete mode 100644 drivers/staging/zsmalloc/Makefile
delete mode 100644 drivers/staging/zsmalloc/zsmalloc_int.h
rename {drivers/staging/zsmalloc => include/linux}/zsmalloc.h (100%)
rename drivers/staging/zsmalloc/zsmalloc-main.c => mm/zsmalloc.c (86%)

--
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


dan.magenheimer at oracle

Jul 27, 2012, 12:21 PM

Post #2 of 36 (968 views)
RE: [PATCH 0/4] promote zcache from staging [In reply to]

> From: Seth Jennings [mailto:sjenning [at] linux]
> Subject: [PATCH 0/4] promote zcache from staging
>
> zcache is the remaining piece of code required to support in-kernel
> memory compression. The other two features, cleancache and frontswap,
> have been promoted to mainline in 3.0 and 3.5. This patchset
> promotes zcache from the staging tree to mainline.
>
> Based on the level of activity and contributions we're seeing from a
> diverse set of people and interests, I think zcache has matured to the
> point where it makes sense to promote this out of staging.

Hi Seth --

Per offline communication, I'd like to see this delayed for three
reasons:

1) I've completely rewritten zcache and will post the rewrite soon.
   The redesigned code fixes many of the weaknesses in zcache that
   make it (IMHO) unsuitable for an enterprise distro. (Some of
   these were previously discussed on linux-mm [1].)
2) zcache is truly mm (memory management) code, and the fact that
   it is in drivers at all was purely for logistical reasons
   (e.g. the only in-tree "staging" is in the drivers directory).
   My rewrite promotes it to (a subdirectory of) mm where IMHO it
   belongs.
3) Ramster heavily duplicates code from zcache. My rewrite resolves
   this. My soon-to-be-posted rewrite also places the refactored
   ramster in mm, though with some minor work zcache could go in mm
   and ramster could stay in staging.

Let's have this discussion, but unless the community decides
otherwise, please consider this a NACK.

Thanks,
Dan

[1] http://marc.info/?t=133886706700002&r=1&w=2


konrad at darnok

Jul 27, 2012, 1:59 PM

Post #3 of 36 (973 views)
Re: [PATCH 0/4] promote zcache from staging [In reply to]

On Fri, Jul 27, 2012 at 12:21:50PM -0700, Dan Magenheimer wrote:
> Hi Seth --
>
> Per offline communication, I'd like to see this delayed for three
> reasons:
>
> 1) I've completely rewritten zcache and will post the rewrite soon.
>    The redesigned code fixes many of the weaknesses in zcache that
>    make it (IMHO) unsuitable for an enterprise distro. (Some of
>    these were previously discussed on linux-mm [1].)
> 2) zcache is truly mm (memory management) code, and the fact that
>    it is in drivers at all was purely for logistical reasons
>    (e.g. the only in-tree "staging" is in the drivers directory).
>    My rewrite promotes it to (a subdirectory of) mm where IMHO it
>    belongs.
> 3) Ramster heavily duplicates code from zcache. My rewrite resolves
>    this. My soon-to-be-posted rewrite also places the refactored
>    ramster in mm, though with some minor work zcache could go in mm
>    and ramster could stay in staging.
>
> Let's have this discussion, but unless the community decides
> otherwise, please consider this a NACK.

Hold on, that is rather unfair. The zcache has been in staging
for quite some time - your code has not been posted. Part of
"unstaging" a driver is for folks to review the code - and you
just said "No, mine is better" without showing your goods.

There is a third option - which is to continue the promotion
of zcache from staging, get reviews, work on them, etc., and
alongside of that you can work on fixing up (or ripping out)
zcache1 with zcache2 components as they make sense. Or even
having two of them - an enterprise and an embedded version
that will eventually get merged together. There is nothing
wrong with modifying a driver once it has left staging.


dan.magenheimer at oracle

Jul 27, 2012, 2:42 PM

Post #4 of 36 (968 views)
RE: [PATCH 0/4] promote zcache from staging [In reply to]

> From: Konrad Rzeszutek Wilk [mailto:konrad [at] darnok]
> Sent: Friday, July 27, 2012 3:00 PM
> Subject: Re: [PATCH 0/4] promote zcache from staging

Hi Konrad --

> Hold on, that is rather unfair. The zcache has been in staging
> for quite some time - your code has not been posted. Part of
> "unstaging" a driver is for folks to review the code - and you
> just said "No, mine is better" without showing your goods.

Sorry, I'm not trying to be unfair. However, I don't see the point
of promoting zcache out of staging unless it is intended to be used
by real users in a real distro. There's been a lot of discussion,
onlist and offlist, about what needs to be fixed in zcache and not
much visible progress on fixing it. But fixing it is where I've spent
most of my time over the last couple of months.

If IBM or some other company or distro is eager to ship and support
zcache in its current form, I agree that "promote now, improve later"
is a fine approach. But promoting zcache out of staging simply because
there is urgency to promote zsmalloc+zram out of staging doesn't
seem wise. At a minimum, it distracts reviewers/effort from what IMHO
is required to turn zcache into an enterprise-ready kernel feature.

I can post my "goods" anytime. In its current form it is better
than the zcache in staging (and, please remember, I wrote both so
I think I am in a good position to compare the two).
I have been waiting until I think the new zcache is feature complete
before asking for review, especially since the newest features
should demonstrate clearly why the rewrite is necessary and
beneficial. But I can post* my current bits if people don't
believe they exist and/or don't mind reviewing non-final code.
(* Or I can put them in a publicly available git tree.)

> There is a third option - which is to continue the promotion
> of zcache from staging, get reviews, work on them, etc., and
> alongside of that you can work on fixing up (or ripping out)
> zcache1 with zcache2 components as they make sense. Or even
> having two of them - an enterprise and an embedded version
> that will eventually get merged together. There is nothing
> wrong with modifying a driver once it has left staging.

Minchan and Seth can correct me if I am wrong, but I believe
zram+zsmalloc, not zcache, is the target solution for embedded.
The limitations of zsmalloc aren't an issue for zram but they are
for zcache, and this deficiency was one of the catalysts for the
rewrite. The issues are explained in more detail in [1],
but if any point isn't clear, I'd be happy to explain further.

However, I have limited time for this right now and I'd prefer
to spend it finishing the code. :-}

So, as I said, I am still a NACK, but if there are good reasons
to duplicate effort and pursue the "third option", let's discuss
them.

Thanks,
Dan

[1] http://marc.info/?t=133886706700002&r=1&w=2


minchan at kernel

Jul 28, 2012, 6:54 PM

Post #5 of 36 (974 views)
Re: [PATCH 0/4] promote zcache from staging [In reply to]

On Fri, Jul 27, 2012 at 02:42:14PM -0700, Dan Magenheimer wrote:
> Hi Konrad --
>
> > Hold on, that is rather unfair. The zcache has been in staging
> > for quite some time - your code has not been posted. Part of
> > "unstaging" a driver is for folks to review the code - and you
> > just said "No, mine is better" without showing your goods.
>
> Sorry, I'm not trying to be unfair. However, I don't see the point
> of promoting zcache out of staging unless it is intended to be used
> by real users in a real distro. There's been a lot of discussion,
> onlist and offlist, about what needs to be fixed in zcache and not
> much visible progress on fixing it. But fixing it is where I've spent
> most of my time over the last couple of months.
>
> If IBM or some other company or distro is eager to ship and support
> zcache in its current form, I agree that "promote now, improve later"
> is a fine approach. But promoting zcache out of staging simply because
> there is urgency to promote zsmalloc+zram out of staging doesn't
> seem wise. At a minimum, it distracts reviewers/effort from what IMHO
> is required to turn zcache into an enterprise-ready kernel feature.
>
> I can post my "goods" anytime. In its current form it is better
> than the zcache in staging (and, please remember, I wrote both so
> I think I am in a good position to compare the two).
> I have been waiting until I think the new zcache is feature complete
> before asking for review, especially since the newest features
> should demonstrate clearly why the rewrite is necessary and
> beneficial. But I can post* my current bits if people don't
> believe they exist and/or don't mind reviewing non-final code.
> (* Or I can put them in a publicly available git tree.)
>
> > There is a third option - which is to continue the promotion
> > of zcache from staging, get reviews, work on them, etc., and
> > alongside of that you can work on fixing up (or ripping out)
> > zcache1 with zcache2 components as they make sense. Or even
> > having two of them - an enterprise and an embedded version
> > that will eventually get merged together. There is nothing
> > wrong with modifying a driver once it has left staging.
>
> Minchan and Seth can correct me if I am wrong, but I believe
> zram+zsmalloc, not zcache, is the target solution for embedded.

NOT true. Some embedded devices use zcache, though not the original
zcache but a modified one.
Anyway, although embedded people use a modified zcache, I am biased toward Dan.
I admit I haven't spent a lot of time looking at zcache, but from looking at
the code, it wasn't in good shape and even had a bug found during code review,
and I felt strongly that we should clean it up before promoting it to mm/.
So I would like to wait for Dan's posting if you guys are not in a hurry.
(And I am not sure akpm would allow it with the current shape of the zcache code.)
But my concern is about adding new features. I guess there might be a long
debate over them, and that could block promotion again.
I think that's not what Seth wants.
I hope Dan doesn't mix the cleanup series and the new feature series, and
posts the cleanup series as soon as possible. So let's clean up first and
try to promote it; adding new features or changing the algorithm
can come later.


> The limitations of zsmalloc aren't an issue for zram but they are
> for zcache, and this deficiency was one of the catalysts for the
> rewrite. The issues are explained in more detail in [1],
> but if any point isn't clear, I'd be happy to explain further.
>
> However, I have limited time for this right now and I'd prefer
> to spend it finishing the code. :-}
>
> So, as I said, I am still a NACK, but if there are good reasons
> to duplicate effort and pursue the "third option", let's discuss
> them.
>
> Thanks,
> Dan
>
> [1] http://marc.info/?t=133886706700002&r=1&w=2
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo [at] kvack For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont [at] kvack"> email [at] kvack </a>

--
Kind regards,
Minchan Kim


minchan at kernel

Jul 28, 2012, 7:20 PM

Post #6 of 36 (968 views)
Re: [PATCH 0/4] promote zcache from staging [In reply to]

Hi Seth,

Promoting zcache out of staging is rather controversial, as you can see in
this thread. But I believe zram is very mature and its code and comments are
clean. In addition, it has lots of real customers on the embedded side, so
IMHO it would be easy to promote it first. Of course, that would also promote
zsmalloc, which is half of what you want. What do you think? If you agree,
could you do that first? If you don't want to, and promoting zcache continues
to be controversial, I will do that after my vacation.

Thanks.


--
Kind regards,
Minchan Kim


sjenning at linux

Jul 30, 2012, 12:19 PM

Post #7 of 36 (959 views)
Re: [PATCH 0/4] promote zcache from staging [In reply to]

Dan,

I started writing inline responses to each concern but that
was adding more confusion than clarity. I would like to
focus the discussion.

The purpose of this patchset is to discuss the inclusion of
zcache into mainline during the 3.7 merge window. zcache
has been in staging since v2.6.39 and has been maturing with
contributions from 15 developers (5 with multiple commits)
working on improvements and bug fixes.

I want good code in the kernel, so if there are particular
areas that need attention before it's of acceptable quality
for mainline we need that discussion. I am eager to have
customers using memory compression with zcache but before
that I want to see zcache in mainline.

We agree with Konrad that zcache should be promoted before
additional features are included. Greg has also expressed
that he would like promotion before attempting to add
additional features [1]. Including new features now, while
in the staging tree, adds to the complexity and difficulty
of reverifying zcache and getting it accepted into mainline.

[1] https://lkml.org/lkml/2012/3/16/472

Let's have this discussion. If there are specific issues
that need to be addressed to get this ready for mainline
let's take them one-by-one and line-by-line with patches.

Seth



dan.magenheimer at oracle

Jul 30, 2012, 1:48 PM

Post #8 of 36 (959 views)
RE: [PATCH 0/4] promote zcache from staging [In reply to]

> From: Seth Jennings [mailto:sjenning [at] linux]
> Subject: Re: [PATCH 0/4] promote zcache from staging
>
> Dan,
>
> I started writing inline responses to each concern but that
> was adding more confusion than clarity. I would like to
> focus the discussion.
> :
> Let's have this discussion. If there are specific issues
> that need to be addressed to get this ready for mainline
> let's take them one-by-one and line-by-line with patches.

Hi Seth --

Thanks for your response and for your passion.

The first discussion, I think, is about whether zsmalloc is
a suitable allocator for zcache. In its current state
in staging, zcache uses zbud for ephemeral (cleancache)
zpages and zsmalloc for persistent (frontswap) zpages.
I have raised concerns on-list that the capabilities
provided by zsmalloc are not suitable for supporting zcache
in an enterprise distro. The author of zsmalloc concurred
and has (at least so far) not been available to enhance
zsmalloc, and you have taken a strong position that zsmalloc
needed to be "generic" (i.e. it will never deliver the functionality
that IMHO is necessary for zcache). So I have rewritten zbud to
handle both kinds of zpages and, at the same time, to
resolve my stated issues. This is the bulk of my
major rewrite... I don't think constructing and reviewing
a long series of one-by-one and line-by-line patches is
of much value here, especially since the current code is
in staging. We either (1) use the now rewritten zbud, (2) wait
until someone rewrites zsmalloc, or (3) accept the deficiencies
of zcache in its current form.

The second discussion is whether ramster, as a "user" of
zcache, is relevant. As you know, ramster is built on
top of zcache but requires a fair number of significant
changes that, due to gregkh's restriction, could not be
made directly to zcache while in staging. In my rewrite,
I've taken a great deal of care that the "new" zcache
cleanly supports both. While some couldn't care less about
ramster, the next step of ramster may be of more interest
to a broader part of the community. So I am eager to
ensure that the core zcache code in zcache and ramster
doesn't need to "fork" again. The zcache-main.c in staging/ramster
is farther along than the zcache-main.c in staging/zcache, but
IMHO my rewrite is better and cleaner than either.

Most of the rest of the cleanup, such as converting to debugfs
instead of sysfs, could be done as a sequence of one-by-one
and line-by-line patches. I think we agree that zcache will
not be promoted unless this change is made, but IMHO constructing
and reviewing patches individually is not of much value since
the above zbud and ramster changes already result in a major
rewrite. I think the community would benefit most from a new
solid code foundation for zcache, and reviewers' time (and your
time and mine) would be better spent grokking the new code than
reviewing a very long sequence of cleanup patches.

> The purpose of this patchset is to discuss the inclusion of
> zcache into mainline during the 3.7 merge window. zcache
> has been in staging since v2.6.39 and has been maturing with
> contributions from 15 developers (5 with multiple commits)
> working on improvements and bug fixes.
>
> I want good code in the kernel, so if there are particular
> areas that need attention before it's of acceptable quality
> for mainline we need that discussion. I am eager to have
> customers using memory compression with zcache but before
> that I want to see zcache in mainline.

I think we are all eager to achieve the end result: real users
using zcache in real production systems. IMHO your suggested
path will not achieve that, certainly not in the 3.7 timeframe.
The current code (IMHO) is neither suitable for promotion, nor
functionally capable of taking the beating of an enterprise distro.

> We agree with Konrad that zcache should be promoted before
> additional features are included. Greg has also expressed
> that he would like promotion before attempting to add
> additional features [1]. Including new features now, while
> in the staging tree, adds to the complexity and difficulty
> of reverifying zcache and getting it accepted into mainline.
>
> [1] https://lkml.org/lkml/2012/3/16/472 still in staging.

Zcache as submitted to staging in 2.6.39 was (and is) a working
proof-of-concept. As you know, Greg's position created a
"catch 22"... zcache in its current state isn't good enough
to be promoted, but we can't change it substantially to resolve
its deficiencies while it is still in staging. (Minchan
recently stated that he doesn't think it is in good enough
shape to be approved by Andrew, and I agree.) That's why I
embarked on the rewrite.

Lastly, I'm not so much "adding new features" as ensuring the
new zcache foundation will be sufficient to support enterprise
users. But I do now agree with Minchan (and I think with you)
that I need to post where I'm at, even if I am not 100% ready or
satisfied. I'll try to do that by the end of the week.

Thanks,
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


konrad.wilk at oracle

Jul 31, 2012, 8:36 AM

Post #9 of 36 (952 views)
Re: [PATCH 0/4] promote zcache from staging [In reply to]

On Sun, Jul 29, 2012 at 10:54:28AM +0900, Minchan Kim wrote:
> On Fri, Jul 27, 2012 at 02:42:14PM -0700, Dan Magenheimer wrote:
> > > From: Konrad Rzeszutek Wilk [mailto:konrad [at] darnok]
> > > Sent: Friday, July 27, 2012 3:00 PM
> > > Subject: Re: [PATCH 0/4] promote zcache from staging
> > >
> > > On Fri, Jul 27, 2012 at 12:21:50PM -0700, Dan Magenheimer wrote:
> > > > > From: Seth Jennings [mailto:sjenning [at] linux]
> > > > > Subject: [PATCH 0/4] promote zcache from staging
> > > > >
> > > > > zcache is the remaining piece of code required to support in-kernel
> > > > > memory compression. The other two features, cleancache and frontswap,
> > > > > have been promoted to mainline in 3.0 and 3.5. This patchset
> > > > > promotes zcache from the staging tree to mainline.
> > > > >
> > > > > Based on the level of activity and contributions we're seeing from a
> > > > > diverse set of people and interests, I think zcache has matured to the
> > > > > point where it makes sense to promote this out of staging.
> > > >
> > > > Hi Seth --
> > > >
> > > > Per offline communication, I'd like to see this delayed for three
> > > > reasons:
> > > >
> > > > 1) I've completely rewritten zcache and will post the rewrite soon.
> > > > The redesigned code fixes many of the weaknesses in zcache that
> > > > makes it (IMHO) unsuitable for an enterprise distro. (Some of
> > > > these previously discussed in linux-mm [1].)
> > > > 2) zcache is truly mm (memory management) code and the fact that
> > > > it is in drivers at all was purely for logistical reasons
> > > > (e.g. the only in-tree "staging" is in the drivers directory).
> > > > My rewrite promotes it to (a subdirectory of) mm where IMHO it
> > > > belongs.
> > > > 3) Ramster heavily duplicates code from zcache. My rewrite resolves
> > > > this. My soon-to-be-post also places the re-factored ramster
> > > > in mm, though with some minor work zcache could go in mm and
> > > > ramster could stay in staging.
> > > >
> > > > Let's have this discussion, but unless the community decides
> > > > otherwise, please consider this a NACK.
> >
> > Hi Konrad --
> >
> > > Hold on, that is rather unfair. The zcache has been in staging
> > > for quite some time - your code has not been posted. Part of
> > > "unstaging" a driver is for folks to review the code - and you
> > > just said "No, mine is better" without showing your goods.
> >
> > Sorry, I'm not trying to be unfair. However, I don't see the point
> > of promoting zcache out of staging unless it is intended to be used
> > by real users in a real distro. There's been a lot of discussion,
> > onlist and offlist, about what needs to be fixed in zcache and not
> > much visible progress on fixing it. But fixing it is where I've spent
> > most of my time over the last couple of months.
> >
> > If IBM or some other company or distro is eager to ship and support
> > zcache in its current form, I agree that "promote now, improve later"
> > is a fine approach. But promoting zcache out of staging simply because
> > there is urgency to promote zsmalloc+zram out of staging doesn't
> > seem wise. At a minimum, it distracts reviewers/effort from what IMHO
> > is required to turn zcache into an enterprise-ready kernel feature.
> >
> > I can post my "goods" anytime. In its current form it is better
> > than the zcache in staging (and, please remember, I wrote both so
> > I think I am in a good position to compare the two).
> > I have been waiting until I think the new zcache is feature complete
> > before asking for review, especially since the newest features
> > should demonstrate clearly why the rewrite is necessary and
> > beneficial. But I can post* my current bits if people don't
> > believe they exist and/or don't mind reviewing non-final code.
> > (* Or I can put them in a publicly available git tree.)
> >
> > > There is a third option - which is to continue the promotion
> > > of zcache from staging, get reviews, work on them ,etc, and
> > > alongside of that you can work on fixing up (or ripping out)
> > > zcache1 with zcache2 components as they make sense. Or even
> > > having two of them - an enterprise and an embedded version
> > > that will eventually get merged together. There is nothing
> > > wrong with modifying a driver once it has left staging.
> >
> > Minchan and Seth can correct me if I am wrong, but I believe
> > zram+zsmalloc, not zcache, is the target solution for embedded.
>
> > NOT true. Some embedded devices use zcache, though not the original
> > zcache but a modified one.

What kind of modifications? Would it make sense to post the patches
for those modifications?

> Anyway, although embedded people use modified zcache, I am biased to Dan.
> I admit I don't spend lots of time to look zcache but as looking the
> code, it wasn't good shape and even had a bug found during code review
> and I felt strongly we should clean up it for promoting it to mm/.

Do you recall what the bugs were?

> So I would like to wait Dan's posting if you guys are not urgent.
> (And I am not sure akpm allow it with current shape of zcache code.)
> But the concern is about adding new feature. I guess there might be some
> debate for long time and it can prevent promoting again.
> I think It's not what Seth want.
> I hope Dan doesn't mix clean up series and new feature series and
> post clean up series as soon as possible so let's clean up first and
> try to promote it and later, adding new feature or changing algorithm
> is desirable.
>
>
> > The limitations of zsmalloc aren't an issue for zram but they are
> > for zcache, and this deficiency was one of the catalysts for the
> > rewrite. The issues are explained in more detail in [1],
> > but if any point isn't clear, I'd be happy to explain further.
> >
> > However, I have limited time for this right now and I'd prefer
> > to spend it finishing the code. :-}
> >
> > So, as I said, I am still a NACK, but if there are good reasons
> > to duplicate effort and pursue the "third option", let's discuss
> > them.
> >
> > Thanks,
> > Dan
> >
> > [1] http://marc.info/?t=133886706700002&r=1&w=2
> >
> > --
> > To unsubscribe, send a message with 'unsubscribe linux-mm' in
> > the body to majordomo [at] kvack For more info on Linux MM,
> > see: http://www.linux-mm.org/ .
> > Don't email: <a href=mailto:"dont [at] kvack"> email [at] kvack </a>
>
> --
> Kind regards,
> Minchan Kim


konrad.wilk at oracle

Jul 31, 2012, 8:58 AM

Post #10 of 36 (957 views)
Re: [PATCH 0/4] promote zcache from staging [In reply to]

On Mon, Jul 30, 2012 at 01:48:29PM -0700, Dan Magenheimer wrote:
> > From: Seth Jennings [mailto:sjenning [at] linux]
> > Subject: Re: [PATCH 0/4] promote zcache from staging
> >
> > Dan,
> >
> > I started writing inline responses to each concern but that
> > was adding more confusion than clarity. I would like to
> > focus the discussion.
> > :
> > Let's have this discussion. If there are specific issues
> > that need to be addressed to get this ready for mainline
> > let's take them one-by-one and line-by-line with patches.
>
> Hi Seth --
>
> Thanks for your response and for your passion.
>
> The first discussion I think is about whether zsmalloc is
> a suitable allocator for zcache. In its current state
> in staging, zcache uses zbud for ephemeral (cleancache)
> zpages and zsmalloc for persistent (frontswap) zpages.

OK, but - unstaging 'zsmalloc' is a different patchset.

> I have proposed concerns on-list that the capabilities
> provided by zsmalloc are not suitable for supporting zcache
> in an enterprise distro. The author of zsmalloc concurred

The goal is to support _both_ enterprise and embedded.

But what you are saying sounds like it does not work in an enterprise
environment - which is not my experience? If you are saying that
the code should not be integrated before it works in both classes
perfectly - well, then a lot of other code in Linux should not
have been accepted - and I would call those insufficiencies "bugs".
And bugs are .. natural, albeit pesky.

I think what you are saying by "enterprise" is that you want
it to be enabled by default (or at least be confident that it
can be done so) for everybody and that it work quite well under
99% of workloads. While right now it covers only 98% (or some
other number) of workloads.

> and has (at least so far) not been available to enhance
> zsmalloc, and you have taken a strong position that zsmalloc
> needed to be "generic" (i.e. will never deliver the functionality
> IMHO is necessary for zcache). So I have rewritten zbud to
> handle both kinds of zpages and, at the same time, to
> resolve my stated issues. This is the bulk of my

Ok, so zbud rewrite is to remove the need for zsmalloc
and use zbud2 for both persistent and ephemeral pages.

> major rewrite... I don't think constructing and reviewing
> a long series of one-by-one and line-by-line patches is
> of much value here, especially since the current code is
> in staging. We either (1) use the now rewritten zbud (2) wait
> until someone rewrites zsmalloc (3) accept the deficiencies
> of zcache in its current form.

This sounds like a Catch-22 :-) Greg would like to have the
TODO list finished - and it seems that one of the todo's
is to have one instead of two engines for dealing with pages.
But at the same time not adding in new features.

>
> The second discussion is whether ramster, as a "user" of
> zcache, is relevant. As you know, ramster is built on
> top of zcache but requires a fair number of significant
> changes that, due to gregkh's restriction, could not be
> made directly to zcache while in staging. In my rewrite,
> I've taken a great deal of care that the "new" zcache
> cleanly supports both. While some couldn't care less about
> ramster, the next step of ramster may be of more interest
> to a broader part of the community. So I am eager to
> ensure that the core zcache code in zcache and ramster
> doesn't need to "fork" again. The zcache-main.c in staging/ramster
> is farther along than the zcache-main.c in staging/zcache, but
> IMHO my rewrite is better and cleaner than either.

So in short you made zcache more modular?

>
> Most of the rest of the cleanup, such as converting to debugfs
> instead of sysfs, could be done as a sequence of one-by-one
> and line-by-line patches. I think we agree that zcache will

Sure. Though you could do it more wholesale: sysfs->debugfs patch.

> not be promoted unless this change is made, but IMHO constructing
> and reviewing patches individually is not of much value since
> the above zbud and ramster changes already result in a major
> rewrite. I think the community would benefit most from a new
> solid code foundation for zcache and reviewers time (and your
> time and mine) would best be spent grokking the new code than
> from reviewing a very long sequence of cleanup patches.

You are ignoring the goodness of the testing and performance
numbers that zcache has gotten so far. With new code those
numbers are invalidated. That is throwing away some good data.
Reviewing code based on the old code (and knowing how the
old code works) I think is easier than trying to understand new
code from scratch - at least one has a baseline to understand it.
But that might be just my opinion - either way I am OK
looking at brand new code or old code - but I would end up
looking at the old code to answer those : "Huh. I wonder how
we did that previously." - at which point it might make sense
just to have the patches broken up in small segments.

>
> > The purpose of this patchset is to discuss the inclusion of
> > zcache into mainline during the 3.7 merge window. zcache
> > has been a staging since v2.6.39 and has been maturing with
> > contributions from 15 developers (5 with multiple commits)
> > working on improvements and bug fixes.
> >
> > I want good code in the kernel, so if there are particular
> > areas that need attention before it's of acceptable quality
> > for mainline we need that discussion. I am eager to have
> > customers using memory compression with zcache but before
> > that I want to see zcache in mainline.
>
> I think we are all eager to achieve the end result: real users
> using zcache in real production systems. IMHO your suggested
> path will not achieve that, certainly not in the 3.7 timeframe.
> The current code (IMHO) is neither suitable for promotion, nor
> functionally capable of taking the beating of an enterprise distro.

So we agree that it must be fixed, but disagree on how to fix it :-)

However there are real folks in the embedded env that use it (with
some modifications) - please don't call them "unreal users".

>
> > We agree with Konrad that zcache should be promoted before
> > additional features are included. Greg has also expressed
> > that he would like promotion before attempting to add
> > additional features [1]. Including new features now, while
> > in the staging tree, adds to the complexity and difficultly
> > of reverifying zcache and getting it accepted into mainline.
> >
> > [1] https://lkml.org/lkml/2012/3/16/472 still in staging.
>
> Zcache as submitted to staging in 2.6.39 was (and is) a working
> proof-of-concept. As you know, Greg's position created a
> "catch 22"... zcache in its current state isn't good enough
> to be promoted, but we can't change it substantially to resolve
> its deficiencies while it is still in staging. (Minchan
> recently stated that he doesn't think it is in good enough
> shape to be approved by Andrew, and I agree.) That's why I
> embarked on the rewrite.
>
> Lastly, I'm not so much "adding new features" as ensuring the
> new zcache foundation will be sufficient to support enterprise
> users. But I do now agree with Minchan (and I think with you)
> that I need to post where I'm at, even if I am not 100% ready or
> satisfied. I'll try to do that by the end of the week.

So in my head I feel that it is OK to:
1) address the concerns that zcache has before it is unstaged
2) replace the two-engine system with a one-engine system
(and see how well it behaves)
3) sysfs->debugfs as needed
4) other things as needed

I think we are getting hung up on what Greg said about adding features,
and the two-engine->one-engine change could be understood as that.
I think, though, that it is part of the staging effort to clean up the
existing issues. Let's see what Greg thinks.


gregkh at linuxfoundation

Jul 31, 2012, 9:19 AM

Post #11 of 36 (956 views)
Re: [PATCH 0/4] promote zcache from staging [In reply to]

On Tue, Jul 31, 2012 at 11:58:43AM -0400, Konrad Rzeszutek Wilk wrote:
> So in my head I feel that it is Ok to:
> 1) address the concerns that zcache has before it is unstaged
> 2) rip out the two-engine system with a one-engine system
> (and see how well it behaves)
> 3) sysfs->debugfs as needed
> 4) other things as needed
>
> I think we are getting hung-up what Greg said about adding features
> and the two-engine->one engine could be understood as that.
> While I think that is part of a staging effort to clean up the
> existing issues. Lets see what Greg thinks.

Greg has no idea, except I want to see the needed fixups happen before
new features get added. Add the new features _after_ it is out of
staging.

greg k-h


konrad.wilk at oracle

Jul 31, 2012, 10:51 AM

Post #12 of 36 (960 views)
Re: [PATCH 0/4] promote zcache from staging [In reply to]

On Tue, Jul 31, 2012 at 09:19:16AM -0700, Greg Kroah-Hartman wrote:
> On Tue, Jul 31, 2012 at 11:58:43AM -0400, Konrad Rzeszutek Wilk wrote:
> > So in my head I feel that it is Ok to:
> > 1) address the concerns that zcache has before it is unstaged
> > 2) rip out the two-engine system with a one-engine system
> > (and see how well it behaves)
> > 3) sysfs->debugfs as needed
> > 4) other things as needed
> >
> > I think we are getting hung-up what Greg said about adding features
> > and the two-engine->one engine could be understood as that.
> > While I think that is part of a staging effort to clean up the
> > existing issues. Lets see what Greg thinks.
>
> Greg has no idea, except I want to see the needed fixups happen before
> new features get added. Add the new features _after_ it is out of
> staging.

I think we (that is me, Seth, Minchan, Dan) need to talk to have a good
understanding of what each of us thinks are fixups.

Would Monday Aug 6th at 1pm EST on irc.freenode.net channel #zcache work
for people?

>
> greg k-h


sjenning at linux

Jul 31, 2012, 11:19 AM

Post #13 of 36 (956 views)
Re: [PATCH 0/4] promote zcache from staging [In reply to]

On 07/31/2012 12:51 PM, Konrad Rzeszutek Wilk wrote:
> Would Monday Aug 6th at 1pm EST on irc.freenode.net channel #zcache work
> for people?

I think this is a great idea!

Dan, can you post code as an RFC by tomorrow or Thursday?
We (Rob and I) have the Texas Linux Fest starting Friday.
We need time to review the code prior to chat so that we can
talk about specifics rather than generalities.

If that can be done, then we are available for the chat on
Monday.

Seth



minchan at kernel

Aug 5, 2012, 5:38 PM

Post #14 of 36 (922 views)
Re: [PATCH 0/4] promote zcache from staging [In reply to]

Hi Konrad,

On Tue, Jul 31, 2012 at 01:51:42PM -0400, Konrad Rzeszutek Wilk wrote:
> On Tue, Jul 31, 2012 at 09:19:16AM -0700, Greg Kroah-Hartman wrote:
> > On Tue, Jul 31, 2012 at 11:58:43AM -0400, Konrad Rzeszutek Wilk wrote:
> > > So in my head I feel that it is Ok to:
> > > 1) address the concerns that zcache has before it is unstaged
> > > 2) rip out the two-engine system with a one-engine system
> > > (and see how well it behaves)
> > > 3) sysfs->debugfs as needed
> > > 4) other things as needed
> > >
> > > I think we are getting hung-up what Greg said about adding features
> > > and the two-engine->one engine could be understood as that.
> > > While I think that is part of a staging effort to clean up the
> > > existing issues. Lets see what Greg thinks.
> >
> > Greg has no idea, except I want to see the needed fixups happen before
> > new features get added. Add the new features _after_ it is out of
> > staging.
>
> I think we (that is me, Seth, Minchan, Dan) need to talk to have a good
> understanding of what each of us thinks are fixups.
>
> Would Monday Aug 6th at 1pm EST on irc.freenode.net channel #zcache work
> for people?

1pm EST is 2am KST (Korea Standard Time), so it's not good for me. :)
I know it's hard to adjust the time for me, so please talk without
me. Instead, I will write down my requirements. They are very simple and
trivial.

1) Please don't add any new features, like replacing zsmalloc with zbud.
That is totally untested, so it needs more time, from a stability point of
view, for bugs and performance/fragmentation.

2) Factor out common code between zcache and ramster. It should be just
code cleanup and should not change current behavior.

3) Add lots of comments to public functions.

4) Make function/variable names clearer.

These are necessary for promotion; after promotion,
let's talk about great new features.


>
> >
> > greg k-h
>

--
Kind regards,
Minchan Kim


minchan at kernel

Aug 5, 2012, 9:49 PM

Post #15 of 36 (918 views)
Re: [PATCH 0/4] promote zcache from staging [In reply to]

On Tue, Jul 31, 2012 at 11:36:04AM -0400, Konrad Rzeszutek Wilk wrote:
> On Sun, Jul 29, 2012 at 10:54:28AM +0900, Minchan Kim wrote:
> > On Fri, Jul 27, 2012 at 02:42:14PM -0700, Dan Magenheimer wrote:
> > > > From: Konrad Rzeszutek Wilk [mailto:konrad [at] darnok]
> > > > Sent: Friday, July 27, 2012 3:00 PM
> > > > Subject: Re: [PATCH 0/4] promote zcache from staging
> > > >
> > > > On Fri, Jul 27, 2012 at 12:21:50PM -0700, Dan Magenheimer wrote:
> > > > > > From: Seth Jennings [mailto:sjenning [at] linux]
> > > > > > Subject: [PATCH 0/4] promote zcache from staging
> > > > > >
> > > > > > zcache is the remaining piece of code required to support in-kernel
> > > > > > memory compression. The other two features, cleancache and frontswap,
> > > > > > have been promoted to mainline in 3.0 and 3.5. This patchset
> > > > > > promotes zcache from the staging tree to mainline.
> > > > > >
> > > > > > Based on the level of activity and contributions we're seeing from a
> > > > > > diverse set of people and interests, I think zcache has matured to the
> > > > > > point where it makes sense to promote this out of staging.
> > > > >
> > > > > Hi Seth --
> > > > >
> > > > > Per offline communication, I'd like to see this delayed for three
> > > > > reasons:
> > > > >
> > > > > 1) I've completely rewritten zcache and will post the rewrite soon.
> > > > > The redesigned code fixes many of the weaknesses in zcache that
> > > > > makes it (IMHO) unsuitable for an enterprise distro. (Some of
> > > > > these previously discussed in linux-mm [1].)
> > > > > 2) zcache is truly mm (memory management) code and the fact that
> > > > > it is in drivers at all was purely for logistical reasons
> > > > > (e.g. the only in-tree "staging" is in the drivers directory).
> > > > > My rewrite promotes it to (a subdirectory of) mm where IMHO it
> > > > > belongs.
> > > > > 3) Ramster heavily duplicates code from zcache. My rewrite resolves
> > > > > this. My soon-to-be-post also places the re-factored ramster
> > > > > in mm, though with some minor work zcache could go in mm and
> > > > > ramster could stay in staging.
> > > > >
> > > > > Let's have this discussion, but unless the community decides
> > > > > otherwise, please consider this a NACK.
> > >
> > > Hi Konrad --
> > >
> > > > Hold on, that is rather unfair. The zcache has been in staging
> > > > for quite some time - your code has not been posted. Part of
> > > > "unstaging" a driver is for folks to review the code - and you
> > > > just said "No, mine is better" without showing your goods.
> > >
> > > Sorry, I'm not trying to be unfair. However, I don't see the point
> > > of promoting zcache out of staging unless it is intended to be used
> > > by real users in a real distro. There's been a lot of discussion,
> > > onlist and offlist, about what needs to be fixed in zcache and not
> > > much visible progress on fixing it. But fixing it is where I've spent
> > > most of my time over the last couple of months.
> > >
> > > If IBM or some other company or distro is eager to ship and support
> > > zcache in its current form, I agree that "promote now, improve later"
> > > is a fine approach. But promoting zcache out of staging simply because
> > > there is urgency to promote zsmalloc+zram out of staging doesn't
> > > seem wise. At a minimum, it distracts reviewers/effort from what IMHO
> > > is required to turn zcache into an enterprise-ready kernel feature.
> > >
> > > I can post my "goods" anytime. In its current form it is better
> > > than the zcache in staging (and, please remember, I wrote both so
> > > I think I am in a good position to compare the two).
> > > I have been waiting until I think the new zcache is feature complete
> > > before asking for review, especially since the newest features
> > > should demonstrate clearly why the rewrite is necessary and
> > > beneficial. But I can post* my current bits if people don't
> > > believe they exist and/or don't mind reviewing non-final code.
> > > (* Or I can put them in a publicly available git tree.)
> > >
> > > > There is a third option - which is to continue the promotion
> > > > of zcache from staging, get reviews, work on them ,etc, and
> > > > alongside of that you can work on fixing up (or ripping out)
> > > > zcache1 with zcache2 components as they make sense. Or even
> > > > having two of them - an enterprise and an embedded version
> > > > that will eventually get merged together. There is nothing
> > > > wrong with modifying a driver once it has left staging.
> > >
> > > Minchan and Seth can correct me if I am wrong, but I believe
> > > zram+zsmalloc, not zcache, is the target solution for embedded.
> >
> > > NOT true. Some embedded devices use zcache, though not the original
> > > zcache but a modified one.
>
> What kind of modifications? Would it make sense to post the patches

It's for contiguous memory allocation.
For that, it uses only cleancache, not frontswap, so it can zap ephemeral pages
without latency to get big contiguous memory.

> for those modifications?

That's another story, so let's not consider it at the moment.
After we get some cleanup done, I will revisit it.

>
> > Anyway, although embedded people use modified zcache, I am biased to Dan.
> > I admit I don't spend lots of time to look zcache but as looking the
> > code, it wasn't good shape and even had a bug found during code review
> > and I felt strongly we should clean up it for promoting it to mm/.
>
> Do you recall what the bugs where?

From: Minchan Kim <minchan [at] kernel>
Date: Fri, 27 Jul 2012 10:10:31 +0900
Subject: [PATCH] zcache: initialize idr

With !CONFIG_FRONTSWAP, idr is not initialized.
This patch always initializes idr.

Signed-off-by: Minchan Kim <minchan [at] kernel>
---
 drivers/staging/zcache/zcache-main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 564873f..a635ee2 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -968,7 +968,7 @@ static void zcache_put_pool(struct tmem_pool *pool)
 	atomic_dec(&cli->refcount);
 }

-int zcache_new_client(uint16_t cli_id)
+static int zcache_new_client(uint16_t cli_id)
 {
 	struct zcache_client *cli;
 	int ret = -1;
@@ -980,11 +980,11 @@ int zcache_new_client(uint16_t cli_id)
 	if (cli->zspool)
 		goto out;

+	idr_init(&cli->tmem_pools);
 #ifdef CONFIG_FRONTSWAP
 	cli->zspool = zs_create_pool("zcache", ZCACHE_GFP_MASK);
 	if (cli->zspool == NULL)
 		goto out;
-	idr_init(&cli->tmem_pools);
 #endif
 	ret = 0;
 out:
--
1.7.9.5
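
For readers unfamiliar with this class of bug, here is a minimal userspace sketch in C of what the patch above fixes: state that every configuration needs was being initialized inside a block that only exists when an optional feature is enabled. This is not the zcache code. `struct table`, `table_init`, `table_alloc`, `struct client` and both `client_init_*` functions are hypothetical stand-ins, and a runtime flag takes the place of `#ifdef CONFIG_FRONTSWAP` so that both configurations can be exercised in a single build.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct idr: usable only after table_init(). */
struct table {
	bool ready;
	int next_id;
};

void table_init(struct table *t)
{
	assert(t != NULL);
	t->ready = true;
	t->next_id = 0;
}

/* Returns a fresh id, or -1 if the table was never initialized. */
int table_alloc(struct table *t)
{
	if (!t->ready)
		return -1;
	return t->next_id++;
}

/* Hypothetical stand-in for struct zcache_client. */
struct client {
	struct table pools; /* needed in every configuration */
	void *swap_pool;    /* needed only with the optional feature */
};

/* Placeholder object so swap_pool has something to point at. */
static int dummy_pool;

/* Buggy ordering: pools is set up only when the optional feature is on. */
int client_init_buggy(struct client *c, bool have_frontswap)
{
	if (have_frontswap) {
		c->swap_pool = &dummy_pool;
		table_init(&c->pools);
	}
	return 0;
}

/* Fixed ordering: pools is set up unconditionally, before the optional part. */
int client_init_fixed(struct client *c, bool have_frontswap)
{
	table_init(&c->pools);
	if (have_frontswap)
		c->swap_pool = &dummy_pool;
	return 0;
}
```

With the feature flag off, a zero-initialized `struct client` passed to `client_init_buggy` is left with an unusable `pools` table, so `table_alloc` fails, while `client_init_fixed` leaves it ready for use. In the real driver the symptom would depend on how an uninitialized idr behaves, which this sketch does not attempt to model.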


>
> > So I would like to wait Dan's posting if you guys are not urgent.
> > (And I am not sure akpm allow it with current shape of zcache code.)
> > But the concern is about adding new feature. I guess there might be some
> > debate for long time and it can prevent promoting again.
> > I think It's not what Seth want.
> > I hope Dan doesn't mix clean up series and new feature series and
> > post clean up series as soon as possible so let's clean up first and
> > try to promote it and later, adding new feature or changing algorithm
> > is desirable.
> >
> >
> > > The limitations of zsmalloc aren't an issue for zram but they are
> > > for zcache, and this deficiency was one of the catalysts for the
> > > rewrite. The issues are explained in more detail in [1],
> > > but if any point isn't clear, I'd be happy to explain further.
> > >
> > > However, I have limited time for this right now and I'd prefer
> > > to spend it finishing the code. :-}
> > >
> > > So, as I said, I am still a NACK, but if there are good reasons
> > > to duplicate effort and pursue the "third option", let's discuss
> > > them.
> > >
> > > Thanks,
> > > Dan
> > >
> > > [1] http://marc.info/?t=133886706700002&r=1&w=2
> > >
> >
> > --
> > Kind regards,
> > Minchan Kim
>

--
Kind regards,
Minchan Kim


dan.magenheimer at oracle

Aug 6, 2012, 8:24 AM

Post #16 of 36 (912 views)
RE: [PATCH 0/4] promote zcache from staging [In reply to]

> > I think we (that is me, Seth, Minchan, Dan) need to talk to have a good
> > understanding of what each of us thinks are fixups.
> >
> > Would Monday Aug 6th at 1pm EST on irc.freenode.net channel #zcache work
> > for people?
>
> 1pm EST is 2am KST (Korea Standard Time), so it's not good for me. :)
> I know it's hard to adjust the time for me, so please talk without
> me. Instead, I will write down my requirements. They are very simple and
> trivial.
>
> 1) Please don't add any new features, like replacing zsmalloc with zbud.
> That is totally untested, so it needs more time, from a stability point of
> view, for bugs and performance/fragmentation.
>
> 2) Factor out common code between zcache and ramster. It should be just
> code cleanup and should not change current behavior.
>
> 3) Add lots of comments to public functions.
>
> 4) Make function/variable names clearer.
>
> These are necessary for promotion; after promotion,
> let's talk about great new features.

Hi Minchan --

I hope you had a great vacation!

Since we won't be able to discuss this by phone/irc, I guess I
need to reply here.

Let me first restate my opinion as author of zcache.

The zcache in staging is really a "demo" version. It was written 21
months ago (and went into staging 16 months ago) primarily to show,
at Andrew Morton's suggestion, that frontswap and cleancache had value
in a normal standalone kernel (i.e. virtualization not required). When
posted in early 2011 zcache was known to have some fundamental flaws in the design...
that's why it went into "staging". The "demo" version in staging still has
those flaws and the change from xvmalloc to zsmalloc makes one of those flaws
worse. These design flaws are now fixed in the new code base I posted last
week AND the new code base has improved factoring, comments and the code is
properly re-merged with the zcache "fork" in ramster.

We are not talking about new "features"... I have tried to back out the
new features from the new code base already posted and will post them separately.

So I think we have four choices:

A) Try to promote zcache as is. (Seth's proposal)
B) Clean up zcache with no new functionality. (Minchan's proposal above)
C) New code base (in mm/tmem/) after review. (Dan's proposal)
D) New code base but retrofit as a series of patches (Konrad's suggestion)

Minchan, if we go with your proposal (B) are you volunteering
to do the work? And if you do, doesn't it have the same issue
that it is also totally untested? And, since (B) doesn't solve the
fundamental design issues, are you volunteering to fix those next?
And, in the meantime, doesn't this mean we have THREE versions
of zcache?

IMHO, the fastest way to get the best zcache into the kernel and
to distros and users is to throw away the "demo" version, move forward
to a new solid well-designed zcache code base, and work together to
build on it. There's still a lot to do so I hope we can work together.

Thanks,
Dan


penberg at kernel

Aug 6, 2012, 8:47 AM

Post #17 of 36 (914 views)
Permalink
Re: [PATCH 0/4] promote zcache from staging [In reply to]

On Mon, Aug 6, 2012 at 6:24 PM, Dan Magenheimer
<dan.magenheimer [at] oracle> wrote:
> IMHO, the fastest way to get the best zcache into the kernel and
> to distros and users is to throw away the "demo" version, move forward
> to a new solid well-designed zcache code base, and work together to
> build on it. There's still a lot to do so I hope we can work together.

I'm not convinced it's the _fastest way_. You're effectively
invalidating all the work done under drivers/staging so you might end up
in review limbo with your shiny new code...

AFAICT, your best bet is to first clean up zcache under drivers/staging
and get that promoted under mm/zcache.c. You can then move on to the
more controversial ramster and figure out where to put the clustering
code, etc.

Pekka


dan.magenheimer at oracle

Aug 6, 2012, 9:21 AM

Post #18 of 36 (906 views)
Permalink
RE: [PATCH 0/4] promote zcache from staging [In reply to]

> From: Pekka Enberg [mailto:penberg [at] kernel]
> Subject: Re: [PATCH 0/4] promote zcache from staging
>
> On Mon, Aug 6, 2012 at 6:24 PM, Dan Magenheimer
> <dan.magenheimer [at] oracle> wrote:
> > IMHO, the fastest way to get the best zcache into the kernel and
> > to distros and users is to throw away the "demo" version, move forward
> > to a new solid well-designed zcache code base, and work together to
> > build on it. There's still a lot to do so I hope we can work together.
>
> I'm not convinced it's the _fastest way_.

<grin> I guess I meant "optimal", combining "fast" and "best".

> You're effectively
> invalidating all the work done under drivers/staging so you might end up
> in review limbo with your shiny new code...

Fixing the fundamental design flaws will sooner or later invalidate
most (or all) of the previous testing/work anyway, won't it? Since
any kernel built with staging is "tainted" already, I feel like now
is a better time to make a major design transition.

I suppose:

(E) replace "demo" zcache with new code base and keep it
in staging for another cycle

is another alternative, but I think gregkh has said no to that.

Dan


gregkh at linuxfoundation

Aug 6, 2012, 9:29 AM

Post #19 of 36 (914 views)
Permalink
Re: [PATCH 0/4] promote zcache from staging [In reply to]

On Mon, Aug 06, 2012 at 09:21:22AM -0700, Dan Magenheimer wrote:
> I suppose:
>
> (E) replace "demo" zcache with new code base and keep it
> in staging for another cycle
>
> is another alternative, but I think gregkh has said no to that.

No I have not. If you all feel that the existing code needs to be
dropped and replaced with a totally new version, that's fine with me.
It's forward progress, which is all that I ask for.

Hope this helps,

greg k-h


dan.magenheimer at oracle

Aug 6, 2012, 9:38 AM

Post #20 of 36 (907 views)
Permalink
RE: [PATCH 0/4] promote zcache from staging [In reply to]

> From: Greg Kroah-Hartman [mailto:gregkh [at] linuxfoundation]
> Subject: Re: [PATCH 0/4] promote zcache from staging
>
> On Mon, Aug 06, 2012 at 09:21:22AM -0700, Dan Magenheimer wrote:
> > I suppose:
> >
> > (E) replace "demo" zcache with new code base and keep it
> > in staging for another cycle
> >
> > is another alternative, but I think gregkh has said no to that.
>
> No I have not. If you all feel that the existing code needs to be
> dropped and replaced with a totally new version, that's fine with me.
> It's forward progress, which is all that I ask for.
>
> Hope this helps,
> greg k-h

Hi Greg --

Cool! I guess I mistakenly assumed that your "no new features"
requirement also implied "no fixes of fundamental design flaws". :-)

Having option (E) should make it easier to decide the best
technical solution, separate from the promotion timing and "where
does it land" question.

We'll get back to you soon...

Thanks!
Dan


minchan at kernel

Aug 6, 2012, 5:44 PM

Post #21 of 36 (907 views)
Permalink
Re: [PATCH 0/4] promote zcache from staging [In reply to]

Hi Dan,

On Mon, Aug 06, 2012 at 09:21:22AM -0700, Dan Magenheimer wrote:
> > From: Pekka Enberg [mailto:penberg [at] kernel]
> > Subject: Re: [PATCH 0/4] promote zcache from staging
> >
> > On Mon, Aug 6, 2012 at 6:24 PM, Dan Magenheimer
> > <dan.magenheimer [at] oracle> wrote:
> > > IMHO, the fastest way to get the best zcache into the kernel and
> > > to distros and users is to throw away the "demo" version, move forward
> > > to a new solid well-designed zcache code base, and work together to
> > > build on it. There's still a lot to do so I hope we can work together.
> >
> > I'm not convinced it's the _fastest way_.
>
> <grin> I guess I meant "optimal", combining "fast" and "best".
>
> > You're effectively
> > invalidating all the work done under drivers/staging so you might end up
> > in review limbo with your shiny new code...
>
> Fixing the fundamental design flaws will sooner or later invalidate
> most (or all) of the previous testing/work anyway, won't it? Since
> any kernel built with staging is "tainted" already, I feel like now
> is a better time to make a major design transition.
>
> I suppose:
>
> (E) replace "demo" zcache with new code base and keep it
> in staging for another cycle

I go for (E). Please send your refactoring code as formal patch.
Thanks.

--
Kind regards,
Minchan Kim


konrad.wilk at oracle

Aug 7, 2012, 12:28 PM

Post #22 of 36 (908 views)
Permalink
Re: [PATCH 0/4] promote zcache from staging [In reply to]

On Mon, Aug 06, 2012 at 09:38:16AM +0900, Minchan Kim wrote:
> Hi Konrad,
>
> On Tue, Jul 31, 2012 at 01:51:42PM -0400, Konrad Rzeszutek Wilk wrote:
> > On Tue, Jul 31, 2012 at 09:19:16AM -0700, Greg Kroah-Hartman wrote:
> > > On Tue, Jul 31, 2012 at 11:58:43AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > So in my head I feel that it is Ok to:
> > > > 1) address the concerns that zcache has before it is unstaged
> > > > 2) rip out the two-engine system with a one-engine system
> > > > (and see how well it behaves)
> > > > 3) sysfs->debugfs as needed
> > > > 4) other things as needed
> > > >
> > > > I think we are getting hung-up what Greg said about adding features
> > > > and the two-engine->one engine could be understood as that.
> > > > While I think that is part of a staging effort to clean up the
> > > > existing issues. Lets see what Greg thinks.
> > >
> > > Greg has no idea, except I want to see the needed fixups happen before
> > > new features get added. Add the new features _after_ it is out of
> > > staging.
> >
> > I think we (that is me, Seth, Minchan, Dan) need to talk to have a good
> > understanding of what each of us thinks are fixups.
> >
> > Would Monday Aug 6th at 1pm EST on irc.freenode.net channel #zcache work
> > for people?
>
> 1pm EST is 2am KST(Korea Standard Time) so it's not good for me. :)
> I know it's hard to adjust your time for mine, so please talk without
> me. Instead, I will write down my requirements. They are very simple and
> trivial.

OK, Thank you.

We had a lengthy chat (full chat log attached). The summary is that
we all want to promote zcache (for various reasons), but we are hung up on
whether we are OK unstaging it as is, where it lowers the number of I/Os but
potentially does not give a large performance increase (when doing 'make -jN'),
or whether we want both of those characteristics first. A little history: v3.3
had swap readahead patches that dramatically decreased the number of pages
going to swap, so the performance gain from zcache is no longer amazing, but
still OK.

Seth and Robert (and I surmise Minchan too) are very interested in zcache
because it lowers the number of I/Os; performance is secondary. Dan is
interested in having fewer I/Os and in providing a performance boost with
workloads such as 'make -jN' - in short, fewer I/Os and better performance.
Dan would like both before unstaging.

The action items that came out are:
- Seth posted some benchmarks - he is going to rerun them with v3.5
to see how it behaves in terms of performance (make -jN benchmark).
- Robert is going to take a swing at the refactoring Minchan asked for, adding
comments, etc. (but only after we get over the hump of agreeing on the next step).
- Konrad to rummage in his mbox to find any other technical objections
that were raised on zcache earlier to make sure to address them.
- Once Seth is finished Konrad is going to take another swing
at driving this discussion - either via email, IRC or conference call.
Attachments: zcache-Aug6.log (23.6 KB)


sjenning at linux

Aug 7, 2012, 1:23 PM

Post #23 of 36 (905 views)
Permalink
Re: [PATCH 0/4] promote zcache from staging [In reply to]

On 07/27/2012 01:18 PM, Seth Jennings wrote:
> Some benchmarking numbers demonstrating the I/O saving that can be had
> with zcache:
>
> https://lkml.org/lkml/2012/3/22/383

There was concern that kernel changes external to zcache since v3.3 may
have mitigated the benefit of zcache. So I re-ran my kernel building
benchmark and confirmed that zcache is still providing I/O and runtime
savings.

Gentoo w/ kernel v3.5 (frontswap only, cleancache disabled)
Quad-core i5-2500 @ 3.3GHz
512MB DDR3 1600MHz (limited with mem=512m on boot)
Filesystem and swap on 80GB HDD (about 58MB/s with hdparm -t)
majflt are major page faults reported by the time command
pswpin/out is the delta of pswpin/out from /proc/vmstat before and after
the make -jN
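
As a rough illustration of how the pswpin/pswpout deltas described above could be collected, here is a minimal Python sketch. It is not the script used for these results; the helper names (parse_vmstat, swap_delta, measure) are hypothetical, and it assumes /proc/vmstat uses its usual "name value" line format.

```python
import subprocess


def parse_vmstat(text):
    """Turn /proc/vmstat-style text ("name value" per line) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after, keys=("pswpin", "pswpout")):
    """Difference of the swap counters between two snapshots."""
    return {key: after[key] - before[key] for key in keys}


def read_vmstat(path="/proc/vmstat"):
    with open(path) as f:
        return parse_vmstat(f.read())


def measure(cmd):
    """Run cmd (a list of arguments) and report the swap traffic it caused."""
    before = read_vmstat()
    subprocess.run(cmd, check=True)
    return swap_delta(before, read_vmstat())
```

On a Linux machine, something like measure(["make", "-j16"]) would then return the two deltas for one build.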

Mind the 512MB RAM vs 1GB in my previous results. This just reduces
the number of threads required to create memory pressure and removes some
of the context switching noise from the results.

I'm also using a single HDD instead of the RAID0 in my previous results.

Each run started with:
swapoff -a
swapon -a
sync
echo 3 > /proc/sys/vm/drop_caches

I/O (in pages):
                normal                             zcache                change
 N   pswpin  pswpout  majflt  I/O sum   pswpin  pswpout  majflt  I/O sum   %I/O
 4        0        2    2116     2118        0        0    2125     2125     0%
 8        0      575    2244     2819        4        4    2219     2227    21%
12     2543     4038    3226     9807     1748     2519    3871     8138    17%
16    23926    47278    9426    80630     8252    15598    9372    33222    59%
20    50307   127797   15039   193143    20224    40634   17975    78833    59%

Runtime (in seconds):
 N  normal  zcache  %change
 4     126     127      -1%
 8     124     124       0%
12     131     133      -2%
16     189     156      17%
20     261     235      10%

%CPU utilization (out of 400% on 4 cpus)
 N  normal  zcache  %change
 4     254     253       0%
 8     261     263      -1%
12     250     248       1%
16     173     211     -22%
20     124     140     -13%

There is a sweet spot at 16 threads, where zcache is improving runtime by
17% and reducing I/O by 59% (185MB) using 22% more CPU.
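
For readers who want to check the arithmetic, the sketch below recomputes the I/O sums and percentage columns from the raw figures above. The data is copied from the tables; the 4 KiB page size is an assumption (it is the usual x86 value, but the post does not state it), and the function name is made up for this example.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

# N -> ((pswpin, pswpout, majflt) for normal, same for zcache)
io = {
    4: ((0, 2, 2116), (0, 0, 2125)),
    8: ((0, 575, 2244), (4, 4, 2219)),
    12: ((2543, 4038, 3226), (1748, 2519, 3871)),
    16: ((23926, 47278, 9426), (8252, 15598, 9372)),
    20: ((50307, 127797, 15039), (20224, 40634, 17975)),
}
# N -> (normal, zcache) runtime in seconds
runtime = {4: (126, 127), 8: (124, 124), 12: (131, 133),
           16: (189, 156), 20: (261, 235)}


def pct_reduction(normal, zcache):
    """Percent by which zcache lowers a value relative to the normal run."""
    return round(100 * (normal - zcache) / normal)


for n, (normal, zcache) in io.items():
    n_sum, z_sum = sum(normal), sum(zcache)
    saved_mb = (n_sum - z_sum) * PAGE_SIZE / 2**20
    print(n, n_sum, z_sum,
          f"io {pct_reduction(n_sum, z_sum)}%",
          f"runtime {pct_reduction(*runtime[n])}%",
          f"saved {saved_mb:.0f}MB")
# For N=16 this gives 80630 vs 33222 pages: 59% less I/O (about 185MB)
# and a 17% shorter runtime, matching the figures quoted above.
```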

Seth



dan.magenheimer at oracle

Aug 7, 2012, 2:47 PM

Post #24 of 36 (904 views)
Permalink
RE: [PATCH 0/4] promote zcache from staging [In reply to]

> From: Seth Jennings [mailto:sjenning [at] linux]
> Subject: Re: [PATCH 0/4] promote zcache from staging
>
> On 07/27/2012 01:18 PM, Seth Jennings wrote:
> > Some benchmarking numbers demonstrating the I/O saving that can be had
> > with zcache:
> >
> > https://lkml.org/lkml/2012/3/22/383
>
> There was concern that kernel changes external to zcache since v3.3 may
> have mitigated the benefit of zcache. So I re-ran my kernel building
> benchmark and confirmed that zcache is still providing I/O and runtime
> savings.

Hi Seth --

Thanks for re-running your tests. I have a couple of concerns and
hope that you, and other interested parties, will read all the
way through my lengthy response.

The zcache issues I have seen in recent kernels arise when zcache
gets "full". I notice your original published benchmarks [1] include
N=24, N=28, and N=32, but these updated results do not. Are you planning
on completing the runs? Second, I now see the numbers I originally
published for what I thought was the same benchmark as yours are actually
an order of magnitude larger (in sec) than yours. I didn't notice
this in March because we were focused on the percent improvement, not
the raw measurements. Since the hardware is highly similar, I suspect
it is not a hardware difference but instead that you are compiling
a much smaller kernel. In other words, your test case is much
smaller, and so exercises zcache much less. My test case compiles
a full enterprise kernel... what is yours doing?

IMHO, any cache in computer science needs to be measured both
when it is not-yet-full and when it is full. The "demo" zcache in
staging works very well before it is full and I think our benchmarking
in March and your re-run benchmarks demonstrate that. At LSFMM, Andrea
Arcangeli pointed out that zcache, for frontswap pages, has no "writeback"
capabilities and, when it is full, it simply rejects further attempts
to put data in its cache. He said this is unacceptable for KVM and I
agreed that it was a flaw that needed to be fixed before zcache should
be promoted. When I tested zcache for this, I found that not only was
he right, but that zcache could not be fixed without a major rewrite.

This is one of the "fundamental flaws" of the "demo" zcache, but the new
code base allows for this to be fixed.
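
To make the "rejects when full" behaviour concrete, here is a toy Python model. It is only a sketch of the behaviour described above, not zcache code: the class and function names are invented, and capacity is counted in pages rather than in compressed bytes.

```python
class RejectingStore:
    """Toy model: a fixed-capacity store that refuses new pages once full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = {}

    def put(self, key, data):
        """Return True if the page was accepted, False if it was rejected."""
        if key not in self.pages and len(self.pages) >= self.capacity:
            return False
        self.pages[key] = data
        return True


def swap_out(store, disk, key, data):
    """Try the in-memory store first; a rejected page goes to the swap disk."""
    if not store.put(key, data):
        disk[key] = data


store, disk = RejectingStore(capacity=2), {}
for key in ("a", "b", "c", "d"):
    swap_out(store, disk, key, b"page")

print(sorted(store.pages), sorted(disk))  # ['a', 'b'] ['c', 'd']
```

Once the store is full, the oldest pages stay in memory indefinitely and every newer page takes the slow path to disk, whether or not the old pages are still in use.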

A second flaw is that the "demo" zcache has no concept of LRU for
either cleancache or frontswap pages, or ability to reclaim pageframes
at all for frontswap pages. (And for cleancache, pageframe reclaim
is semi-random). As I've noted in other threads, this may be impossible
to implement/fix with zsmalloc, and zsmalloc's author Nitin Gupta has
agreed, but the new code base implements all of this with zbud. One
can argue that LRU is not a requirement for zcache, but a long history
of operating systems theory would suggest otherwise.
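
For contrast, a store with LRU ordering and writeback can be sketched as follows. This is again a toy model under stated assumptions, not the zbud-based implementation in the new code base: the class name and the writeback callback are hypothetical, and capacity is counted in pages.

```python
from collections import OrderedDict


class LruWritebackStore:
    """Toy model: when full, write the least recently used page back to disk."""

    def __init__(self, capacity, writeback):
        self.capacity = capacity
        self.writeback = writeback   # called as writeback(key, data)
        self.pages = OrderedDict()   # oldest entry first

    def put(self, key, data):
        if key in self.pages:
            self.pages.move_to_end(key)
        elif len(self.pages) >= self.capacity:
            old_key, old_data = self.pages.popitem(last=False)
            self.writeback(old_key, old_data)
        self.pages[key] = data

    def get(self, key):
        data = self.pages.get(key)
        if data is not None:
            self.pages.move_to_end(key)  # mark as recently used
        return data


disk = {}
store = LruWritebackStore(capacity=2, writeback=disk.__setitem__)
store.put("a", b"1")
store.put("b", b"2")
store.get("a")          # "a" is now more recently used than "b"
store.put("c", b"3")    # store is full, so "b" is written back

print(list(store.pages), list(disk))  # ['a', 'c'] ['b']
```

Here a new page is always accepted, and the page that pays the cost of going to disk is the one least likely to be needed again soon.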

A third flaw is that the "demo" version has a very poor policy to
determine what pages are "admitted". The demo policy does take into
account the total RAM in the system, but not current memory load
conditions. The new code base IMHO does a better job but discussion
will be in a refereed presentation at the upcoming Plumber's meeting.
The fix for this flaw might be back-portable to the "demo" version
so is not a showstopper in the "demo" version, but fixing it is
not just a cosmetic fix.

I can add more issues to the list, but will stop here. IMHO
the "demo" zcache is not suitable for promotion from staging,
which is why I spent over two months generating a new code base.
I, perhaps more than anyone else, would like to see zcache used,
by default, by real distros and customers, but I think it is
premature to promote it, especially the old "demo" code.

I do realize, however, that this decision is not mine alone so
defer to the community to decide.

Dan

[1] https://lkml.org/lkml/2012/3/22/383
[2] http://lkml.indiana.edu/hypermail/linux/kernel/1203.2/02842.html



sjenning at linux

Aug 8, 2012, 9:29 AM

Post #25 of 36 (907 views)
Permalink
Re: [PATCH 0/4] promote zcache from staging [In reply to]

On 08/07/2012 04:47 PM, Dan Magenheimer wrote:
> I notice your original published benchmarks [1] include
> N=24, N=28, and N=32, but these updated results do not. Are you planning
> on completing the runs? Second, I now see the numbers I originally
> published for what I thought was the same benchmark as yours are actually
> an order of magnitude larger (in sec) than yours. I didn't notice
> this in March because we were focused on the percent improvement, not
> the raw measurements. Since the hardware is highly similar, I suspect
> it is not a hardware difference but instead that you are compiling
> a much smaller kernel. In other words, your test case is much
> smaller, and so exercises zcache much less. My test case compiles
> a full enterprise kernel... what is yours doing?

I am doing a minimal kernel build for my local hardware
configuration.

With the reduction in RAM, 1GB to 512MB, I didn't need to do
test runs with >20 threads to find the peak of the benefit
curve at 16 threads. Past that, zcache is saturated and I'd
just be burning up my disk. I'm already swapping out about
500MB (i.e. RAM size) in the 20 thread non-zcache case.

Also, I provide the magnitude numbers (pages, seconds) just
to show my source data. The %change numbers are the real
results as they remove build size as a factor.

> At LSFMM, Andrea
> Arcangeli pointed out that zcache, for frontswap pages, has no "writeback"
> capabilities and, when it is full, it simply rejects further attempts
> to put data in its cache. He said this is unacceptable for KVM and I
> agreed that it was a flaw that needed to be fixed before zcache should
> be promoted.

KVM (in-tree) is not a current user of zcache. While the
use cases of possible future zcache users should be
considered, I don't think they can be used to prevent promotion.

> A second flaw is that the "demo" zcache has no concept of LRU for
> either cleancache or frontswap pages, or ability to reclaim pageframes
> at all for frontswap pages.
...
>
> A third flaw is that the "demo" version has a very poor policy to
> determine what pages are "admitted".
...
>
> I can add more issues to the list, but will stop here.

All of the flaws you list do not prevent zcache from being
beneficial right now, as my results demonstrate. Therefore,
the flaws listed are really potential improvements and can
be done in mainline after promotion. Even if large changes
are required to make these improvements, they can be made in
mainline in an incremental and public way.

Seth

