Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Linux: Kernel

[patch] mm: sparsemem memory_present() memory corruption fix

 

 

First page Previous page 1 2 Next page Last page  View All Linux kernel RSS feed   Index | Next | Previous | View Threaded


mingo at elte

Apr 15, 2008, 5:03 PM

Post #1 of 26 (362 views)
Permalink
[patch] mm: sparsemem memory_present() memory corruption fix

finally found it ... the patch below solves the sparsemem crash and the
testsystem boots up fine now:

mars:~> uname -a
Linux mars 2.6.25-rc9-sched-devel.git-x86-latest.git #985 SMP Wed Apr 16
01:37:37 CEST 2008 i686 i686 i386 GNU/Linux

yay! :-)

Ingo

ps. anyone who can correctly guess the method with which i found the
exact place that corrupted memory will get a free beer next time we
meet :-)

------------------------->
Subject: mm: sparsemem memory_present() memory corruption fix
From: Ingo Molnar <mingo [at] elte>
Date: Wed Apr 16 01:40:00 CEST 2008

fix memory corruption and crash on 32-bit x86 systems.

if a !PAE x86 kernel is booted on a 32-bit system with more than
4GB of RAM, then we call memory_present() with a start/end that
goes outside the scope of MAX_PHYSMEM_BITS.

that causes this loop to happily walk over the limit of the
sparse memory section map:

for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
unsigned long section = pfn_to_section_nr(pfn);
struct mem_section *ms;

sparse_index_init(section, nid);
set_section_nid(section, nid);

ms = __nr_to_section(section);
if (!ms->section_mem_map)
ms->section_mem_map = sparse_encode_early_nid(nid) |

'ms' will be out of bounds and we'll corrupt a small amount of memory by
encoding the node ID. Depending on what that memory is, we might crash,
misbehave or just not notice the bug.

the fix is to sanity check anything the architecture passes to sparsemem.

this bug seems to be rather old (as old as sparsemem support itself),
but the exact incarnation depended on random details like configs,
which made this bug more prominent in v2.6.25-to-be.

an additional enhancement might be to print a warning about ignored
or trimmed memory ranges.

Signed-off-by: Ingo Molnar <mingo [at] elte>
---
mm/sparse.c | 10 ++++++++++
1 file changed, 10 insertions(+)

Index: linux/mm/sparse.c
===================================================================
--- linux.orig/mm/sparse.c
+++ linux/mm/sparse.c
@@ -149,8 +149,18 @@ static inline int sparse_early_nid(struc
/* Record a memory area against a node. */
void __init memory_present(int nid, unsigned long start, unsigned long end)
{
+ unsigned long max_arch_pfn = 1ULL << (MAX_PHYSMEM_BITS-PAGE_SHIFT);
unsigned long pfn;

+ /*
+ * Sanity checks - do not allow an architecture to pass
+ * in larger pfns than the maximum scope of sparsemem:
+ */
+ if (start >= max_arch_pfn)
+ return;
+ if (end >= max_arch_pfn)
+ end = max_arch_pfn;
+
start &= PAGE_SECTION_MASK;
for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
unsigned long section = pfn_to_section_nr(pfn);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


clameter at sgi

Apr 15, 2008, 5:10 PM

Post #2 of 26 (348 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

Yes that fixes it here too. And the corruption that I saw of slab
variables is explained by your analysis. Thanks!

Tested-by: Christoph Lameter <clameter [at] sgi>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


mingo at elte

Apr 15, 2008, 5:18 PM

Post #3 of 26 (346 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

* Ingo Molnar <mingo [at] elte> wrote:

> finally found it ... the patch below solves the sparsemem crash and
> the testsystem boots up fine now:
>
> mars:~> uname -a
> Linux mars 2.6.25-rc9-sched-devel.git-x86-latest.git #985 SMP Wed Apr 16
> 01:37:37 CEST 2008 i686 i686 i386 GNU/Linux

i re-checked the original SLAB config too and that boots fine as well
now - so i'm confident that the regression has been sufficiently cured.

it's getting quite late here (or rather, it's getting early :-/ ) so it
would be nice if others could double-check this calculation (with an eye
on all possible architectures):

+ unsigned long max_arch_pfn = 1ULL << (MAX_PHYSMEM_BITS-PAGE_SHIFT);

and also check my analysis whether it is correct and whether it matches
the reported bug patterns. But otherwise the fix looks like a safe fix
for v2.6.25-final to me - it only filters out values from sparsemem
input that are nonsensical in the sparsemem framework anyway.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


clameter at sgi

Apr 15, 2008, 5:19 PM

Post #4 of 26 (346 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

On Wed, 16 Apr 2008, Ingo Molnar wrote:

> if a !PAE x86 kernel is booted on a 32-bit system with more than
> 4GB of RAM, then we call memory_present() with a start/end that
> goes outside the scope of MAX_PHYSMEM_BITS.

Well okay this fixes it but is this the right fix? The arch should not
call memory_present() with an invalid pfn.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


yhlu.kernel at gmail

Apr 15, 2008, 5:32 PM

Post #5 of 26 (346 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

On Tue, Apr 15, 2008 at 5:18 PM, Ingo Molnar <mingo [at] elte> wrote:
>
> * Ingo Molnar <mingo [at] elte> wrote:
>
> > finally found it ... the patch below solves the sparsemem crash and
> > the testsystem boots up fine now:
> >
> > mars:~> uname -a
> > Linux mars 2.6.25-rc9-sched-devel.git-x86-latest.git #985 SMP Wed Apr 16
> > 01:37:37 CEST 2008 i686 i686 i386 GNU/Linux
>
> i re-checked the original SLAB config too and that boots fine as well
> now - so i'm confident that the regression has been sufficiently cured.
>
> it's getting quite late here (or rather, it's getting early :-/ ) so it
> would be nice if others could double-check this calculation (with an eye
> on all possible architectures):
>
> + unsigned long max_arch_pfn = 1ULL << (MAX_PHYSMEM_BITS-PAGE_SHIFT);
>
> and also check my analysis whether it is correct and whether it matches
> the reported bug patterns. But otherwise the fix looks like a safe fix
> for v2.6.25-final to me - it only filters out values from sparsemem
> input that are nonsensical in the sparsemem framework anyway.
>
> Ingo
>

can you check why find_max_pfn() e820_32.c need to call memory_present?
wonder if it can be removed.

YH
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


yhlu.kernel at gmail

Apr 15, 2008, 5:33 PM

Post #6 of 26 (346 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

On Tue, Apr 15, 2008 at 5:19 PM, Christoph Lameter <clameter [at] sgi> wrote:
> On Wed, 16 Apr 2008, Ingo Molnar wrote:
>
> > if a !PAE x86 kernel is booted on a 32-bit system with more than
> > 4GB of RAM, then we call memory_present() with a start/end that
> > goes outside the scope of MAX_PHYSMEM_BITS.
>
> Well okay this fixes it but is this the right fix? The arch should not
> call memory_present() with an invalid pfn.

yes in find_max_pfn...

YH
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


mingo at elte

Apr 15, 2008, 5:34 PM

Post #7 of 26 (346 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

small addendum to the changelog:

> if (!ms->section_mem_map)
> ms->section_mem_map = sparse_encode_early_nid(nid) |
+ | SECTION_MARKED_PRESENT
>
> 'ms' will be out of bounds and we'll corrupt a small amount of memory by
> encoding the node ID. Depending on what that memory is, we might crash,
> misbehave or just not notice the bug.

the corruption might happen when encoding a non-zero node ID, or due to
the SECTION_MARKED_PRESENT which is 0x1:

mmzone.h:#define SECTION_MARKED_PRESENT (1UL<<0)

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


mingo at elte

Apr 15, 2008, 5:36 PM

Post #8 of 26 (347 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

* Christoph Lameter <clameter [at] sgi> wrote:

> On Wed, 16 Apr 2008, Ingo Molnar wrote:
>
> > if a !PAE x86 kernel is booted on a 32-bit system with more than 4GB
> > of RAM, then we call memory_present() with a start/end that goes
> > outside the scope of MAX_PHYSMEM_BITS.
>
> Well okay this fixes it but is this the right fix? The arch should not
> call memory_present() with an invalid pfn.

it is the right fix. The architecture memory setup code doesnt even
_know_ the limits at this place in an open-coded way (and shouldnt know
them) - and even later on we use pfn_valid() to determine whether to
attempt to get to a struct page and free it into the buddy.

[. Of course the architecture code in general 'knows' about the limits -
but still it's cleaner to have a dumb enumeration interface here
combined with a resilient core code - that's always going to be less
fragile. ]

btw., i just did some bug history analysis, the calls were originally
added when sparsemem support was added:

| commit 215c3409eed16c89b6d11ea1126bd9d4f36b9afd
| Author: Andy Whitcroft <apw [at] shadowen>
| Date: Fri Jan 6 00:12:06 2006 -0800
|
| [PATCH] i386 sparsemem for single node systems

in v2.6.15-1003-g215c340. (so this is appears to be an unfixed bug in
v2.6.16 as well)

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


mingo at elte

Apr 15, 2008, 5:40 PM

Post #9 of 26 (348 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

* Ingo Molnar <mingo [at] elte> wrote:

> small addendum to the changelog:

Joe Perches pointed out that the ULL was superfluous (i typoed it, i
knew it's a pfn). Updated patch below.

Ingo

-------------------------->
Subject: mm: sparsemem memory_present() fix
From: Ingo Molnar <mingo [at] elte>
Date: Wed Apr 16 01:40:00 CEST 2008

fix memory corruption and crash on 32-bit x86 systems.

if a !PAE x86 kernel is booted on a 32-bit system with more than
4GB of RAM, then we call memory_present() with a start/end that
goes outside the scope of MAX_PHYSMEM_BITS.

that causes this loop to happily walk over the limit of the
sparse memory section map:

for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
unsigned long section = pfn_to_section_nr(pfn);
struct mem_section *ms;

sparse_index_init(section, nid);
set_section_nid(section, nid);

ms = __nr_to_section(section);
if (!ms->section_mem_map)
ms->section_mem_map = sparse_encode_early_nid(nid) |
SECTION_MARKED_PRESENT;

'ms' will be out of bounds and we'll corrupt a small amount of memory by
encoding the node ID and writing SECTION_MARKED_PRESENT (==0x1) over it.

the fix is to sanity check anything the architecture passes to sparsemem.

this bug seems to be rather old (as old as sparsemem support itself),
but the exact incarnation depended on random details like configs,
which made this bug more prominent in v2.6.25-to-be.

an additional enhancement might be to print a warning about ignored
or trimmed memory ranges.

Signed-off-by: Ingo Molnar <mingo [at] elte>
---
mm/sparse.c | 10 ++++++++++
1 file changed, 10 insertions(+)

Index: linux/mm/sparse.c
===================================================================
--- linux.orig/mm/sparse.c
+++ linux/mm/sparse.c
@@ -149,8 +149,18 @@ static inline int sparse_early_nid(struc
/* Record a memory area against a node. */
void __init memory_present(int nid, unsigned long start, unsigned long end)
{
+ unsigned long max_arch_pfn = 1UL << (MAX_PHYSMEM_BITS-PAGE_SHIFT);
unsigned long pfn;

+ /*
+ * Sanity checks - do not allow an architecture to pass
+ * in larger pfns than the maximum scope of sparsemem:
+ */
+ if (start >= max_arch_pfn)
+ return;
+ if (end >= max_arch_pfn)
+ end = max_arch_pfn;
+
start &= PAGE_SECTION_MASK;
for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
unsigned long section = pfn_to_section_nr(pfn);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


mingo at elte

Apr 15, 2008, 5:44 PM

Post #10 of 26 (346 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

* Yinghai Lu <yhlu.kernel [at] gmail> wrote:

> > + unsigned long max_arch_pfn = 1ULL << (MAX_PHYSMEM_BITS-PAGE_SHIFT);
> >
> > and also check my analysis whether it is correct and whether it
> > matches the reported bug patterns. But otherwise the fix looks like
> > a safe fix for v2.6.25-final to me - it only filters out values
> > from sparsemem input that are nonsensical in the sparsemem
> > framework anyway.
>
> can you check why find_max_pfn() e820_32.c need to call
> memory_present? wonder if it can be removed.

this is the only call to memory_present() we do in 32-bit arch setup, so
it's required.

(the function find_max_pfn() is woefully misnamed, but that's a cleanup
- i just fixed this in x86.git.)

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


clameter at sgi

Apr 15, 2008, 5:45 PM

Post #11 of 26 (346 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

> fix memory corruption and crash on 32-bit x86 systems.
>
> if a !PAE x86 kernel is booted on a 32-bit system with more than
> 4GB of RAM, then we call memory_present() with a start/end that
> goes outside the scope of MAX_PHYSMEM_BITS.

So its a general issue that has been there for years that we are now
noticing because we are now testing with memory sizes > 4GB. This also
affects the enterprise releases (SLES10, RHEL5). Argh!

I wonder why this did not show up earlier in testing? Running a kernel
that cannot access all of memory is unusual I guess.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


clameter at sgi

Apr 15, 2008, 5:46 PM

Post #12 of 26 (346 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

On Wed, 16 Apr 2008, Ingo Molnar wrote:

> this is the only call to memory_present() we do in 32-bit arch setup, so
> it's required.

We could clip there if SPARSEMEM is configured. I wonder if this affects
other platforms that need HIGHMEM support?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


mingo at elte

Apr 15, 2008, 5:52 PM

Post #13 of 26 (346 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

* Christoph Lameter <clameter [at] sgi> wrote:

> I wonder why this did not show up earlier in testing? Running a kernel
> that cannot access all of memory is unusual I guess.

i guess people saw the "you are not running a PAE kernel" warning and
went to a PAE kernel which didnt have this issue.

OTOH, quite a few testers consciously use non-PAE kernels on 4GB
systems, so i'd not be surprised if this solved a few mystery
regressions we have.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


mingo at elte

Apr 15, 2008, 5:52 PM

Post #14 of 26 (346 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

* Christoph Lameter <clameter [at] sgi> wrote:

> On Wed, 16 Apr 2008, Ingo Molnar wrote:
>
> > this is the only call to memory_present() we do in 32-bit arch
> > setup, so it's required.
>
> We could clip there if SPARSEMEM is configured. I wonder if this
> affects other platforms that need HIGHMEM support?

clip where and what?

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


yhlu.kernel at gmail

Apr 15, 2008, 5:56 PM

Post #15 of 26 (346 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

On Tue, Apr 15, 2008 at 5:44 PM, Ingo Molnar <mingo [at] elte> wrote:
>
> * Yinghai Lu <yhlu.kernel [at] gmail> wrote:
>
> >
>
> > can you check why find_max_pfn() e820_32.c need to call
> > memory_present? wonder if it can be removed.
>
> this is the only call to memory_present() we do in 32-bit arch setup, so
> it's required.
>
> (the function find_max_pfn() is woefully misnamed, but that's a cleanup
> - i just fixed this in x86.git.)

64 bit is calling that via paging_init
==>sparse_memory_present_with_active_regions(MAX_NUMNODES).

and
void __init sparse_memory_present_with_active_regions(int nid)
{
int i;

for_each_active_range_index_in_nid(i, nid)
memory_present(early_node_map[i].nid,
early_node_map[i].start_pfn,
early_node_map[i].end_pfn);
}

that is some late than 32 bit.

YH
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


mingo at elte

Apr 15, 2008, 6:02 PM

Post #16 of 26 (348 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

* Yinghai Lu <yhlu.kernel [at] gmail> wrote:

> On Tue, Apr 15, 2008 at 5:44 PM, Ingo Molnar <mingo [at] elte> wrote:
> >
> > * Yinghai Lu <yhlu.kernel [at] gmail> wrote:
> >
> > > > + unsigned long max_arch_pfn = 1ULL << (MAX_PHYSMEM_BITS-PAGE_SHIFT);
> > > >
> > > > and also check my analysis whether it is correct and whether it
> > > > matches the reported bug patterns. But otherwise the fix looks like
> > > > a safe fix for v2.6.25-final to me - it only filters out values
> > > > from sparsemem input that are nonsensical in the sparsemem
> > > > framework anyway.
> > >
> >
> > > can you check why find_max_pfn() e820_32.c need to call
> > > memory_present? wonder if it can be removed.
> >
> > this is the only call to memory_present() we do in 32-bit arch setup, so
> > it's required.
> >
> > (the function find_max_pfn() is woefully misnamed, but that's a cleanup
> > - i just fixed this in x86.git.)
>
> 64 bit is calling that via paging_init
> ==>sparse_memory_present_with_active_regions(MAX_NUMNODES).
>
> and
> void __init sparse_memory_present_with_active_regions(int nid)

yeah - 64-bit is different here and it's not affected by the problem
because there SECTION_SIZE_BITS is 27 (==128 MB chunks),
MAX_PHYSADDR_BITS is 40 (== 1 TB) - giving 8192 section map entries.
Once larger than 1 TB 64-bit x86 systems are created MAX_PHYSADDR_BITS
needs to be increased.

The only downside of the current setup on 64-bit is that it wastes 128K
of RAM on the majority of systems. We could perhaps try a shift of 28,
which halves the footprint to 64K of RAM, and which still is good enough
to allow the PCI aperture to remain a hole on most systems. It would
also compress the data-cache footprint of the sparse memory maps.
(without having to use sparsemem-extreme indirection)

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


mingo at elte

Apr 15, 2008, 6:14 PM

Post #17 of 26 (346 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

* Ingo Molnar <mingo [at] elte> wrote:

> but the exact incarnation depended on random details like configs,
> which made this bug more prominent in v2.6.25-to-be.

i believe this was the reason why my many bisection attempts were
unsuccessful: the bug pattern was not stable and seemingly working
kernels had the memory corruption too. It was pure luck that v2.6.24
"worked" and v2.6.25-rc9 broke visibly.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


mingo at elte

Apr 15, 2008, 6:17 PM

Post #18 of 26 (346 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

* Ingo Molnar <mingo [at] elte> wrote:

> > > this is the only call to memory_present() we do in 32-bit arch
> > > setup, so it's required.
> >
> > We could clip there if SPARSEMEM is configured. I wonder if this
> > affects other platforms that need HIGHMEM support?
>
> clip where and what?

i.e. as per my previous argument i'd consider the need to sanitize the
calls in the architecture fundamentally wrong.

whether the core code emits a warning or allows the call is an
additional question i mention in the changelog - but the core sparse
memory code should _definitely_ not silently overflow a key internal
array ... (of which data structure the architecture code is not even
aware of)

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


yhlu.kernel at gmail

Apr 15, 2008, 6:17 PM

Post #19 of 26 (346 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

On Tue, Apr 15, 2008 at 6:02 PM, Ingo Molnar <mingo [at] elte> wrote:
>
> * Yinghai Lu <yhlu.kernel [at] gmail> wrote:
>
> > On Tue, Apr 15, 2008 at 5:44 PM, Ingo Molnar <mingo [at] elte> wrote:
> > >
> > > * Yinghai Lu <yhlu.kernel [at] gmail> wrote:
> > >
> > > > > + unsigned long max_arch_pfn = 1ULL << (MAX_PHYSMEM_BITS-PAGE_SHIFT);
> > > > >
> > > > > and also check my analysis whether it is correct and whether it
> > > > > matches the reported bug patterns. But otherwise the fix looks like
> > > > > a safe fix for v2.6.25-final to me - it only filters out values
> > > > > from sparsemem input that are nonsensical in the sparsemem
> > > > > framework anyway.
> > > >
> > >
> > > > can you check why find_max_pfn() e820_32.c need to call
> > > > memory_present? wonder if it can be removed.
> > >
> > > this is the only call to memory_present() we do in 32-bit arch setup, so
> > > it's required.
> > >
> > > (the function find_max_pfn() is woefully misnamed, but that's a cleanup
> > > - i just fixed this in x86.git.)
> >
> > 64 bit is calling that via paging_init
> > ==>sparse_memory_present_with_active_regions(MAX_NUMNODES).
> >
> > and
> > void __init sparse_memory_present_with_active_regions(int nid)
>
> yeah - 64-bit is different here and it's not affected by the problem
> because there SECTION_SIZE_BITS is 27 (==128 MB chunks),
> MAX_PHYSADDR_BITS is 40 (== 1 TB) - giving 8192 section map entries.
> Once larger than 1 TB 64-bit x86 systems are created MAX_PHYSADDR_BITS
> needs to be increased.

also 64 bit
early_node_map[10] active PFN ranges
0: 0 -> 149
0: 256 -> 917408
0: 1048576 -> 8519680
1: 8519680 -> 16908288
2: 16908288 -> 25296896
3: 25296896 -> 33685504
4: 33685504 -> 42074112
5: 42074112 -> 50462720
6: 50462720 -> 58851328
7: 58851328 -> 67239936

and 32 bit only has one entry
[ 0.000000] early_node_map[1] active PFN ranges
[ 0.000000] 0: 0 -> 1048576

YH
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


yhlu.kernel at gmail

Apr 15, 2008, 6:30 PM

Post #20 of 26 (345 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

On Tue, Apr 15, 2008 at 6:17 PM, Ingo Molnar <mingo [at] elte> wrote:
>
> * Ingo Molnar <mingo [at] elte> wrote:
>
> > > > this is the only call to memory_present() we do in 32-bit arch
> > > > setup, so it's required.
> > >
> > > We could clip there if SPARSEMEM is configured. I wonder if this
> > > affects other platforms that need HIGHMEM support?
> >
> > clip where and what?
>
> i.e. as per my previous argument i'd consider the need to sanitize the
> calls in the architecture fundamentally wrong.
>
> whether the core code emits a warning or allows the call is an
> additional question i mention in the changelog - but the core sparse
> memory code should _definitely_ not silently overflow a key internal
> array ... (of which data structure the architecture code is not even
> aware of)

or you can move that check into find_max_pfn for x86_32? so it will
not affect other platform regarding Christoph's concern?

YH
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


kamezawa.hiroyu at jp

Apr 15, 2008, 6:48 PM

Post #21 of 26 (340 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

I'm sorry to be too late here..

On Wed, 16 Apr 2008 02:03:56 +0200
Ingo Molnar <mingo [at] elte> wrote:

> Signed-off-by: Ingo Molnar <mingo [at] elte>
> ---
> mm/sparse.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> Index: linux/mm/sparse.c
> ===================================================================
> --- linux.orig/mm/sparse.c
> +++ linux/mm/sparse.c
> @@ -149,8 +149,18 @@ static inline int sparse_early_nid(struc
> /* Record a memory area against a node. */
> void __init memory_present(int nid, unsigned long start, unsigned long end)
> {
> + unsigned long max_arch_pfn = 1ULL << (MAX_PHYSMEM_BITS-PAGE_SHIFT);
> unsigned long pfn;
>
how about

max_arch_pfn = NR_MEM_SECTIONS * PAGES_PER_SECTION.

?
Thanks,
-Kame

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


yhlu.kernel at gmail

Apr 15, 2008, 7:00 PM

Post #22 of 26 (340 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

On Tue, Apr 15, 2008 at 6:30 PM, Yinghai Lu <yhlu.kernel [at] gmail> wrote:
>
> On Tue, Apr 15, 2008 at 6:17 PM, Ingo Molnar <mingo [at] elte> wrote:
> >
> > * Ingo Molnar <mingo [at] elte> wrote:
> >
> > > > > this is the only call to memory_present() we do in 32-bit arch
> > > > > setup, so it's required.
> > > >
> > > > We could clip there if SPARSEMEM is configured. I wonder if this
> > > > affects other platforms that need HIGHMEM support?
> > >
> > > clip where and what?
> >
> > i.e. as per my previous argument i'd consider the need to sanitize the
> > calls in the architecture fundamentally wrong.
> >
> > whether the core code emits a warning or allows the call is an
> > additional question i mention in the changelog - but the core sparse
> > memory code should _definitely_ not silently overflow a key internal
> > array ... (of which data structure the architecture code is not even
> > aware of)
>
> or you can move that check into find_max_pfn for x86_32? so it will
> not affect other platform regarding Christoph's concern?
>
the patch doesn't have side effects on x86_64.

YH
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


kamezawa.hiroyu at jp

Apr 15, 2008, 7:20 PM

Post #23 of 26 (342 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

On Tue, 15 Apr 2008 19:00:18 -0700
"Yinghai Lu" <yhlu.kernel [at] gmail> wrote:

> On Tue, Apr 15, 2008 at 6:30 PM, Yinghai Lu <yhlu.kernel [at] gmail> wrote:
> >
> > On Tue, Apr 15, 2008 at 6:17 PM, Ingo Molnar <mingo [at] elte> wrote:
> > >
> > > * Ingo Molnar <mingo [at] elte> wrote:
> > >
> > > > > > this is the only call to memory_present() we do in 32-bit arch
> > > > > > setup, so it's required.
> > > > >
> > > > > We could clip there if SPARSEMEM is configured. I wonder if this
> > > > > affects other platforms that need HIGHMEM support?
> > > >
> > > > clip where and what?
> > >
> > > i.e. as per my previous argument i'd consider the need to sanitize the
> > > calls in the architecture fundamentally wrong.
> > >
> > > whether the core code emits a warning or allows the call is an
> > > additional question i mention in the changelog - but the core sparse
> > > memory code should _definitely_ not silently overflow a key internal
> > > array ... (of which data structure the architecture code is not even
> > > aware of)
> >
> > or you can move that check into find_max_pfn for x86_32? so it will
> > not affect other platform regarding Christoph's concern?
> >
> the patch doesn't have side effects on x86_64.
>
also no side effects on my ia64/NUMA box, which has sparse physical memory map.

Thanks,
-Kame



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


torvalds at linux-foundation

Apr 15, 2008, 7:45 PM

Post #24 of 26 (340 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

On Wed, 16 Apr 2008, Ingo Molnar wrote:
>
> small addendum to the changelog:

Ok, you didn't make that addendum to your second version, so I added it
myself.

Anyway, good job. I've pushed this out, and will let this simmer at least
overnight to see if there are any brown-paper-bag issues (either with this
or with some last changes from Andrew), but I'm happy, and I think I'll do
the real 2.6.25 tomorrow.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


mel at csn

Apr 16, 2008, 7:05 AM

Post #25 of 26 (333 views)
Permalink
Re: [patch] mm: sparsemem memory_present() memory corruption fix [In reply to]

On (16/04/08 02:03), Ingo Molnar didst pronounce:
>
> finally found it ... the patch below solves the sparsemem crash and the
> testsystem boots up fine now:
>
> mars:~> uname -a
> Linux mars 2.6.25-rc9-sched-devel.git-x86-latest.git #985 SMP Wed Apr 16
> 01:37:37 CEST 2008 i686 i686 i386 GNU/Linux
>
> yay! :-)
>

Very cool :) This fixed the silent lock-up that I was getting when using
your config as well.

At a bit of a loss yesterday to explain what was going wrong, I had started
putting together patches to sanity check memory initialisation at various
different stages trying to catch where things were going pear-shaped. You
found the bug before it was done but I finished the basics anyway and posted
it as "[RFC] Verification and debugging of memory initialisation". Something
like it may help avoid similar headaches for people who tend to run into
(or cause) boot problems.

> ps. anyone who can correctly guess the method with which i found the
> exact place that corrupted memory will get a free beer next time we
> meet :-)
>
> ------------------------->
> Subject: mm: sparsemem memory_present() memory corruption fix
> From: Ingo Molnar <mingo [at] elte>
> Date: Wed Apr 16 01:40:00 CEST 2008
>
> fix memory corruption and crash on 32-bit x86 systems.
>
> if a !PAE x86 kernel is booted on a 32-bit system with more than
> 4GB of RAM, then we call memory_present() with a start/end that
> goes outside the scope of MAX_PHYSMEM_BITS.
>
> that causes this loop to happily walk over the limit of the
> sparse memory section map:
>
> for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
> unsigned long section = pfn_to_section_nr(pfn);
> struct mem_section *ms;
>
> sparse_index_init(section, nid);
> set_section_nid(section, nid);
>
> ms = __nr_to_section(section);
> if (!ms->section_mem_map)
> ms->section_mem_map = sparse_encode_early_nid(nid) |
>
> 'ms' will be out of bounds and we'll corrupt a small amount of memory by
> encoding the node ID. Depending on what that memory is, we might crash,
> misbehave or just not notice the bug.
>
> the fix is to sanity check anything the architecture passes to sparsemem.
>
> this bug seems to be rather old (as old as sparsemem support itself),
> but the exact incarnation depended on random details like configs,
> which made this bug more prominent in v2.6.25-to-be.
>
> an additional enhancement might be to print a warning about ignored
> or trimmed memory ranges.
>
> Signed-off-by: Ingo Molnar <mingo [at] elte>
> ---
> mm/sparse.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> Index: linux/mm/sparse.c
> ===================================================================
> --- linux.orig/mm/sparse.c
> +++ linux/mm/sparse.c
> @@ -149,8 +149,18 @@ static inline int sparse_early_nid(struc
> /* Record a memory area against a node. */
> void __init memory_present(int nid, unsigned long start, unsigned long end)
> {
> + unsigned long max_arch_pfn = 1ULL << (MAX_PHYSMEM_BITS-PAGE_SHIFT);
> unsigned long pfn;
>
> + /*
> + * Sanity checks - do not allow an architecture to pass
> + * in larger pfns than the maximum scope of sparsemem:
> + */
> + if (start >= max_arch_pfn)
> + return;
> + if (end >= max_arch_pfn)
> + end = max_arch_pfn;
> +
> start &= PAGE_SECTION_MASK;
> for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
> unsigned long section = pfn_to_section_nr(pfn);
>

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

First page Previous page 1 2 Next page Last page  View All Linux kernel RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.