
Mailing List Archive: Linux: Kernel

[PATCH] MM : alloc_large_system_hash() can free some memory for non power-of-two bucketsize


dada1 at cosmosbay

May 18, 2007, 2:54 AM

Post #1 of 9
[PATCH] MM : alloc_large_system_hash() can free some memory for non power-of-two bucketsize

alloc_large_system_hash() is called at boot time to allocate space for several large hash tables.

The TCP hash table was recently changed, and its bucket size is no longer a power of two.

On most setups, alloc_large_system_hash() allocates one high-order block of pages (order > 0) with __get_free_pages(GFP_ATOMIC, order). This single high-order block has a power-of-two size, which can be larger than the size actually needed.

We can free all the pages that won't be used by the hash table.

On a 1GB i386 machine, this patch saves 128 KB of LOWMEM memory.

TCP established hash table entries: 32768 (order: 6, 393216 bytes)

Signed-off-by: Eric Dumazet <dada1 [at] cosmosbay>
---
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ae96dd8..2e0ba08 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3350,6 +3350,20 @@ void *__init alloc_large_system_hash(const char *tablename,
 			for (order = 0; ((1UL << order) << PAGE_SHIFT) < size; order++)
 				;
 			table = (void*) __get_free_pages(GFP_ATOMIC, order);
+			/*
+			 * If bucketsize is not a power-of-two, we may free
+			 * some pages at the end of hash table.
+			 */
+			if (table) {
+				unsigned long alloc_end = (unsigned long)table +
+						(PAGE_SIZE << order);
+				unsigned long used = (unsigned long)table +
+						PAGE_ALIGN(size);
+				while (used < alloc_end) {
+					free_page(used);
+					used += PAGE_SIZE;
+				}
+			}
 		}
 	} while (!table && size > PAGE_SIZE && --log2qty);

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo [at] vger
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


clameter at sgi

May 18, 2007, 11:21 AM

Post #2 of 9
Re: [PATCH] MM : alloc_large_system_hash() can free some memory for non power-of-two bucketsize

On Fri, 18 May 2007, Eric Dumazet wrote:

> table = (void*) __get_free_pages(GFP_ATOMIC, order);

ATOMIC? Is there some reason why we need atomic here?

> + /*
> + * If bucketsize is not a power-of-two, we may free
> + * some pages at the end of hash table.
> + */
> + if (table) {
> + unsigned long alloc_end = (unsigned long)table +
> + (PAGE_SIZE << order);
> + unsigned long used = (unsigned long)table +
> + PAGE_ALIGN(size);
> + while (used < alloc_end) {
> + free_page(used);

Isn't this going to interfere with the kernel_map_pages debug stuff?



akpm at linux-foundation

May 19, 2007, 1:37 AM

Post #3 of 9
Re: [PATCH] MM : alloc_large_system_hash() can free some memory for non power-of-two bucketsize

On Fri, 18 May 2007 11:54:54 +0200 Eric Dumazet <dada1 [at] cosmosbay> wrote:

> alloc_large_system_hash() is called at boot time to allocate space for several large hash tables.
>
> Lately, TCP hash table was changed and its bucketsize is not a power-of-two anymore.
>
> On most setups, alloc_large_system_hash() allocates one big page (order > 0) with __get_free_pages(GFP_ATOMIC, order). This single high_order page has a power-of-two size, bigger than the needed size.

Watch the 200-column text, please.

> We can free all pages that wont be used by the hash table.
>
> On a 1GB i386 machine, this patch saves 128 KB of LOWMEM memory.
>
> TCP established hash table entries: 32768 (order: 6, 393216 bytes)
>
> Signed-off-by: Eric Dumazet <dada1 [at] cosmosbay>
> ---
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index ae96dd8..2e0ba08 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3350,6 +3350,20 @@ void *__init alloc_large_system_hash(const char *tablename,
> for (order = 0; ((1UL << order) << PAGE_SHIFT) < size; order++)
> ;
> table = (void*) __get_free_pages(GFP_ATOMIC, order);
> + /*
> + * If bucketsize is not a power-of-two, we may free
> + * some pages at the end of hash table.
> + */
> + if (table) {
> + unsigned long alloc_end = (unsigned long)table +
> + (PAGE_SIZE << order);
> + unsigned long used = (unsigned long)table +
> + PAGE_ALIGN(size);
> + while (used < alloc_end) {
> + free_page(used);
> + used += PAGE_SIZE;
> + }
> + }
> }
> } while (!table && size > PAGE_SIZE && --log2qty);
>

It went BUG.

static inline int put_page_testzero(struct page *page)
{
VM_BUG_ON(atomic_read(&page->_count) == 0);
return atomic_dec_and_test(&page->_count);
}

http://userweb.kernel.org/~akpm/s5000523.jpg
http://userweb.kernel.org/~akpm/config-vmm.txt


dada1 at cosmosbay

May 19, 2007, 11:07 AM

Post #4 of 9
Re: [PATCH] MM : alloc_large_system_hash() can free some memory for non power-of-two bucketsize

Andrew Morton wrote:
> It went BUG.
>
> static inline int put_page_testzero(struct page *page)
> {
> VM_BUG_ON(atomic_read(&page->_count) == 0);
> return atomic_dec_and_test(&page->_count);
> }
>
> http://userweb.kernel.org/~akpm/s5000523.jpg
> http://userweb.kernel.org/~akpm/config-vmm.txt

I see :(

Maybe David has an idea of how this can be done properly?

ref : http://marc.info/?l=linux-netdev&m=117706074825048&w=2




wli at holomorphy

May 19, 2007, 11:21 AM

Post #5 of 9
Re: [PATCH] MM : alloc_large_system_hash() can free some memory for non power-of-two bucketsize

On Fri, May 18, 2007 at 11:54:54AM +0200, Eric Dumazet wrote:
> alloc_large_system_hash() is called at boot time to allocate space
> for several large hash tables.
> Lately, TCP hash table was changed and its bucketsize is not a
> power-of-two anymore.
> On most setups, alloc_large_system_hash() allocates one big page
> (order > 0) with __get_free_pages(GFP_ATOMIC, order). This single
> high_order page has a power-of-two size, bigger than the needed size.
> We can free all pages that wont be used by the hash table.
> On a 1GB i386 machine, this patch saves 128 KB of LOWMEM memory.
> TCP established hash table entries: 32768 (order: 6, 393216 bytes)

The proper way to do this is to convert the large system hashtable
users to use some data structure / algorithm other than hashing by
separate chaining.


-- wli


dada1 at cosmosbay

May 19, 2007, 11:41 AM

Post #6 of 9
Re: [PATCH] MM : alloc_large_system_hash() can free some memory for non power-of-two bucketsize

William Lee Irwin III wrote:
> On Fri, May 18, 2007 at 11:54:54AM +0200, Eric Dumazet wrote:
>> alloc_large_system_hash() is called at boot time to allocate space
>> for several large hash tables.
>> Lately, TCP hash table was changed and its bucketsize is not a
>> power-of-two anymore.
>> On most setups, alloc_large_system_hash() allocates one big page
>> (order > 0) with __get_free_pages(GFP_ATOMIC, order). This single
>> high_order page has a power-of-two size, bigger than the needed size.
>> We can free all pages that wont be used by the hash table.
>> On a 1GB i386 machine, this patch saves 128 KB of LOWMEM memory.
>> TCP established hash table entries: 32768 (order: 6, 393216 bytes)
>
> The proper way to do this is to convert the large system hashtable
> users to use some data structure / algorithm other than hashing by
> separate chaining.

No thanks. This was already discussed to death on netdev. To date, hash tables
are a good compromise.

I don't mind losing some memory; I prefer to keep good performance when
handling 1,000,000 or more TCP sessions.



davem at davemloft

May 19, 2007, 11:54 AM

Post #7 of 9
Re: [PATCH] MM : alloc_large_system_hash() can free some memory for non power-of-two bucketsize

From: Eric Dumazet <dada1 [at] cosmosbay>
Date: Sat, 19 May 2007 20:07:11 +0200

> Maybe David has an idea how this can be done properly ?
>
> ref : http://marc.info/?l=linux-netdev&m=117706074825048&w=2

You need to use __GFP_COMP or similar to make this splitting+freeing
thing work.

Otherwise the individual pages don't have page references, only
the head page of the high-order page will.


dada1 at cosmosbay

May 19, 2007, 1:36 PM

Post #8 of 9
Re: [PATCH] MM : alloc_large_system_hash() can free some memory for non power-of-two bucketsize

David Miller wrote:
> From: Eric Dumazet <dada1 [at] cosmosbay>
> Date: Sat, 19 May 2007 20:07:11 +0200
>
>> Maybe David has an idea how this can be done properly ?
>>
>> ref : http://marc.info/?l=linux-netdev&m=117706074825048&w=2
>
> You need to use __GFP_COMP or similar to make this splitting+freeing
> thing work.
>
> Otherwise the individual pages don't have page references, only
> the head page of the high-order page will.
>

Oh thanks David for the hint.

I added a split_page() call and it seems to work now.


[PATCH] MM : alloc_large_system_hash() can free some memory for non
power-of-two bucketsize

alloc_large_system_hash() is called at boot time to allocate space for several
large hash tables.

The TCP hash table was recently changed, and its bucket size is no longer a
power of two.

On most setups, alloc_large_system_hash() allocates one high-order block of
pages (order > 0) with __get_free_pages(GFP_ATOMIC, order). This single
high-order block has a power-of-two size, which can be larger than the size
actually needed.

We can free all the pages that won't be used by the hash table.

On a 1GB i386 machine, this patch saves 128 KB of LOWMEM memory.

TCP established hash table entries: 32768 (order: 6, 393216 bytes)

Signed-off-by: Eric Dumazet <dada1 [at] cosmosbay>
Attachments: alloc_large.patch (0.80 KB)


wli at holomorphy

May 21, 2007, 1:11 AM

Post #9 of 9
Re: [PATCH] MM : alloc_large_system_hash() can free some memory for non power-of-two bucketsize

William Lee Irwin III wrote:
>> The proper way to do this is to convert the large system hashtable
>> users to use some data structure / algorithm other than hashing by
>> separate chaining.

On Sat, May 19, 2007 at 08:41:01PM +0200, Eric Dumazet wrote:
> No thanks. This was already discussed to death on netdev. To date, hash
> tables are a good compromise.
> I dont mind losing part of memory, I prefer to keep good performance when
> handling 1.000.000 or more tcp sessions.

The data structures perform well enough, but I suppose it's not worth
pushing the issue this way.


-- wli
