Signed-off-by: Andrea Arcangeli

This prefers to allocate zero pages (even if this will empty the zero
quicklist) before falling back to the non-scalable buddy allocator.
This should scale better, though the 200% boost in the microbenchmark
isn't always guaranteed this way (the idle task isn't allowed to refill
from buddy to the hot-cold list).

This is incremental on top of PG_zero-2, of course.

--- sles/mm/page_alloc.c.~1~	2004-10-30 05:39:19.000000000 +0200
+++ sles/mm/page_alloc.c	2004-10-30 16:43:31.944830688 +0200
@@ -800,7 +800,8 @@ __alloc_pages(unsigned int gfp_mask, uns
 
 	/* zero pages can only be provided of order 0 */
 	BUG_ON(zero && order);
-	BUG_ON((gfp_mask & __GFP_ONLY_ZERO) && !(gfp_mask & __GFP_ZERO));
+	/* __GFP_ONLY_ZERO requires __GFP_ZERO set to work correctly */
+	BUG_ON((gfp_mask & __GFP_ONLY_ZERO) && !zero);
 
 	/*
 	 * We can't allocate from the buddy allocator before even trying
@@ -811,13 +812,17 @@ __alloc_pages(unsigned int gfp_mask, uns
 	 * efficintly the per-cpu-page resources for this allocation
 	 * (as worse we'll leave something available for the next one ;).
 	 *
-	 * pass 0 is for the per-cpu-zero list (this is the only "zero" allocator).
-	 * pass 1 is for the per-cpu-hot-cold list.
-	 * pass 2 is the buddy allocator.
+	 * For regular allocations:
+	 * pass 0 is for the per-cpu-hot-cold list.
+	 * pass 1 is for the per-cpu-zero list.
+	 * pass 2 is the buddy allocator.
+	 *
+	 * For __GFP_ZERO allocations
+	 * pass 0 is for the per-cpu-zero list.
+	 * pass 1 is for the per-cpu-hot-cold list.
+	 * pass 2 is the buddy allocator.
 	 */
 	pass = 0;
-	if (!zero)
-		pass = 1;
 	if (unlikely(order))
 		pass = 2;
 	for (; pass < 3; pass++) {
@@ -840,10 +845,18 @@ __alloc_pages(unsigned int gfp_mask, uns
 					goto got_pg;
 				}
 			}
-			if (gfp_mask & __GFP_ONLY_ZERO)
-				return NULL;
-			/* downgrade to hot-cold per-cpu-page list */
-			gfp_mask &= ~__GFP_ZERO;
+			if (!pass) {
+				if (gfp_mask & __GFP_ONLY_ZERO)
+					return NULL;
+				/* try the other quicklist */
+				gfp_mask ^= __GFP_ZERO;
+			} else
+				/*
+				 * Only use the non-zero quicklist from now on
+				 * the VM will not release zeroed memory during
+				 * paging.
+				 */
+				gfp_mask &= ~__GFP_ZERO;
 		}
 
 		for (i = 0; (z = zones[i]) != NULL; i++)
@@ -941,7 +953,7 @@ nopage:
 	}
 	return NULL;
 got_pg:
-	if (pass == 0) {
+	if (pass == 0 && zero) {
 		debug_page_zero(page);
 		SetPageZero(page);
 	}
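
For illustration only, here is a rough user-space model of the order-0 pass
ordering this patch sets up. It is a sketch, not the kernel code: the
SIM_* flags, struct sim_state, struct sim_result and sim_alloc() are all
hypothetical names, the three page sources are reduced to plain counters,
and the "pass < 2" split between the quicklists and the buddy allocator is
an assumption about the surrounding loop, which the hunks above don't show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for __GFP_ZERO and __GFP_ONLY_ZERO. */
#define SIM_GFP_ZERO      0x1u
#define SIM_GFP_ONLY_ZERO 0x2u

enum sim_source { SIM_NONE, SIM_ZERO_LIST, SIM_HOT_COLD_LIST, SIM_BUDDY };

/* Each page source is modelled as a simple count of available pages. */
struct sim_state {
	int zero_pages;      /* per-cpu zero quicklist */
	int hot_cold_pages;  /* per-cpu hot-cold quicklist */
	int buddy_pages;     /* buddy allocator */
};

struct sim_result {
	enum sim_source source;  /* where the page came from */
	bool mark_zero;          /* would got_pg do SetPageZero()? */
};

/* Model of an order-0 allocation following the pass order in the patch. */
static struct sim_result sim_alloc(struct sim_state *s, unsigned int gfp_mask)
{
	struct sim_result res = { SIM_NONE, false };
	const bool zero = (gfp_mask & SIM_GFP_ZERO) != 0;
	int pass;

	/* mirrors BUG_ON((gfp_mask & __GFP_ONLY_ZERO) && !zero) */
	assert(!((gfp_mask & SIM_GFP_ONLY_ZERO) && !zero));

	for (pass = 0; pass < 3; pass++) {
		if (pass < 2) {
			/* the current mask selects which quicklist to try */
			if (gfp_mask & SIM_GFP_ZERO) {
				if (s->zero_pages > 0) {
					s->zero_pages--;
					res.source = SIM_ZERO_LIST;
					break;
				}
			} else if (s->hot_cold_pages > 0) {
				s->hot_cold_pages--;
				res.source = SIM_HOT_COLD_LIST;
				break;
			}
			if (!pass) {
				if (gfp_mask & SIM_GFP_ONLY_ZERO)
					return res;
				/* try the other quicklist */
				gfp_mask ^= SIM_GFP_ZERO;
			} else {
				/* only non-zero pages from now on */
				gfp_mask &= ~SIM_GFP_ZERO;
			}
		} else if (s->buddy_pages > 0) {
			s->buddy_pages--;
			res.source = SIM_BUDDY;
			break;
		}
	}

	/* mirrors "if (pass == 0 && zero)" at got_pg */
	res.mark_zero = (res.source != SIM_NONE && pass == 0 && zero);
	return res;
}
```

In this model a regular allocation with an empty hot-cold list is served
from the zero quicklist in pass 1 and leaves the buddy counter untouched,
while a __GFP_ZERO allocation that has to fall back to the hot-cold list
is not flagged as zeroed.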