From: Nick Piggin

Mempool is pretty clever. Looks too clever for its own good :) It
shouldn't really know so much about page reclaim internals.

- don't guess about what effective page reclaim might involve.

- don't randomly flush out all dirty data if some unlikely thing
  happens (alloc returns NULL). page reclaim can (sort of :P) handle
  it.

I think the main motivation is trying to avoid pool->lock at all
costs. However the first allocation is attempted with __GFP_WAIT
cleared, so it will be 'can_try_harder' if it hits the page allocator.
So if allocation still fails, then we can probably afford to hit the
pool->lock - and what's the alternative? Try page reclaim and hit
zone->lru_lock?

A nice upshot is that we don't need to do any fancy memory barriers or
do (intentionally) racy access to pool-> fields outside the lock.

Signed-off-by: Nick Piggin
Signed-off-by: Andrew Morton
---

 mm/mempool.c |   30 +++++++++---------------------
 1 files changed, 9 insertions(+), 21 deletions(-)

diff -puN mm/mempool.c~mempool-simplify-alloc mm/mempool.c
--- 25/mm/mempool.c~mempool-simplify-alloc	2005-04-26 05:18:17.000000000 -0700
+++ 25-akpm/mm/mempool.c	2005-04-26 05:18:17.000000000 -0700
@@ -198,36 +198,22 @@ void * mempool_alloc(mempool_t *pool, un
 	void *element;
 	unsigned long flags;
 	DEFINE_WAIT(wait);
-	int gfp_nowait;
+	int gfp_temp;
+
+	might_sleep_if(gfp_mask & __GFP_WAIT);
 
 	gfp_mask |= __GFP_NOMEMALLOC;	/* don't allocate emergency reserves */
 	gfp_mask |= __GFP_NORETRY;	/* don't loop in __alloc_pages */
 	gfp_mask |= __GFP_NOWARN;	/* failures are OK */
-	gfp_nowait = gfp_mask & ~(__GFP_WAIT | __GFP_IO);
 
-	might_sleep_if(gfp_mask & __GFP_WAIT);
+	gfp_temp = gfp_mask & ~__GFP_WAIT;
+
 repeat_alloc:
-	element = pool->alloc(gfp_nowait, pool->pool_data);
+
+	element = pool->alloc(gfp_temp, pool->pool_data);
 	if (likely(element != NULL))
 		return element;
 
-	/*
-	 * If the pool is less than 50% full and we can perform effective
-	 * page reclaim then try harder to allocate an element.
-	 */
-	mb();
-	if ((gfp_mask & __GFP_FS) && (gfp_mask != gfp_nowait) &&
-				(pool->curr_nr <= pool->min_nr/2)) {
-		element = pool->alloc(gfp_mask, pool->pool_data);
-		if (likely(element != NULL))
-			return element;
-	}
-
-	/*
-	 * Kick the VM at this point.
-	 */
-	wakeup_bdflush(0);
-
 	spin_lock_irqsave(&pool->lock, flags);
 	if (likely(pool->curr_nr)) {
 		element = remove_element(pool);
@@ -240,6 +226,8 @@ repeat_alloc:
 	if (!(gfp_mask & __GFP_WAIT))
 		return NULL;
 
+	/* Now start performing page reclaim */
+	gfp_temp = gfp_mask;
 	prepare_to_wait(&pool->wait, &wait, TASK_UNINTERRUPTIBLE);
 	mb();
 	if (!pool->curr_nr)
_
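
For anyone following the control flow rather than the diff, the ordering after this patch can be roughly sketched as a small userspace model. This is not kernel code and only an illustration under assumptions: sketch_pool, SKETCH_WAIT, sketch_backing_alloc and the atomic_flag spinlock are made-up stand-ins for mempool_t, __GFP_WAIT, pool->alloc and pool->lock. Where the kernel would sleep on pool->wait and loop, the sketch just retries once with the full flags and gives up.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_WAIT   0x1u	/* hypothetical stand-in for __GFP_WAIT */
#define SKETCH_MIN_NR 2		/* size of the reserve in this model */

struct sketch_pool {
	atomic_flag lock;		/* stand-in for pool->lock */
	int curr_nr;			/* reserved elements still available */
	void *elements[SKETCH_MIN_NR];
	void *(*alloc)(unsigned int flags, void *pool_data);
	void *pool_data;
};

/* Hypothetical backing allocator: succeeds while *budget is positive. */
void *sketch_backing_alloc(unsigned int flags, void *pool_data)
{
	int *budget = pool_data;

	(void)flags;	/* a real allocator would try harder when waiting is allowed */
	if (*budget <= 0)
		return NULL;
	(*budget)--;
	return malloc(16);
}

/* Fill the reserve up front; returns 0 on success, -1 on failure. */
int sketch_pool_init(struct sketch_pool *pool,
		     void *(*alloc)(unsigned int, void *), void *pool_data)
{
	atomic_flag_clear(&pool->lock);
	pool->alloc = alloc;
	pool->pool_data = pool_data;
	for (pool->curr_nr = 0; pool->curr_nr < SKETCH_MIN_NR; pool->curr_nr++) {
		pool->elements[pool->curr_nr] = malloc(16);
		if (pool->elements[pool->curr_nr] == NULL)
			return -1;
	}
	return 0;
}

void *sketch_pool_alloc(struct sketch_pool *pool, unsigned int flags)
{
	unsigned int temp = flags & ~SKETCH_WAIT;	/* first try never waits */
	void *element;

	assert(pool != NULL && pool->alloc != NULL);
	for (;;) {
		element = pool->alloc(temp, pool->pool_data);
		if (element != NULL)
			return element;

		/* Backing allocator failed: take the lock and use the reserve. */
		while (atomic_flag_test_and_set(&pool->lock))
			;	/* spin */
		element = pool->curr_nr > 0 ? pool->elements[--pool->curr_nr] : NULL;
		atomic_flag_clear(&pool->lock);
		if (element != NULL)
			return element;

		if (!(flags & SKETCH_WAIT))
			return NULL;	/* caller cannot wait */
		if (temp == flags)
			return NULL;	/* the kernel would sleep and loop here */
		temp = flags;		/* retry with waiting allowed */
	}
}
```

The point of the model is the ordering: a cheap non-waiting attempt first, the locked reserve second, and the full flags only once both have failed. All accesses to curr_nr happen under the lock, so no barrier or racy peek at pool fields is needed before taking it.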