From linux-fsdevel-owner@vger.kernel.org Fri May 13 10:04:18 2011
From: Mel Gorman
To: Andrew Morton
Cc: James Bottomley, Colin King, Raghavendra D Prabhu, Jan Kara,
	Chris Mason, Christoph Lameter, Pekka Enberg, Rik van Riel,
	Johannes Weiner, linux-fsdevel, linux-mm, linux-kernel, linux-ext4,
	Mel Gorman
Subject: [PATCH 3/4] mm: slub: Do not take expensive steps for SLUBs speculative high-order allocations
Date: Fri, 13 May 2011 15:03:23 +0100
Message-Id: <1305295404-12129-4-git-send-email-mgorman@suse.de>
X-Mailing-List: linux-fsdevel@vger.kernel.org

To avoid locking and per-cpu overhead, SLUB optimistically uses
high-order allocations and falls back to lower-order allocations if
they fail. However, by simply trying to allocate, the caller can enter
compaction or reclaim - both of which are likely to cost more than the
benefit of using high-order pages in SLUB. On a desktop system, two
users report that the system is getting stalled with kswapd using large
amounts of CPU.

This patch prevents SLUB from taking any expensive steps when trying to
use high-order allocations. Instead, it is expected to fall back to
smaller orders more aggressively.

Testing was somewhat inconclusive on how much this helped, but it makes
sense that falling back to order-0 allocations is faster than entering
compaction or direct reclaim.

Signed-off-by: Mel Gorman
---
 mm/page_alloc.c |    3 ++-
 mm/slub.c       |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9f8a97b..057f1e2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1972,6 +1972,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 {
 	int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
 	const gfp_t wait = gfp_mask & __GFP_WAIT;
+	const gfp_t can_wake_kswapd = !(gfp_mask & __GFP_NO_KSWAPD);
 
 	/* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
 	BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
@@ -1984,7 +1985,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 	 */
 	alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
 
-	if (!wait) {
+	if (!wait && can_wake_kswapd) {
 		/*
 		 * Not worth trying to allocate harder for
 		 * __GFP_NOMEMALLOC even if it can't schedule.
diff --git a/mm/slub.c b/mm/slub.c
index 98c358d..c5797ab 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1170,7 +1170,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 	 * Let the initial higher-order allocation fail under memory pressure
 	 * so we fall-back to the minimum order allocation.
 	 */
-	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY | __GFP_NO_KSWAPD) & ~__GFP_NOFAIL;
+	alloc_gfp = (flags | __GFP_NOWARN | __GFP_NO_KSWAPD) &
+			~(__GFP_NOFAIL | __GFP_WAIT | __GFP_REPEAT);
 
 	page = alloc_slab_page(alloc_gfp, node, oo);
 	if (unlikely(!page)) {
-- 
1.7.3.4
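
For readers who want to see the effect of the two hunks outside the
kernel tree, here is a minimal user-space sketch. It is not kernel code:
the SK_GFP_* bit values, the gfp_sketch_t type, and the helper names
(speculative_gfp, tries_harder, alloc_order_with_fallback, only_order0)
are all made up for illustration, and the real __GFP_* values differ.
It only models the mask arithmetic from the slub.c hunk, the changed
condition from the page_alloc.c hunk, and the "try the preferred order
cheaply, then fall back to the minimum order" pattern described in the
changelog.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real __GFP_* values differ. */
typedef unsigned int gfp_sketch_t;
#define SK_GFP_WAIT      0x01u	/* caller may sleep and enter reclaim */
#define SK_GFP_REPEAT    0x02u
#define SK_GFP_NOFAIL    0x04u
#define SK_GFP_NOWARN    0x08u
#define SK_GFP_NO_KSWAPD 0x10u

/* Mask for the speculative high-order attempt, modelled on the slub.c hunk. */
static gfp_sketch_t speculative_gfp(gfp_sketch_t flags)
{
	return (flags | SK_GFP_NOWARN | SK_GFP_NO_KSWAPD) &
	       ~(SK_GFP_NOFAIL | SK_GFP_WAIT | SK_GFP_REPEAT);
}

/* Modelled on the page_alloc.c hunk: only "try harder" for a non-waiting
 * allocation if the caller also allows kswapd to be woken. */
static bool tries_harder(gfp_sketch_t gfp_mask)
{
	const bool wait = (gfp_mask & SK_GFP_WAIT) != 0;
	const bool can_wake_kswapd = !(gfp_mask & SK_GFP_NO_KSWAPD);

	return !wait && can_wake_kswapd;
}

/* Hypothetical stand-in for the page allocator. */
typedef bool (*try_alloc_fn)(gfp_sketch_t gfp, int order);

/* Example stub: behaves as if only order-0 pages are available. */
static bool only_order0(gfp_sketch_t gfp, int order)
{
	(void)gfp;
	return order == 0;
}

/* Try the preferred order with the cheap mask, then the minimum order with
 * the caller's original flags. Returns the order used, or -1 on failure. */
static int alloc_order_with_fallback(try_alloc_fn try_alloc,
				     gfp_sketch_t flags,
				     int preferred, int minimum)
{
	assert(minimum <= preferred);

	if (try_alloc(speculative_gfp(flags), preferred))
		return preferred;
	if (try_alloc(flags, minimum))
		return minimum;
	return -1;
}
```

With the stub above, a request such as
alloc_order_with_fallback(only_order0, SK_GFP_WAIT, 3, 0) fails the
order-3 attempt without ever carrying the "may wait" bit, and only the
order-0 retry is made with the caller's original flags.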