    percpu: teach large page allocator about NUMA · a530b795
    Tejun Heo authored
    The large page first chunk allocator is primarily used on NUMA machines;
    however, its NUMA handling is extremely simplistic.  Regardless of
    proximity, each cpu is put into a separate large page, most of which
    is then returned.  This wastes a large amount of vmalloc space and
    increases the cache footprint.
    
    This patch teaches the large page allocator about NUMA.  Given
    processor proximity information, pcpu_lpage_build_unit_map() finds a
    fitting cpu -> unit mapping in which cpus within LOCAL_DISTANCE of
    each other share the same large page and not too much virtual address
    space is wasted.
    
    This greatly reduces the unit size, and thus the chunk size, and wastes
    much less address space for the first chunk.  For example, on a 4/4 NUMA
    machine, the original code occupied 16MB of virtual space for the first
    chunk while the new code uses only 4MB: one 2MB page for each node.
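
    The grouping idea and the 16MB vs 4MB figures can be illustrated with
    a small user-space sketch.  This is not the kernel implementation:
    build_group_map(), two_node_distance() and the size helpers are
    hypothetical names, the topology (8 cpus, two nodes of 4 cpus each) is
    an assumed reading of "4/4" that matches the numbers above, and the
    sketch ignores the unit-size fitting and wasted-space checks that
    pcpu_lpage_build_unit_map() performs.

```c
#include <assert.h>

#define LOCAL_DISTANCE  10  /* distance value meaning "same node" */
#define REMOTE_DISTANCE 20  /* assumed value for cpus on different nodes */
#define MAX_CPUS        64  /* arbitrary limit for this sketch */
#define LPAGE_SIZE_MB   2   /* 2MB large page, as in the example above */

/* Hypothetical callback returning the proximity between two cpus. */
typedef int (*cpu_distance_fn)(unsigned int from, unsigned int to);

/*
 * Give each cpu a group id so that cpus within LOCAL_DISTANCE of an
 * already placed cpu (in both directions) join that cpu's group.
 * Each group stands for one shared large page.  Returns the number
 * of groups.
 */
static int build_group_map(unsigned int nr_cpus, cpu_distance_fn dist,
                           int *group_map)
{
    int nr_groups = 0;

    assert(nr_cpus <= MAX_CPUS);

    for (unsigned int cpu = 0; cpu < nr_cpus; cpu++) {
        int group = -1;

        for (unsigned int prev = 0; prev < cpu; prev++) {
            if (dist(cpu, prev) == LOCAL_DISTANCE &&
                dist(prev, cpu) == LOCAL_DISTANCE) {
                group = group_map[prev];
                break;
            }
        }
        if (group < 0)
            group = nr_groups++;    /* no local neighbour: new group */
        group_map[cpu] = group;
    }
    return nr_groups;
}

/* Assumed topology: cpus 0-3 on one node, cpus 4-7 on the other. */
static int two_node_distance(unsigned int from, unsigned int to)
{
    return (from / 4 == to / 4) ? LOCAL_DISTANCE : REMOTE_DISTANCE;
}

/* Old scheme: one large page per cpu. */
static int first_chunk_mb_per_cpu(unsigned int nr_cpus)
{
    return (int)nr_cpus * LPAGE_SIZE_MB;
}

/* New scheme in this sketch: one large page per group of local cpus. */
static int first_chunk_mb_per_group(int nr_groups)
{
    return nr_groups * LPAGE_SIZE_MB;
}
```

    Calling build_group_map(8, two_node_distance, map) on the assumed
    topology yields two groups, so the per-cpu layout costs 8 * 2MB while
    the grouped layout costs 2 * 2MB.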
    
    [ Impact: much better space efficiency on NUMA machines ]
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Jan Beulich <JBeulich@novell.com>
    Cc: Andi Kleen <andi@firstfloor.org>
    Cc: David Miller <davem@davemloft.net>
percpu.c 58.9 KB