1. 13 Aug, 2009 1 commit
  2. 23 Jul, 2009 1 commit
  3. 13 Aug, 2009 1 commit
  4. 23 Jul, 2009 1 commit
  5. 05 Jun, 2009 1 commit
  6. 13 May, 2009 1 commit
  7. 02 Apr, 2009 1 commit
  8. 13 Aug, 2009 1 commit
  9. 23 Jul, 2009 1 commit
  10. 13 Jul, 2009 1 commit
  11. 13 Aug, 2009 1 commit
  12. 12 Aug, 2009 1 commit
  13. 31 Jul, 2009 1 commit
  14. 23 Jul, 2009 1 commit
  15. 13 Aug, 2009 1 commit
  16. 04 Aug, 2009 1 commit
  17. 23 Jul, 2009 1 commit
  18. 13 Aug, 2009 1 commit
  19. 14 Feb, 2009 2 commits
  20. 21 Aug, 2009 1 commit
  21. 20 Aug, 2009 5 commits
    • Jan Beulich authored
      This is being done by allowing boot time allocations to specify that they
      may want a sub-page sized amount of memory.
      
      Overall this seems more consistent with the other hash table allocations,
      and allows making two supposedly mm-only variables really mm-only
      (nr_{kernel,all}_pages).
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      d8b838de
    • Jan Beulich authored
      Since alloc_bootmem() will never return inaccessible (via virtual
      addressing) memory anyway, using the ..._low() variant only makes sense
      when the physical address range of the allocated memory must fulfill
      further constraints, especially since on 64-bit (or more generally in all
      cases where the pools the two variants allocate from are than the full
      available range.
      
      Probably the use in alloc_tce_table() could also be eliminated (based on
      code inspection of pci-calgary_64.c), but that seems too risky given I
      know nothing about that hardware and have no way to test it.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      1326d566
    • Jan Beulich authored
      Sizing of memory allocations shouldn't depend on the number of physical
      pages found in a system, as that generally includes (perhaps a huge amount
      of) non-RAM pages.  The amount of what actually is usable as storage
      should instead be used as a basis here.
      
      Some of the calculations (i.e.  those not intending to use high memory)
      should likely even use (totalram_pages - totalhigh_pages).
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Dave Airlie <airlied@linux.ie>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      8f38cb9c
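
The effect of choosing a different sizing basis can be illustrated with a small userspace model. This is only a sketch, not kernel code: the function names, the `pages_per_entry` ratio and the page counts are hypothetical, chosen to show how counting non-RAM pages inflates a table that is sized from a page count.

```python
def sizing_basis(totalram_pages, totalhigh_pages, uses_highmem):
    """Pick the page count a table should be sized from.

    Allocations that cannot use high memory are sized from
    (totalram_pages - totalhigh_pages), as the commit message suggests.
    """
    if uses_highmem:
        return totalram_pages
    return totalram_pages - totalhigh_pages


def table_entries(pages, pages_per_entry=16):
    """Hypothetical sizing rule: one entry per `pages_per_entry` pages,
    rounded down to a power of two."""
    n = max(1, pages // pages_per_entry)
    return 1 << (n.bit_length() - 1)


# Assumed example machine: the physical address span covers 1 << 20 pages,
# but only 786432 of them are actually usable RAM.
span_pages = 1 << 20
ram_pages = 786432

oversized = table_entries(span_pages)   # sized from the physical span
right_sized = table_entries(ram_pages)  # sized from usable RAM
```

With these assumed numbers, sizing from the physical span yields a table twice as large as sizing from usable RAM.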
    • Jan Beulich authored
      Sizing of memory allocations shouldn't depend on the number of physical
      pages found in a system, as that generally includes (perhaps a huge amount
      of) non-RAM pages.  The amount of what actually is usable as storage
      should instead be used as a basis here.
      
      In line with that, the memory hotplug code should update num_physpages in
      a way that it retains its original (post-boot) meaning; in particular,
      decreasing the value should at best be done with great care - this patch
      doesn't try to ever decrease this value at all as it doesn't really seem
      meaningful to do so.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      e5fca762
    • Mel Gorman authored
      After anti-fragmentation was merged, a bug was reported whereby devices
      that depended on high-order atomic allocations were failing.  The solution
      was to preserve a property in the buddy allocator which tended to keep the
      minimum number of free pages in the zone at the lower physical addresses
      and contiguous.  To preserve this property, MIGRATE_RESERVE was introduced
      and a number of pageblocks at the start of a zone would be marked
      "reserve", the number of which depended on min_free_kbytes.
      
      Anti-fragmentation works by avoiding the mixing of page migratetypes
      within the same pageblock.  One way of helping this is to increase
      min_free_kbytes because it becomes less likely that it will be necessary to
      place pages of one migratetype in a pageblock of another.  However, because
      the number of MIGRATE_RESERVE blocks is unbounded, the free memory is kept
      there in large contiguous blocks instead of helping anti-fragmentation as
      much as it should.  With the page-allocator tracepoint patches applied, it
      was found during anti-fragmentation tests that the number of
      fragmentation-related events was far higher than expected even with
      min_free_kbytes at higher values.
      
      This patch limits the number of MIGRATE_RESERVE blocks that exist per zone
      to two.  For example, with a sufficient min_free_kbytes, 4MB of memory
      will be kept aside on an x86-64 and remain more or less free and
      contiguous for the system's uptime.  This should be sufficient for devices
      depending on high-order atomic allocations while helping fragmentation
      control when min_free_kbytes is tuned appropriately.  A side-effect of
      this patch is that the reserve variable is converted to int as unsigned
      long was the wrong type to use when ensuring that only the required number
      of reserve blocks are created.
      
      With the patches applied, fragmentation-related events as measured by the
      page allocator tracepoints were significantly reduced when running some
      fragmentation stress-tests on systems with min_free_kbytes tuned to a
      value appropriate for hugepage allocations at runtime.  On x86, the events
      recorded were reduced by 99.8%, on x86-64 by 99.72% and on ppc64 by
      99.83%.
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      f0ef4d63
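
The cap described above can be modelled in a few lines. This is a minimal userspace sketch, not the kernel implementation: the function names are hypothetical, and the 4 KiB page size and pageblock order of 9 (2 MiB pageblocks) are assumptions chosen to match the 4MB x86-64 figure quoted in the commit message.

```python
PAGE_SIZE = 4096      # assumed: 4 KiB pages
PAGEBLOCK_ORDER = 9   # assumed: 512 pages, i.e. 2 MiB per pageblock


def reserve_pageblocks(min_wmark_pages, pageblock_order=PAGEBLOCK_ORDER,
                       max_reserve=2):
    """Number of pageblocks to mark MIGRATE_RESERVE in one zone."""
    pageblock_nr_pages = 1 << pageblock_order
    # Round the zone's min watermark up to a whole number of pageblocks.
    blocks = (min_wmark_pages + pageblock_nr_pages - 1) // pageblock_nr_pages
    # The limit from the commit: never more than two reserve blocks per zone.
    return min(max_reserve, blocks)


def reserved_bytes(min_wmark_pages):
    """Memory kept aside in reserve blocks for the given min watermark."""
    block_bytes = PAGE_SIZE << PAGEBLOCK_ORDER
    return reserve_pageblocks(min_wmark_pages) * block_bytes
```

Under these assumptions, any min watermark above one pageblock's worth of pages reserves exactly two blocks, so raising min_free_kbytes further no longer grows the reserve.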
  22. 13 Aug, 2009 5 commits
  23. 11 Aug, 2009 2 commits
  24. 12 Aug, 2009 2 commits
  25. 18 Aug, 2009 1 commit
  26. 12 Aug, 2009 2 commits
    • Daisuke Nishimura authored
      After commit 355cfa73 ("mm: modify swap_map and add SWAP_HAS_CACHE flag"),
      only the context which has set the SWAP_HAS_CACHE flag by swapcache_prepare()
      or get_swap_page() will call add_to_swap_cache().  So add_to_swap_cache()
      doesn't return -EEXIST any more.
      
      Even though it doesn't return -EEXIST, it's not good behavior conceptually
      to call swapcache_prepare() in the -EEXIST case, because it means clearing
      SWAP_HAS_CACHE flag while the entry is on swap cache.
      
      This patch removes redundant code and comments from its callers, and
      adds VM_BUG_ON() in the error path of add_to_swap_cache() along with some
      comments.
      Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      9ead3bef
    • Daisuke Nishimura authored
      After commit 355cfa73 ("mm: modify swap_map and add SWAP_HAS_CACHE flag"),
      read_swap_cache_async() will busy-wait while an entry doesn't exist in swap
      cache but it has SWAP_HAS_CACHE flag.
      
      Such entries can exist on add/delete path of swap cache.  On add path,
      add_to_swap_cache() is called soon after SWAP_HAS_CACHE flag is set, and
      on delete path, swapcache_free() will be called (SWAP_HAS_CACHE flag is
      cleared) soon after __delete_from_swap_cache() is called.  So, the
      busy-wait works well in most cases.
      
      But this mechanism can cause soft lockup if add_to_swap_cache() sleeps and
      read_swap_cache_async() tries to swap-in the same entry on the same cpu.
      
      This patch calls radix_tree_preload() before swapcache_prepare() and
      divides add_to_swap_cache() into two parts: a radix_tree_preload() part and
      a radix_tree_insert() part (defined as __add_to_swap_cache()).
      Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      676b09ba
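
The reordering can be pictured with a toy model of the add path. This is a sketch under stated assumptions rather than kernel code: the step names are just labels borrowed from the commit message, `sleeping_calls_in_window` is a hypothetical helper, and the only thing modelled is whether a possibly sleeping call happens while the entry has SWAP_HAS_CACHE set but is not yet in the swap cache, which is the window in which read_swap_cache_async() busy-waits.

```python
def sleeping_calls_in_window(steps):
    """Return the possibly sleeping steps that run while the entry is
    flagged (SWAP_HAS_CACHE set) but not yet inserted in the swap cache."""
    flagged = False
    inserted = False
    risky = []
    for step in steps:
        if step == "swapcache_prepare":
            flagged = True    # SWAP_HAS_CACHE is set here
        elif step == "radix_tree_insert":
            inserted = True   # the entry becomes visible in the swap cache
        elif step == "radix_tree_preload" and flagged and not inserted:
            risky.append(step)  # the preload allocation may sleep here
    return risky


# Ordering before the patch: the preload happens inside the window.
OLD_ORDER = ["swapcache_prepare", "radix_tree_preload", "radix_tree_insert"]
# Ordering after the patch: the preload is done before the flag is set.
NEW_ORDER = ["radix_tree_preload", "swapcache_prepare", "radix_tree_insert"]

old_risk = sleeping_calls_in_window(OLD_ORDER)
new_risk = sleeping_calls_in_window(NEW_ORDER)
```

In this model the old ordering reports the preload as a sleeping call inside the busy-wait window, while the new ordering reports none.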
  27. 11 Aug, 2009 2 commits