  1. 12 Oct, 2009 18 commits
  2. 07 Oct, 2009 2 commits
  3. 05 Oct, 2009 20 commits
    • Linux 2.6.31.2 · a2822cac
      Greg Kroah-Hartman authored
    • iwlwifi: fix unloading driver while scanning · 6bdeaf6f
      Wey-Yi Guy authored
      This is commit 5bddf549 in linux-2.6.
      
      If NetworkManager is busy scanning when the user
      tries to unload the module, the driver cannot be unloaded
      because the hardware is still scanning.
      
      Make sure the driver sends the abort-scan host command to the uCode if it
      is in the middle of scanning during driver unload.
      Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
      Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • iwlwifi: traverse linklist to find the valid OTP block · b23b04f4
      Wey-Yi Guy authored
      commit 415e4993 upstream.
      
      For devices using OTP memory, the EEPROM image can start from
      any one of the OTP blocks. If shadow RAM is disabled, we need to
      traverse the linked list to find the last valid block, then start reading
      the EEPROM image.
      
      If OTP is not full, the valid block is the block _before_ the last block
      on the linked list; the last block on the linked list is the empty block
      ready for the next OTP refresh/update.
      
      If OTP is full, then the last block is the valid block to be used to
      configure the device.
      Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
      Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
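The block-selection rule in the OTP commit above can be sketched as follows. This is a simplified model and not the driver's code: the function name, the `link[]` array (block numbers starting at 1, with a link value of 0 ending the list) and the `max_blocks` parameter are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define OTP_END 0 /* assumed: a link value of 0 terminates the list */

/*
 * Hypothetical model: link[i] is the number of the block that follows
 * block i, or OTP_END. Blocks are numbered 1..max_blocks.
 * Returns the block holding the valid EEPROM image, or -1 when no valid
 * block exists or the list is corrupt.
 */
int otp_find_valid_block(const unsigned short *link, unsigned short first,
                         unsigned short max_blocks)
{
    unsigned short prev = OTP_END;
    unsigned short cur = first;
    unsigned short used = 1;

    assert(link != NULL);
    if (cur == OTP_END || cur > max_blocks)
        return -1;

    while (link[cur] != OTP_END) {
        /* An out-of-range link or too many hops means a corrupt list. */
        if (link[cur] > max_blocks || used == max_blocks)
            return -1;
        prev = cur;
        cur = link[cur];
        used++;
    }

    if (used == max_blocks)
        return cur;                     /* OTP full: last block is valid */
    return prev == OTP_END ? -1 : prev; /* not full: block before the last */
}
```

With four blocks and a chain 1, 2, 3, the sketch selects block 2, because block 3 is the empty block waiting for the next update; with a chain 1, 2, 3, 4 it selects block 4.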
    • iwlagn: modify digital SVR for 1000 · a6a74787
      Wey-Yi Guy authored
      commit 02c06e4a upstream.
      
      On 1000, there are two Switching Voltage Regulators (SVR). The first one
      applies the digital voltage level (1.32V) for the PCIe block and core. We
      need to use this regulator to solve a stability issue related to a noisy
      DC2DC line in the silicon.
      Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
      Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • iwlwifi: update 1000 series API version to match firmware · ec45b814
      Jay Sternberg authored
      commit cce53aa3 upstream.
      
      The firmware file now contains a build number, so the API version needs to be updated.
      Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com>
      Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • iwlwifi: Handle new firmware file with ucode build number in header · 7b70149f
      Jay Sternberg authored
      commit cc0f555d upstream.
      
      Adding new API version to account for change to ucode file format.  New
      header includes the build number of the ucode.  This build number is the
      SVN revision thus allowing for exact correlation to the code that
      generated it.
      
      The header adds the build number so that older ucode images can also be
      enhanced to include the build in the future.
      
      Some cleanup in iwl_read_ucode was needed to ensure the old header is not
      used and to reduce unnecessary references through the pointer when the data
      is already in a heap variable.
      Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com>
      Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
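A version-dependent header like the one described in the ucode commit above could be parsed roughly as below. This is a sketch under stated assumptions, not the driver's parser: the field order (version word, optional build word, five size words), the position of the API number in bits 8..15 of the version word, and the threshold `HYP_API_WITH_BUILD` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HYP_API_WITH_BUILD 3u /* assumption: first API version carrying a build word */

struct ucode_info {
    uint32_t ver;       /* raw version word */
    uint32_t build;     /* build number, 0 when the header has none */
    size_t header_size; /* bytes to skip before the image data */
};

/* Read a little-endian 32-bit word without alignment requirements. */
static uint32_t le32_at(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short for its header. */
int parse_ucode_header(const uint8_t *buf, size_t len, struct ucode_info *out)
{
    unsigned int api;

    assert(buf != NULL && out != NULL);
    if (len < 4)
        return -1;

    out->ver = le32_at(buf);
    api = (out->ver >> 8) & 0xffu; /* assumed location of the API number */

    /* Old layout: ver + 5 sizes. New layout: ver + build + 5 sizes. */
    out->header_size = (api >= HYP_API_WITH_BUILD ? 7 : 6) * 4;
    if (len < out->header_size)
        return -1;

    out->build = api >= HYP_API_WITH_BUILD ? le32_at(buf + 4) : 0;
    return 0;
}
```

Choosing the layout from the version word before touching any other field is what keeps an old-format image from being read with the new offsets.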
    • NOMMU: Fix MAP_PRIVATE mmap() of objects where the data can be mapped directly · 6b92d44b
      David Howells authored
      commit 645d83c5 upstream.
      
      Fix MAP_PRIVATE mmap() of files and devices where the data in the backing store
      might be mapped directly.  Use the BDI_CAP_MAP_DIRECT capability flag to govern
      whether or not we should be trying to map a file directly.  This can be used to
      determine whether or not a region has been filled in at the point where we call
      do_mmap_shared() or do_mmap_private().
      
      The BDI_CAP_MAP_DIRECT capability flag is cleared by validate_mmap_request() if
      there's any reason we can't use it.  It's also cleared in do_mmap_pgoff() if
      f_op->get_unmapped_area() fails.
      
      Without this fix, attempting to run a program from a RomFS image on a
      non-mappable MTD partition results in a BUG as the kernel attempts XIP, and
      this can be caught in gdb:
      
      Program received signal SIGABRT, Aborted.
      0xc005dce8 in add_nommu_region (region=<value optimized out>) at mm/nommu.c:547
      (gdb) bt
      #0  0xc005dce8 in add_nommu_region (region=<value optimized out>) at mm/nommu.c:547
      #1  0xc005f168 in do_mmap_pgoff (file=0xc31a6620, addr=<value optimized out>, len=3808, prot=3, flags=6146, pgoff=0) at mm/nommu.c:1373
      #2  0xc00a96b8 in elf_fdpic_map_file (params=0xc33fbbec, file=0xc31a6620, mm=0xc31bef60, what=0xc0213144 "executable") at mm.h:1145
      #3  0xc00aa8b4 in load_elf_fdpic_binary (bprm=0xc316cb00, regs=<value optimized out>) at fs/binfmt_elf_fdpic.c:343
      #4  0xc006b588 in search_binary_handler (bprm=0x6, regs=0xc33fbce0) at fs/exec.c:1234
      #5  0xc006c648 in do_execve (filename=<value optimized out>, argv=0xc3ad14cc, envp=0xc3ad1460, regs=0xc33fbce0) at fs/exec.c:1356
      #6  0xc0008cf0 in sys_execve (name=<value optimized out>, argv=0xc3ad14cc, envp=0xc3ad1460) at arch/frv/kernel/process.c:263
      #7  0xc00075dc in __syscall_call () at arch/frv/kernel/entry.S:897
      
      Note that this fix does the following commit differently:
      
      	commit a190887b
      	Author: David Howells <dhowells@redhat.com>
      	Date:   Sat Sep 5 11:17:07 2009 -0700
      	nommu: fix error handling in do_mmap_pgoff()
      Reported-by: Graff Yang <graff.yang@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • mptsas : PAE Kernel more than 4 GB kernel panic · 6177445f
      Kashyap, Desai authored
      commit c55b89fb upstream.
      
      This patch solves a problem with DMA operations on PAE kernels.
      On a PAE system, dma_addr_t and unsigned long have different
      sizes, so a DMA address cast through unsigned long can change value.
      dma_addr is no longer cast to unsigned long.
      Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      Cc: Jan Beulich <JBeulich@novell.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
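The truncation behind the mptsas fix above can be reproduced in user space. The sketch assumes a 64-bit DMA address type and a 32-bit `unsigned long`, as on a 32-bit PAE kernel; both typedef names are stand-ins invented for this example.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t dma_addr_model_t; /* stand-in for a 64-bit dma_addr_t */
typedef uint32_t ulong_model_t;    /* stand-in for a 32-bit unsigned long */

static_assert(sizeof(dma_addr_model_t) > sizeof(ulong_model_t),
              "the model only makes sense when the cast narrows");

/* Round-trip an address through the narrower type, as the buggy cast did. */
dma_addr_model_t through_ulong(dma_addr_model_t addr)
{
    return (dma_addr_model_t)(ulong_model_t)addr;
}
```

An address below 4 GB survives the round trip, while `0x100001000` comes back as `0x1000`, which is why the problem only appears on machines with more than 4 GB of memory.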
    • HID: completely remove apple mightymouse from blacklist · 1dbd009c
      Jan Scholz authored
      commit 42960a13 upstream.
      
      Commit fa047e4f "HID: fix inverted
      wheel for bluetooth version of apple mighty mouse" is incomplete. If
      we remove Apple MightyMouse (bluetooth version) from the list of
      apple_devices in drivers/hid/hid-apple.c we have to remove it from
      hid_blacklist in drivers/hid/hid-core.c as well.
      Signed-off-by: Jan Scholz <Scholz@fias.uni-frankfurt.de>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • powerpc: Fix incorrect setting of __HAVE_ARCH_PTE_SPECIAL · c1aa9d2b
      Weirich, Bernhard authored
      [I'm going to fix upstream differently, by having all CPU types
      actually support _PAGE_SPECIAL, but I prefer the simple and obvious
      fix for -stable. -- Ben]
      
      The test that decides whether to define __HAVE_ARCH_PTE_SPECIAL on
      powerpc is bogus and will end up always defining it, even when
      _PAGE_SPECIAL is not supported (in which case it's 0) such as on
      8xx or 40x processors.
      Signed-off-by: Bernhard Weirich <bernhard.weirich@riedel.net>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
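One way a preprocessor test can end up "always defining" a feature macro, as the powerpc commit above describes, is to test whether a flag is defined rather than whether it is non-zero. The sketch below illustrates that general pitfall only; it assumes the bogus test had this shape, which the commit message does not state, and every macro name in it is hypothetical.

```c
#include <assert.h>

/* Hypothetical CPU type where the flag exists but is 0 (unsupported). */
#define HYP_PAGE_SPECIAL 0

/* Bogus test: a macro defined as 0 is still "defined". */
#ifdef HYP_PAGE_SPECIAL
#define BOGUS_HAVE_PTE_SPECIAL 1
#else
#define BOGUS_HAVE_PTE_SPECIAL 0
#endif

/* Value test: only a non-zero flag enables the feature. */
#if HYP_PAGE_SPECIAL != 0
#define FIXED_HAVE_PTE_SPECIAL 1
#else
#define FIXED_HAVE_PTE_SPECIAL 0
#endif

static_assert(BOGUS_HAVE_PTE_SPECIAL == 1, "defined-ness test fires even for 0");
static_assert(FIXED_HAVE_PTE_SPECIAL == 0, "value test respects the 0 flag");
```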
    • powerpc/8xx: Fix regression introduced by cache coherency rewrite · 50c744f9
      Rex Feany authored
      commit e0908085 upstream.
      
      After upgrading to the latest kernel on my mpc875 userspace started
      running incredibly slow (hours to get to a shell, even!).
      I tracked it down to commit 8d30c14c,
      which removed a work-around for the 8xx. Adding it
      back makes my problem go away.
      Signed-off-by: Rex Feany <rfeany@mrv.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • saa7134: ir-kbd-i2c init data needs a persistent object · f2594aa2
      Brian Rogers authored
      commit 7aedd5ec upstream.
      
      Tested on MSI TV@nywhere Plus.
      
      Original commit message:
      
      ir-kbd-i2c's ir_probe() function can be called much later (i.e. at
      ir-kbd-i2c module load), than the lifetime of a struct IR_i2c_init_data
      allocated off of the stack in cx18_i2c_new_ir() at registration time.
      Make sure we pass a pointer to a persistent IR_i2c_init_data object at
      i2c registration time.
      
      Thanks to Brian Rogers, Dustin Mitchell, Andy Walls and Jean Delvare for
      raising this question.
      
      Before this patch, if ir-kbd-i2c were probed after SAA7134, trash data
      were used.
      
      Compile tested only, but the patch is identical to em28xx one. So, it
      should work properly.
      Original-patch-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      [brian@xyzw.org: backported for 2.6.31]
      Signed-off-by: Brian Rogers <brian@xyzw.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
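The lifetime rule from the ir-kbd-i2c commit above, that init data handed over at registration must outlive the registering function, can be sketched like this. All type and function names are hypothetical models rather than the real i2c or driver API; the point is only that the init data lives inside the long-lived device object instead of on the stack.

```c
#include <assert.h>
#include <stddef.h>

struct ir_init_data_model {
    const char *name;
    int type;
};

/* Model of the registration record that a later probe reads. */
struct board_info_model {
    const void *platform_data;
};

struct hyp_dev {
    struct ir_init_data_model init_data; /* persistent: lives as long as the device */
    struct board_info_model info;
};

/* Registration time: point the record at storage inside the device. */
void hyp_register_ir(struct hyp_dev *dev, const char *name)
{
    assert(dev != NULL);
    dev->init_data.name = name;
    dev->init_data.type = 1;
    dev->info.platform_data = &dev->init_data;
}

/* May run much later, for example when the IR module is loaded. */
const char *hyp_late_probe(const struct board_info_model *info)
{
    const struct ir_init_data_model *d = info->platform_data;

    return d ? d->name : NULL;
}
```

Had `init_data` been a local variable of `hyp_register_ir()`, the pointer stored in `platform_data` would dangle by the time `hyp_late_probe()` ran, which matches the "trash data" symptom described above.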
    • em28xx: ir-kbd-i2c init data needs a persistent object · 45af991d
      Brian Rogers authored
      commit d2ebd0f8 upstream.
      
      Original commit message:
      
      ir-kbd-i2c's ir_probe() function can be called much later (i.e. at
      ir-kbd-i2c module load), than the lifetime of a struct IR_i2c_init_data
      allocated off of the stack in cx18_i2c_new_ir() at registration time.
      Make sure we pass a pointer to a persistent IR_i2c_init_data object at
      i2c registration time.
      
      Thanks to Brian Rogers, Dustin Mitchell, Andy Walls and Jean Delvare for
      raising this question.
      
      Before this patch, if ir-kbd-i2c were probed after em28xx, trash data
      were used. After the patch, no matter what order, it is properly
      reported as tested by me:
      
      input: i2c IR (i2c IR (EM2840 Hauppaug as /class/input/input10
      ir-kbd-i2c: i2c IR (i2c IR (EM2840 Hauppaug detected at i2c-4/4-0030/ir0 [em28xx #0]
      Original-patch-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      [brian@xyzw.org: backported for 2.6.31]
      Signed-off-by: Brian Rogers <brian@xyzw.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • drm/i915: Handle ERESTARTSYS during page fault · 1934acbe
      Chris Wilson authored
      commit c715089f upstream.
      
      During a page fault and rebinding the buffer there exists a window for a
      signal to arrive during the i915_wait_request() and trigger an
      ERESTARTSYS. This used to be handled by returning SIGBUS and thereby
      killing the application. Try 'cairo-perf-trace & cairo-test-suite' and
      watch X go boom!
      
      The solution as suggested by H. Peter Anvin is to simply return NOPAGE and
      leave the higher layers to spot we did not fill the page and resubmit
      the page fault.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      [anholt: Mostly squash it with another commit]
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • Fix idle time field in /proc/uptime · 03429ffa
      Michael Abbott authored
      commit 96830a57 upstream.
      
      Git commit 79741dd3 changes idle cputime accounting, but unfortunately
      the /proc/uptime file hasn't caught up.  Here the idle time calculation
      from /proc/stat is copied over.
      Signed-off-by: Michael Abbott <michael.abbott@diamond.ac.uk>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • mmap: avoid unnecessary anon_vma lock acquisition in vma_adjust() · 38a76cc6
      Lee Schermerhorn authored
      commit 252c5f94 upstream.
      
      We noticed very erratic behavior [throughput] with the AIM7 shared
      workload running on recent distro [SLES11] and mainline kernels on an
      8-socket, 32-core, 256GB x86_64 platform.  On the SLES11 kernel
      [2.6.27.19+] with Barcelona processors, as we increased the load [10s of
      thousands of tasks], the throughput would vary between two "plateaus"--one
      at ~65K jobs per minute and one at ~130K jpm.  The simple patch below
      causes the results to smooth out at the ~130k plateau.
      
      But wait, there's more:
      
      We do not see this behavior on smaller platforms--e.g., 4 socket/8 core.
      This could be the result of the larger number of cpus on the larger
      platform--a scalability issue--or it could be the result of the larger
      number of interconnect "hops" between some nodes in this platform and how
      the tasks for a given load end up distributed over the nodes' cpus and
      memories--a stochastic NUMA effect.
      
      The variability in the results is less pronounced [on the same platform]
      with Shanghai processors and with mainline kernels.  With 31-rc6 on
      Shanghai processors and 288 file systems on 288 fibre attached storage
      volumes, the curves [jpm vs load] are both quite flat with the patched
      kernel consistently producing ~3.9% better throughput [~80K jpm vs ~77K
      jpm] than the unpatched kernel.
      
      Profiling indicated that the "slow" runs were incurring high[er]
      contention on an anon_vma lock in vma_adjust(), apparently called from the
      sbrk() system call.
      
      The patch:
      
      A comment in mm/mmap.c:vma_adjust() suggests that we don't really need the
      anon_vma lock when we're only adjusting the end of a vma, as is the case
      for brk().  The comment questions whether it's worth while to optimize for
      this case.  Apparently, on the newer, larger x86_64 platforms, with
      interesting NUMA topologies, it is worth while--especially considering
      that the patch [if correct!] is quite simple.
      
      We can detect this condition--no overlap with next vma--by noting a NULL
      "importer".  The anon_vma pointer will also be NULL in this case, so
      simply avoid loading vma->anon_vma to avoid the lock.
      
      However, we DO need to take the anon_vma lock when we're inserting a vma
      ['insert' non-NULL] even when we have no overlap [NULL "importer"], so we
      need to check for 'insert', as well.  And Hugh points out that we should
      also take it when adjusting vm_start (so that rmap.c can rely upon
      vma_address() while it holds the anon_vma lock).
      
      akpm: Zhang Yanmin reports a 150% throughput improvement with aim7, so it
      might be -stable material even though this isn't a regression: "this
      issue is not clear on dual socket Nehalem machine (2*4*2 cpu), but is
      severe on large machine (4*8*2 cpu)"
      
      [hugh.dickins@tiscali.co.uk: test vma start too]
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Eric Whitney <eric.whitney@hp.com>
      Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
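The locking condition described in the vma_adjust() commit above reduces to a small predicate. The sketch below is a user-space model with a made-up struct and function name; it encodes only the three cases the message lists (an importer, an inserted vma, or a changed start address) and is not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct vma_model {
    unsigned long vm_start;
    unsigned long vm_end;
    void *anon_vma; /* NULL when the vma has no anon_vma */
};

/*
 * Returns true when the anon_vma lock is needed for this adjustment.
 * Moving only vm_end (the brk() case) needs no lock.
 */
bool needs_anon_vma_lock(const struct vma_model *vma, unsigned long new_start,
                         const void *insert, const void *importer)
{
    assert(vma != NULL);
    return vma->anon_vma != NULL &&
           (insert != NULL || importer != NULL || new_start != vma->vm_start);
}
```

For a heap vma that only grows at its end, all three conditions are false, so the contended lock is skipped on the path that sbrk() exercises.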
    • mm: fix anonymous dirtying · 0a0611ad
      Hugh Dickins authored
      commit 1ac0cb5d upstream.
      
      do_anonymous_page() has been wrong to dirty the pte regardless.
      If it's not going to mark the pte writable, then it won't help
      to mark it dirty here, and clogs up memory with pages which will
      need swap instead of being thrown away.  Especially wrong if no
      overcommit is chosen, and this vma is not yet VM_ACCOUNTed -
      we could exceed the limit and OOM despite no overcommit.
      Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Acked-by: Rik van Riel <riel@redhat.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
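The rule stated in the anonymous-dirtying commit above, to mark a new anonymous pte dirty only when it is also being made writable, can be modelled with two flag bits. The flag values and the function name are invented for this sketch and do not correspond to any architecture's pte layout.

```c
#include <assert.h>
#include <stdbool.h>

#define HYP_PTE_WRITE 0x1u
#define HYP_PTE_DIRTY 0x2u

static_assert((HYP_PTE_WRITE & HYP_PTE_DIRTY) == 0, "flags must be distinct bits");

/* Build the flags for a freshly faulted anonymous page. */
unsigned int make_anon_pte_flags(bool vma_writable)
{
    unsigned int flags = 0;

    /*
     * A read-only mapping stays clean, so the page can be discarded
     * under memory pressure instead of being written to swap.
     */
    if (vma_writable)
        flags |= HYP_PTE_WRITE | HYP_PTE_DIRTY;
    return flags;
}
```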
    • mm: munlock use follow_page · 45e32d9a
      Hugh Dickins authored
      commit 408e82b7 upstream.
      
      Hiroaki Wakabayashi points out that when mlock() has been interrupted
      by SIGKILL, the subsequent munlock() takes unnecessarily long because
      its use of __get_user_pages() insists on faulting in all the pages
      which mlock() never reached.
      
      It's worse than slowness if mlock() is terminated by Out Of Memory kill:
      the munlock_vma_pages_all() in exit_mmap() insists on faulting in all the
      pages which mlock() could not find memory for; so innocent bystanders are
      killed too, and perhaps the system hangs.
      
      __get_user_pages() does a lot that's silly for munlock(): so remove the
      munlock option from __mlock_vma_pages_range(), and use a simple loop of
      follow_page()s in munlock_vma_pages_range() instead; ignoring absent
      pages, and not marking present pages as accessed or dirty.
      
      (Change munlock() to only go so far as mlock() reached?  That does not
      work out, given the convention that mlock() claims complete success even
      when it has to give up early - in part so that an underlying file can be
      extended later, and those pages locked which earlier would give SIGBUS.)
      Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Acked-by: Rik van Riel <riel@redhat.com>
      Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Reviewed-by: Hiroaki Wakabayashi <primulaelatior@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • page-allocator: limit the number of MIGRATE_RESERVE pageblocks per zone · 8df6b14f
      Mel Gorman authored
      commit 78986a67 upstream.
      
      After anti-fragmentation was merged, a bug was reported whereby devices
      that depended on high-order atomic allocations were failing.  The solution
      was to preserve a property in the buddy allocator which tended to keep the
      minimum number of free pages in the zone at the lower physical addresses
      and contiguous.  To preserve this property, MIGRATE_RESERVE was introduced
      and a number of pageblocks at the start of a zone would be marked
      "reserve", the number of which depended on min_free_kbytes.
      
      Anti-fragmentation works by avoiding the mixing of page migratetypes
      within the same pageblock.  One way of helping this is to increase
      min_free_kbytes because it becomes less likely that it will be necessary to
      place pages of different migratetypes within the same pageblock.  However,
      because the number of MIGRATE_RESERVE pageblocks is unbounded, the free
      memory is kept there in large contiguous blocks instead of helping
      anti-fragmentation as
      much as it should.  With the page-allocator tracepoint patches applied, it
      was found during anti-fragmentation tests that the number of
      fragmentation-related events was far higher than expected even with
      min_free_kbytes at higher values.
      
      This patch limits the number of MIGRATE_RESERVE blocks that exist per zone
      to two.  For example, with a sufficient min_free_kbytes, 4MB of memory
      will be kept aside on an x86-64 and remain more or less free and
      contiguous for the system's uptime.  This should be sufficient for devices
      depending on high-order atomic allocations while helping fragmentation
      control when min_free_kbytes is tuned appropriately.  A side-effect of
      this patch is that the reserve variable is converted to int as unsigned
      long was the wrong type to use when ensuring that only the required number
      of reserve blocks are created.
      
      With the patches applied, fragmentation-related events as measured by the
      page allocator tracepoints were significantly reduced when running some
      fragmentation stress-tests on systems with min_free_kbytes tuned to a
      value appropriate for hugepage allocations at runtime.  On x86, the events
      recorded were reduced by 99.8%, on x86-64 by 99.72% and on ppc64 by
      99.83%.
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
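The cap described in the MIGRATE_RESERVE commit above amounts to rounding the zone's minimum free pages up to whole pageblocks and clamping the result to two. The helper below is a sketch with a hypothetical name and parameters; the real allocator derives its inputs from the zone watermark, which is not modelled here.

```c
#include <assert.h>

#define HYP_MAX_RESERVE_BLOCKS 2

/*
 * min_pages: pages that must stay free in the zone (assumed input).
 * pageblock_order: log2 of the pages per pageblock.
 * Returns the number of pageblocks to mark as reserve.
 */
int reserve_pageblocks(unsigned long min_pages, unsigned int pageblock_order)
{
    unsigned long block_pages;
    unsigned long blocks;

    assert(pageblock_order < sizeof(unsigned long) * 8);
    block_pages = 1UL << pageblock_order;

    /* Round up to whole pageblocks, then clamp to the limit. */
    blocks = (min_pages + block_pages - 1) >> pageblock_order;
    if (blocks > HYP_MAX_RESERVE_BLOCKS)
        blocks = HYP_MAX_RESERVE_BLOCKS;
    return (int)blocks;
}
```

Assuming 4 KB pages and a pageblock order of 9, a pageblock is 2 MB, so the two-block cap matches the 4 MB figure quoted for x86-64 above.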
    • hugetlb: restore interleaving of bootmem huge pages (2.6.31) · ee5cb1e6
      Lee Schermerhorn authored
      Not upstream as it is fixed differently in .32
      
      I noticed that alloc_bootmem_huge_page() will only advance to the next
      node on failure to allocate a huge page.  I asked about this on linux-mm
      and linux-numa, cc'ing the usual huge page suspects.  Mel Gorman
      responded:
      
      	I strongly suspect that the same node being used until allocation
      	failure instead of round-robin is an oversight and not deliberate
      	at all. It appears to be a side-effect of a fix made way back in
      	commit 63b4613c ["hugetlb: fix
      	hugepage allocation with memoryless nodes"]. Prior to that patch
      	it looked like allocations would always round-robin even when
      	allocation was successful.
      
      Andy Whitcroft countered that the existing behavior looked like Andi
      Kleen's original implementation and suggested that we ask him.  We did and
      Andi replied that his intention was to interleave the allocations.  So,
      ...
      
      This patch moves the advance of the hstate next node from which to
      allocate up before the test for success of the attempted allocation.  This
      will unconditionally advance the next node from which to alloc,
      interleaving successful allocations over the nodes with sufficient
      contiguous memory, and skipping over nodes that fail the huge page
      allocation attempt.
      
      Note that alloc_bootmem_huge_page() will only be called for huge pages of
      order > MAX_ORDER.
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Adam Litke <agl@us.ibm.com>
      Cc: Andy Whitcroft <apw@canonical.com>
      Cc: Eric Whitney <eric.whitney@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
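The interleaving behaviour restored by the hugetlb commit above can be sketched as a round-robin allocator that advances the next-node cursor before checking whether the attempt succeeded. The struct, the `free_pages[]` array and the function name are hypothetical; per-node free counts stand in for real bootmem allocation.

```c
#include <assert.h>
#include <stddef.h>

struct hstate_model {
    int next_nid; /* node to try on the next allocation */
};

/*
 * Try each node at most once, starting at h->next_nid.
 * Returns the node the page came from, or -1 if every node failed.
 */
int alloc_huge_page_interleaved(struct hstate_model *h, int *free_pages,
                                int nr_nodes)
{
    assert(h != NULL && free_pages != NULL && nr_nodes > 0);

    for (int tries = 0; tries < nr_nodes; tries++) {
        int nid = h->next_nid;

        /* Advance unconditionally, before testing for success. */
        h->next_nid = (nid + 1) % nr_nodes;

        if (free_pages[nid] > 0) {
            free_pages[nid]--;
            return nid;
        }
    }
    return -1;
}
```

With free counts of 2, 0 and 2 on three nodes, successive calls return nodes 0, 2, 0, 2 and then -1: successful allocations alternate across the nodes that have memory, and the empty node is skipped.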