1. 06 Oct, 2009 3 commits
  2. 05 Oct, 2009 37 commits
    • Linux 2.6.31.2 · a2822cac
      Greg Kroah-Hartman authored
    • iwlwifi: fix unloading driver while scanning · 6bdeaf6f
      Wey-Yi Guy authored
      This is commit 5bddf549 in linux-2.6.
      
      If NetworkManager is busy scanning when the user tries to unload the
      module, the driver cannot be unloaded because the hardware is still
      scanning.
      
      Make sure driver sends abort scan host command to uCode if it
      is in the middle of scanning during driver unload.
      Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
      Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • iwlwifi: traverse linklist to find the valid OTP block · b23b04f4
      Wey-Yi Guy authored
      commit 415e4993 upstream.
      
      For devices using OTP memory, the EEPROM image can start from
      any one of the OTP blocks. If shadow RAM is disabled, we need to
      traverse the linked list to find the last valid block, then start
      reading the EEPROM image.

      If OTP is not full, the valid block is the block _before_ the last block
      on the linked list; the last block on the linked list is the empty block
      ready for the next OTP refresh/update.

      If OTP is full, then the last block is the valid block to be used to
      configure the device.
      Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
      Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
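The block selection rule described in the OTP commit above can be sketched in plain C. This is an illustration only, not the driver's code: the link encoding (an array of block indices, with a zero link marking the empty tail block) and the name `otp_find_valid_block` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding: a zero link marks the empty block at the tail. */
#define OTP_LINK_EMPTY 0u

/*
 * links[i] is the index of the block that follows block i.
 * Returns the index of the block holding the valid EEPROM image,
 * or -1 if nothing has been programmed or a link is out of range.
 */
int otp_find_valid_block(const uint16_t *links, size_t max_blocks)
{
    size_t cur = 0, prev = 0, visited;

    assert(links != NULL && max_blocks > 0);

    for (visited = 1; ; visited++) {
        if (links[cur] == OTP_LINK_EMPTY) {
            /* OTP not full: cur is the empty block, so prev is valid. */
            return visited == 1 ? -1 : (int)prev;
        }
        if (visited == max_blocks) {
            /* OTP full: every block is used, the last one is valid. */
            return (int)cur;
        }
        if (links[cur] >= max_blocks)
            return -1;          /* corrupt link */
        prev = cur;
        cur = links[cur];
    }
}
```

With four blocks and links `{1, 2, 0, 0}` the function picks block 1, the one before the empty tail; with every block in use it picks the last block instead.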
    • iwlagn: modify digital SVR for 1000 · a6a74787
      Wey-Yi Guy authored
      commit 02c06e4a upstream.
      
      On the 1000, there are two Switching Voltage Regulators (SVR). The first
      one applies the digital voltage level (1.32V) for the PCIe block and core.
      We need to use this regulator to solve a stability issue related to a
      noisy DC2DC line in the silicon.
      Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
      Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • iwlwifi: update 1000 series API version to match firmware · ec45b814
      Jay Sternberg authored
      commit cce53aa3 upstream.
      
      The firmware file now contains the build number, so the API version needs
      to be updated.
      Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com>
      Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • iwlwifi: Handle new firmware file with ucode build number in header · 7b70149f
      Jay Sternberg authored
      commit cc0f555d upstream.
      
      Adding new API version to account for change to ucode file format.  New
      header includes the build number of the ucode.  This build number is the
      SVN revision thus allowing for exact correlation to the code that
      generated it.
      
      The header adds the build number so that older ucode images can also be
      enhanced to include the build in the future.
      
      Some cleanup in iwl_read_ucode is needed to ensure the old header is not
      used and to reduce unnecessary references through a pointer when the data
      is already in a heap variable.
      Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com>
      Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
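A version-dependent header like the one the ucode commit above describes could be parsed roughly as follows. This is a sketch under stated assumptions, not the iwlwifi layout: the position of the API version inside the version word, the `first_api_with_build` threshold, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ucode_hdr_info {
    uint32_t ver;      /* packed version word */
    uint32_t build;    /* build number, 0 when the header has none */
    size_t   hdr_len;  /* bytes consumed before the remaining fields */
};

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical layout: the API version sits in bits 8..15 of the version
 * word, and images with api >= first_api_with_build carry a 32-bit build
 * number right after it. Returns 0 on success, -1 if the buffer is short.
 */
int ucode_parse_header(const uint8_t *buf, size_t len,
                       unsigned first_api_with_build,
                       struct ucode_hdr_info *out)
{
    unsigned api;

    assert(buf != NULL && out != NULL);
    if (len < 4)
        return -1;
    out->ver = read_le32(buf);
    api = (out->ver >> 8) & 0xffu;

    if (api >= first_api_with_build) {
        if (len < 8)
            return -1;          /* new format needs the build word too */
        out->build = read_le32(buf + 4);
        out->hdr_len = 8;
    } else {
        out->build = 0;         /* old format: no build number */
        out->hdr_len = 4;
    }
    return 0;
}
```

Reporting the header length lets the caller read the remaining fields from the right offset for either format, which is one way to make sure the old header is never applied to a new image.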
    • NOMMU: Fix MAP_PRIVATE mmap() of objects where the data can be mapped directly · 6b92d44b
      David Howells authored
      commit 645d83c5 upstream.
      
      Fix MAP_PRIVATE mmap() of files and devices where the data in the backing store
      might be mapped directly.  Use the BDI_CAP_MAP_DIRECT capability flag to govern
      whether or not we should be trying to map a file directly.  This can be used to
      determine whether or not a region has been filled in at the point where we call
      do_mmap_shared() or do_mmap_private().
      
      The BDI_CAP_MAP_DIRECT capability flag is cleared by validate_mmap_request() if
      there's any reason we can't use it.  It's also cleared in do_mmap_pgoff() if
      f_op->get_unmapped_area() fails.
      
      Without this fix, attempting to run a program from a RomFS image on a
      non-mappable MTD partition results in a BUG as the kernel attempts XIP, and
      this can be caught in gdb:
      
      Program received signal SIGABRT, Aborted.
      0xc005dce8 in add_nommu_region (region=<value optimized out>) at mm/nommu.c:547
      (gdb) bt
      #0  0xc005dce8 in add_nommu_region (region=<value optimized out>) at mm/nommu.c:547
      #1  0xc005f168 in do_mmap_pgoff (file=0xc31a6620, addr=<value optimized out>, len=3808, prot=3, flags=6146, pgoff=0) at mm/nommu.c:1373
      #2  0xc00a96b8 in elf_fdpic_map_file (params=0xc33fbbec, file=0xc31a6620, mm=0xc31bef60, what=0xc0213144 "executable") at mm.h:1145
      #3  0xc00aa8b4 in load_elf_fdpic_binary (bprm=0xc316cb00, regs=<value optimized out>) at fs/binfmt_elf_fdpic.c:343
      #4  0xc006b588 in search_binary_handler (bprm=0x6, regs=0xc33fbce0) at fs/exec.c:1234
      #5  0xc006c648 in do_execve (filename=<value optimized out>, argv=0xc3ad14cc, envp=0xc3ad1460, regs=0xc33fbce0) at fs/exec.c:1356
      #6  0xc0008cf0 in sys_execve (name=<value optimized out>, argv=0xc3ad14cc, envp=0xc3ad1460) at arch/frv/kernel/process.c:263
      #7  0xc00075dc in __syscall_call () at arch/frv/kernel/entry.S:897
      
      Note that this fix does the following commit differently:
      
      	commit a190887b
      	Author: David Howells <dhowells@redhat.com>
      	Date:   Sat Sep 5 11:17:07 2009 -0700
      	nommu: fix error handling in do_mmap_pgoff()
      Reported-by: Graff Yang <graff.yang@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • mptsas : PAE Kernel more than 4 GB kernel panic · 6177445f
      Kashyap, Desai authored
      commit c55b89fb upstream.
      
      This patch solves a problem with DMA operations on PAE kernels.
      On a PAE system, dma_addr_t and unsigned long have different sizes,
      so a DMA address cast to unsigned long loses its upper bits.
      Now dma_addr is no longer type cast through unsigned long.
      Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      Cc: Jan Beulich <JBeulich@novell.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
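The truncation behind the mptsas fix above can be reproduced in user space with fixed-width types. The typedefs are stand-ins chosen for this sketch: a 64-bit type models dma_addr_t on a PAE kernel and a 32-bit type models unsigned long on 32-bit x86.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t demo_dma_addr_t;  /* stands in for dma_addr_t on a PAE kernel */
typedef uint32_t demo_ulong32_t;   /* stands in for unsigned long on 32-bit x86 */

/* Buggy pattern: the address passes through a 32-bit integer type. */
uint64_t dma_addr_via_ulong(demo_dma_addr_t addr)
{
    return (demo_ulong32_t)addr;   /* bits 32..63 are discarded here */
}

/* Fixed pattern: keep the address in its own, wide enough type. */
uint64_t dma_addr_direct(demo_dma_addr_t addr)
{
    return addr;
}
```

An address below 4 GB survives both paths, which is presumably why the problem only appears on machines with more than 4 GB of memory.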
    • HID: completely remove apple mightymouse from blacklist · 1dbd009c
      Jan Scholz authored
      commit 42960a13 upstream.
      
      Commit fa047e4f "HID: fix inverted
      wheel for bluetooth version of apple mighty mouse" is incomplete. If
      we remove Apple MightyMouse (bluetooth version) from the list of
      apple_devices in drivers/hid/hid-apple.c we have to remove it from
      hid_blacklist in drivers/hid/hid-core.c as well.
      Signed-off-by: Jan Scholz <Scholz@fias.uni-frankfurt.de>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • powerpc: Fix incorrect setting of __HAVE_ARCH_PTE_SPECIAL · c1aa9d2b
      Weirich, Bernhard authored
      [I'm going to fix upstream differently, by having all CPU types
      actually support _PAGE_SPECIAL, but I prefer the simple and obvious
      fix for -stable. -- Ben]
      
      The test that decides whether to define __HAVE_ARCH_PTE_SPECIAL on
      powerpc is bogus and will end up always defining it, even when
      _PAGE_SPECIAL is not supported (in which case it's 0) such as on
      8xx or 40x processors.
      Signed-off-by: Bernhard Weirich <bernhard.weirich@riedel.net>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
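The commit message above does not show the original preprocessor test, so the following only illustrates one common way such a test ends up always true: checking whether a macro is defined when the "unsupported" case defines it as 0. The `DEMO_` names are hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in for a CPU type without the feature: defined, but 0. */
#define DEMO_PAGE_SPECIAL 0

/* Fragile test: "is the macro defined?" is true even for the value 0. */
#ifdef DEMO_PAGE_SPECIAL
#define DEMO_HAVE_SPECIAL_BY_IFDEF 1
#else
#define DEMO_HAVE_SPECIAL_BY_IFDEF 0
#endif

/* Value test: only true when the flag has a real bit assigned. */
#if DEMO_PAGE_SPECIAL != 0
#define DEMO_HAVE_SPECIAL_BY_VALUE 1
#else
#define DEMO_HAVE_SPECIAL_BY_VALUE 0
#endif

int have_special_by_ifdef(void) { return DEMO_HAVE_SPECIAL_BY_IFDEF; }
int have_special_by_value(void) { return DEMO_HAVE_SPECIAL_BY_VALUE; }
```

Here the first helper reports the feature as present while the second correctly reports it as absent.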
    • powerpc/8xx: Fix regression introduced by cache coherency rewrite · 50c744f9
      Rex Feany authored
      commit e0908085 upstream.
      
      After upgrading to the latest kernel on my mpc875, userspace started
      running incredibly slowly (hours to get to a shell, even!).
      I tracked it down to commit 8d30c14c;
      that patch removed a work-around for the 8xx. Adding it
      back makes my problem go away.
      Signed-off-by: Rex Feany <rfeany@mrv.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • saa7134: ir-kbd-i2c init data needs a persistent object · f2594aa2
      Brian Rogers authored
      commit 7aedd5ec upstream.
      
      Tested on MSI TV@nywhere Plus.
      
      Original commit message:
      
      ir-kbd-i2c's ir_probe() function can be called much later (i.e. at
      ir-kbd-i2c module load), than the lifetime of a struct IR_i2c_init_data
      allocated off of the stack in cx18_i2c_new_ir() at registration time.
      Make sure we pass a pointer to a persistent IR_i2c_init_data object at
      i2c registration time.
      
      Thanks to Brian Rogers, Dustin Mitchell, Andy Walls and Jean Delvare for
      raising this question.
      
      Before this patch, if ir-kbd-i2c were probed after SAA7134, trash data
      were used.
      
      Compile tested only, but the patch is identical to em28xx one. So, it
      should work properly.
      Original-patch-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      [brian@xyzw.org: backported for 2.6.31]
      Signed-off-by: Brian Rogers <brian@xyzw.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
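The lifetime problem described in the ir-kbd-i2c commit above comes down to where the init data lives. A minimal sketch of the persistent-object pattern, with hypothetical `demo_` types standing in for the driver structures: the init data is embedded in the long-lived device structure, so the pointer handed over at registration time is still valid when the probe runs later.

```c
#include <assert.h>
#include <stddef.h>

struct demo_ir_init_data { int ir_type; };
struct demo_board_info { const void *platform_data; };

struct demo_dev {
    /* Persistent: lives as long as the device, not as long as a call. */
    struct demo_ir_init_data init_data;
    struct demo_board_info info;
};

/* Registration time: point at storage owned by the device itself. */
void demo_register_ir(struct demo_dev *dev, int ir_type)
{
    assert(dev != NULL);
    dev->init_data.ir_type = ir_type;
    dev->info.platform_data = &dev->init_data;
}

/* Probe time, possibly much later: the pointer must still be valid. */
int demo_ir_probe(const struct demo_board_info *info)
{
    const struct demo_ir_init_data *init = info->platform_data;

    return init ? init->ir_type : -1;
}
```

Had `demo_register_ir` pointed `platform_data` at a local variable instead, the probe would read whatever the stack held by then, which matches the "trash data" symptom in the commit message.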
    • em28xx: ir-kbd-i2c init data needs a persistent object · 45af991d
      Brian Rogers authored
      commit d2ebd0f8 upstream.
      
      Original commit message:
      
      ir-kbd-i2c's ir_probe() function can be called much later (i.e. at
      ir-kbd-i2c module load), than the lifetime of a struct IR_i2c_init_data
      allocated off of the stack in cx18_i2c_new_ir() at registration time.
      Make sure we pass a pointer to a persistent IR_i2c_init_data object at
      i2c registration time.
      
      Thanks to Brian Rogers, Dustin Mitchell, Andy Walls and Jean Delvare for
      raising this question.
      
      Before this patch, if ir-kbd-i2c were probed after em28xx, trash data
      were used. After the patch, no matter what order, it is properly
      reported as tested by me:
      
      input: i2c IR (i2c IR (EM2840 Hauppaug as /class/input/input10
      ir-kbd-i2c: i2c IR (i2c IR (EM2840 Hauppaug detected at i2c-4/4-0030/ir0 [em28xx #0]
      Original-patch-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      [brian@xyzw.org: backported for 2.6.31]
      Signed-off-by: Brian Rogers <brian@xyzw.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • drm/i915: Handle ERESTARTSYS during page fault · 1934acbe
      Chris Wilson authored
      commit c715089f upstream.
      
      During a page fault and rebinding the buffer there exists a window for a
      signal to arrive during the i915_wait_request() and trigger a
      ERESTARTSYS. This used to be handled by returning SIGBUS and thereby
      killing the application. Try 'cairo-perf-trace & cairo-test-suite' and
      watch X go boom!
      
      The solution as suggested by H. Peter Anvin is to simply return NOPAGE and
      leave the higher layers to spot we did not fill the page and resubmit
      the page fault.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      [anholt: Mostly squash it with another commit]
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
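The idea in the i915 commit above, treating an interrupted wait as "no page installed, retry" rather than as a fatal error, can be sketched as an error-to-result mapping. The constants are demo values with illustrative numbers, and the handling of the other error codes is an assumption of this sketch rather than the driver's actual switch statement.

```c
#include <assert.h>

/* Demo constants; the numeric values are illustrative only. */
#define DEMO_ERESTARTSYS      512
#define DEMO_ENOMEM           12
#define DEMO_VM_FAULT_OOM     0x0001
#define DEMO_VM_FAULT_SIGBUS  0x0002
#define DEMO_VM_FAULT_NOPAGE  0x0100

/* Map the result of binding the buffer to a fault handler return code. */
int demo_fault_result(int err)
{
    switch (err) {
    case 0:
        return DEMO_VM_FAULT_NOPAGE;   /* page was installed by the handler */
    case -DEMO_ERESTARTSYS:
        /* A signal interrupted the wait: report no page and let the
         * higher layers notice and resubmit the fault. */
        return DEMO_VM_FAULT_NOPAGE;
    case -DEMO_ENOMEM:
        return DEMO_VM_FAULT_OOM;
    default:
        return DEMO_VM_FAULT_SIGBUS;   /* genuine failure */
    }
}
```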
    • Fix idle time field in /proc/uptime · 03429ffa
      Michael Abbott authored
      commit 96830a57 upstream.
      
      Git commit 79741dd3 changes idle cputime accounting, but unfortunately
      the /proc/uptime file hasn't caught up.  Here the idle time calculation
      from /proc/stat is copied over.
      Signed-off-by: Michael Abbott <michael.abbott@diamond.ac.uk>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
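Assuming the /proc/stat calculation referred to above sums per-CPU idle counters, the idle field could be derived along these lines. The tick-based counters, the `hz` parameter, and the names are assumptions of this sketch, not the kernel's accounting types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_idle_time {
    uint64_t sec;        /* whole seconds of idle time */
    uint32_t centisec;   /* hundredths of a second */
};

/* Sum per-CPU idle tick counters and convert them to seconds.hundredths. */
struct demo_idle_time idle_from_percpu(const uint64_t *idle_ticks,
                                       size_t ncpus, unsigned hz)
{
    struct demo_idle_time out;
    uint64_t total = 0;
    size_t i;

    assert(idle_ticks != NULL && hz > 0);
    for (i = 0; i < ncpus; i++)
        total += idle_ticks[i];

    out.sec = total / hz;
    out.centisec = (uint32_t)((total % hz) * 100 / hz);
    return out;
}
```

Because the value is a sum over CPUs, it can exceed the wall-clock uptime on a multiprocessor machine.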
    • mmap: avoid unnecessary anon_vma lock acquisition in vma_adjust() · 38a76cc6
      Lee Schermerhorn authored
      commit 252c5f94 upstream.
      
      We noticed very erratic behavior [throughput] with the AIM7 shared
      workload running on recent distro [SLES11] and mainline kernels on an
      8-socket, 32-core, 256GB x86_64 platform.  On the SLES11 kernel
      [2.6.27.19+] with Barcelona processors, as we increased the load [10s of
      thousands of tasks], the throughput would vary between two "plateaus"--one
      at ~65K jobs per minute and one at ~130K jpm.  The simple patch below
      causes the results to smooth out at the ~130k plateau.
      
      But wait, there's more:
      
      We do not see this behavior on smaller platforms--e.g., 4 socket/8 core.
      This could be the result of the larger number of cpus on the larger
      platform--a scalability issue--or it could be the result of the larger
      number of interconnect "hops" between some nodes in this platform and how
      the tasks for a given load end up distributed over the nodes' cpus and
      memories--a stochastic NUMA effect.
      
      The variability in the results is less pronounced [on the same platform]
      with Shanghai processors and with mainline kernels.  With 31-rc6 on
      Shanghai processors and 288 file systems on 288 fibre attached storage
      volumes, the curves [jpm vs load] are both quite flat with the patched
      kernel consistently producing ~3.9% better throughput [~80K jpm vs ~77K
      jpm] than the unpatched kernel.
      
      Profiling indicated that the "slow" runs were incurring high[er]
      contention on an anon_vma lock in vma_adjust(), apparently called from the
      sbrk() system call.
      
      The patch:
      
      A comment in mm/mmap.c:vma_adjust() suggests that we don't really need the
      anon_vma lock when we're only adjusting the end of a vma, as is the case
      for brk().  The comment questions whether it's worth while to optimize for
      this case.  Apparently, on the newer, larger x86_64 platforms, with
      interesting NUMA topologies, it is worth while--especially considering
      that the patch [if correct!] is quite simple.
      
      We can detect this condition--no overlap with next vma--by noting a NULL
      "importer".  The anon_vma pointer will also be NULL in this case, so
      simply avoid loading vma->anon_vma to avoid the lock.
      
      However, we DO need to take the anon_vma lock when we're inserting a vma
      ['insert' non-NULL] even when we have no overlap [NULL "importer"], so we
      need to check for 'insert', as well.  And Hugh points out that we should
      also take it when adjusting vm_start (so that rmap.c can rely upon
      vma_address() while it holds the anon_vma lock).
      
      akpm: Zhang Yanmin reports a 150% throughput improvement with aim7, so it
      might be -stable material even though this isn't a regression: "this
      issue is not clear on dual socket Nehalem machine (2*4*2 cpu), but is
      severe on large machine (4*8*2 cpu)"
      
      [hugh.dickins@tiscali.co.uk: test vma start too]
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Eric Whitney <eric.whitney@hp.com>
      Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • mm: fix anonymous dirtying · 0a0611ad
      Hugh Dickins authored
      commit 1ac0cb5d upstream.
      
      do_anonymous_page() has been wrong to dirty the pte regardless.
      If it's not going to mark the pte writable, then it won't help
      to mark it dirty here, and clogs up memory with pages which will
      need swap instead of being thrown away.  Especially wrong if no
      overcommit is chosen, and this vma is not yet VM_ACCOUNTed -
      we could exceed the limit and OOM despite no overcommit.
      Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Acked-by: Rik van Riel <riel@redhat.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • mm: munlock use follow_page · 45e32d9a
      Hugh Dickins authored
      commit 408e82b7 upstream.
      
      Hiroaki Wakabayashi points out that when mlock() has been interrupted
      by SIGKILL, the subsequent munlock() takes unnecessarily long because
      its use of __get_user_pages() insists on faulting in all the pages
      which mlock() never reached.
      
      It's worse than slowness if mlock() is terminated by Out Of Memory kill:
      the munlock_vma_pages_all() in exit_mmap() insists on faulting in all the
      pages which mlock() could not find memory for; so innocent bystanders are
      killed too, and perhaps the system hangs.
      
      __get_user_pages() does a lot that's silly for munlock(): so remove the
      munlock option from __mlock_vma_pages_range(), and use a simple loop of
      follow_page()s in munlock_vma_pages_range() instead; ignoring absent
      pages, and not marking present pages as accessed or dirty.
      
      (Change munlock() to only go so far as mlock() reached?  That does not
      work out, given the convention that mlock() claims complete success even
      when it has to give up early - in part so that an underlying file can be
      extended later, and those pages locked which earlier would give SIGBUS.)
      Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Acked-by: Rik van Riel <riel@redhat.com>
      Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Reviewed-by: Hiroaki Wakabayashi <primulaelatior@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • page-allocator: limit the number of MIGRATE_RESERVE pageblocks per zone · 8df6b14f
      Mel Gorman authored
      commit 78986a67 upstream.
      
      After anti-fragmentation was merged, a bug was reported whereby devices
      that depended on high-order atomic allocations were failing.  The solution
      was to preserve a property in the buddy allocator which tended to keep the
      minimum number of free pages in the zone at the lower physical addresses
      and contiguous.  To preserve this property, MIGRATE_RESERVE was introduced
      and a number of pageblocks at the start of a zone would be marked
      "reserve", the number of which depended on min_free_kbytes.
      
      Anti-fragmentation works by avoiding the mixing of page migratetypes
      within the same pageblock.  One way of helping this is to increase
      min_free_kbytes because it becomes less likely that it will be necessary
      to place pages of different migratetypes in the same pageblock.  However,
      if the number of MIGRATE_RESERVE pageblocks is unbounded, the free memory
      is kept there in large contiguous blocks instead of helping
      anti-fragmentation as much as it should.  With the page-allocator
      tracepoint patches applied, it
      was found during anti-fragmentation tests that the number of
      fragmentation-related events were far higher than expected even with
      min_free_kbytes at higher values.
      
      This patch limits the number of MIGRATE_RESERVE blocks that exist per zone
      to two.  For example, with a sufficient min_free_kbytes, 4MB of memory
      will be kept aside on an x86-64 and remain more or less free and
      contiguous for the system's uptime.  This should be sufficient for devices
      depending on high-order atomic allocations while helping fragmentation
      control when min_free_kbytes is tuned appropriately.  A side-effect of
      this patch is that the reserve variable is converted to int, as unsigned
      long was the wrong type to use when ensuring that only the required number
      of reserve blocks are created.
      
      With the patches applied, fragmentation-related events as measured by the
      page allocator tracepoints were significantly reduced when running some
      fragmentation stress-tests on systems with min_free_kbytes tuned to a
      value appropriate for hugepage allocations at runtime.  On x86, the events
      recorded were reduced by 99.8%, on x86-64 by 99.72% and on ppc64 by
      99.83%.
      Signed-off-by: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
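The cap of two reserve pageblocks described above can be worked through with a small helper. This is a sketch, not the page allocator code: it assumes the reserve is the minimum free page count rounded up to whole pageblocks, and the 4MB figure is checked under the assumption of 4KB pages and order-9 (2MB) pageblocks on x86-64.

```c
#include <assert.h>

#define DEMO_MAX_RESERVE_BLOCKS 2

/* Number of pageblocks to mark as reserve for a zone. */
int reserve_pageblocks(unsigned long min_pages, unsigned pageblock_order)
{
    unsigned long per_block, blocks;

    assert(pageblock_order < 8 * sizeof(unsigned long));
    per_block = 1UL << pageblock_order;
    /* Round the minimum free page count up to whole pageblocks. */
    blocks = (min_pages + per_block - 1) >> pageblock_order;
    /* Never reserve more than two blocks, however large the minimum is. */
    return blocks > DEMO_MAX_RESERVE_BLOCKS ? DEMO_MAX_RESERVE_BLOCKS
                                            : (int)blocks;
}
```

Returning a signed int keeps comparisons against the number of blocks already reserved well behaved, which fits the commit's remark that unsigned long was the wrong type for this count.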
    • hugetlb: restore interleaving of bootmem huge pages (2.6.31) · ee5cb1e6
      Lee Schermerhorn authored
      Not upstream as it is fixed differently in .32
      
      I noticed that alloc_bootmem_huge_page() will only advance to the next
      node on failure to allocate a huge page.  I asked about this on linux-mm
      and linux-numa, cc'ing the usual huge page suspects.  Mel Gorman
      responded:
      
      	I strongly suspect that the same node being used until allocation
      	failure instead of round-robin is an oversight and not deliberate
      	at all. It appears to be a side-effect of a fix made way back in
      	commit 63b4613c ["hugetlb: fix
      	hugepage allocation with memoryless nodes"]. Prior to that patch
      	it looked like allocations would always round-robin even when
      	allocation was successful.
      
      Andy Whitcroft countered that the existing behavior looked like Andi
      Kleen's original implementation and suggested that we ask him.  We did and
      Andi replied that his intention was to interleave the allocations.  So,
      ...
      
      This patch moves the advance of the hstate next node from which to
      allocate up before the test for success of the attempted allocation.  This
      will unconditionally advance the next node from which to alloc,
      interleaving successful allocations over the nodes with sufficient
      contiguous memory, and skipping over nodes that fail the huge page
      allocation attempt.
      
      Note that alloc_bootmem_huge_page() will only be called for huge pages of
      order > MAX_ORDER.
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Adam Litke <agl@us.ibm.com>
      Cc: Andy Whitcroft <apw@canonical.com>
      Cc: Eric Whitney <eric.whitney@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
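The change in the hugetlb commit above is about when the "next node" cursor advances. A minimal user-space sketch, with a hypothetical per-node array of free huge pages in place of the bootmem allocator: the cursor moves on before the allocation result is checked, so successful allocations interleave and failing nodes are skipped.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Try each node at most once, starting at *next_node.
 * Returns the node that supplied the page, or -1 if none could.
 */
int alloc_huge_page_interleaved(int *free_pages, int nr_nodes, int *next_node)
{
    int tries;

    assert(free_pages != NULL && next_node != NULL && nr_nodes > 0);
    for (tries = 0; tries < nr_nodes; tries++) {
        int node = *next_node;

        /* Advance first, whether or not this node can satisfy us. */
        *next_node = (node + 1) % nr_nodes;
        if (free_pages[node] > 0) {
            free_pages[node]--;
            return node;
        }
    }
    return -1;
}
```

With free pages `{2, 0, 2}` the allocations come from nodes 0, 2, 0, 2 in turn; advancing only on failure would instead drain node 0 before touching node 2.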
    • /proc/kcore: work around a BUG() · 1d831f62
      KAMEZAWA Hiroyuki authored
      Not upstream due to other fixes in .32
      
      
      Works around a BUG() which is triggered when the kernel accesses holes in
      vmalloc regions.
      
      BUG: unable to handle kernel paging request at fa54c000
      IP: [<c04f687a>] read_kcore+0x260/0x31a
      *pde = 3540b067 *pte = 00000000
      Oops: 0000 [#1] SMP
      last sysfs file: /sys/devices/pci0000:00/0000:00:1c.2/0000:03:00.0/ieee80211/phy0/rfkill0/state
      Modules linked in: fuse sco bridge stp llc bnep l2cap bluetooth sunrpc nf_conntrack_ftp ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath uinput usb_storage arc4 ecb snd_hda_codec_realtek snd_hda_intel ath5k snd_hda_codec snd_hwdep iTCO_wdt snd_pcm iTCO_vendor_support pcspkr i2c_i801 mac80211 joydev snd_timer serio_raw r8169 snd soundcore mii snd_page_alloc ath cfg80211 ata_generic i915 drm i2c_algo_bit i2c_core video output [last unloaded: scsi_wait_scan]
      Sep  4 12:45:16 tuxedu kernel: Pid: 2266, comm: cat Not tainted (2.6.31-rc8 #2) Joybook Lite U101
      EIP: 0060:[<c04f687a>] EFLAGS: 00010286 CPU: 0
      EIP is at read_kcore+0x260/0x31a
      EAX: f5e5ea00 EBX: fa54d000 ECX: 00000400 EDX: 00001000
      ESI: fa54c000 EDI: f44ad000 EBP: e4533f4c ESP: e4533f24
      DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      Process cat (pid: 2266, ti=e4532000 task=f09d19a0 task.ti=e4532000)
      Stack:
      00005000 00000000 f44ad000 09d9c000 00003000 fa54c000 00001000 f6d16f60
       e4520b80 fffffffb e4533f70 c04ef8eb e4533f98 00008000 09d97000 c04f661a
       e4520b80 09d97000 c04ef88c e4533f8c c04ba531 e4533f98 c04c0930 e4520b80
      Call Trace:
      [<c04ef8eb>] ? proc_reg_read+0x5f/0x73
      [<c04f661a>] ? read_kcore+0x0/0x31a
      [<c04ef88c>] ? proc_reg_read+0x0/0x73
      [<c04ba531>] ? vfs_read+0x82/0xe1
      [<c04c0930>] ? path_put+0x1a/0x1d
      [<c04ba62e>] ? sys_read+0x40/0x62
      [<c0403298>] ? sysenter_do_call+0x12/0x2d
      Code: 39 f3 89 ca 0f 43 f3 89 fb 29 f2 29 f3 39 cf 0f 46 d3 29 55 dc 8d 1c 32 f6 40 0c 01 75 18 89 d1 89 f7 c1 e9 02 2b 7d ec 03 7d e0 <f3> a5 89 d1 83 e1 03 74 02 f3 a4 8b 00 83 7d dc 00 74 04 85 c0
      EIP: [<c04f687a>] read_kcore+0x260/0x31a SS:ESP 0068:e4533f24
      CR2: 00000000fa54c000
      
      
      To access a vmalloc area which may have memory holes, copy_from_user is
      useful.  So this:
      
       # cat /proc/kcore > /dev/null
      
      will not panic.
      
      This is a minimal fix, suitable for 2.6.30.x and 2.6.31.  More extensive
      /proc/kcore changes are planned for 2.6.32.
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Tested-by: Nick Craig-Wood <nick@craig-wood.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Reported-by: <kbowa@tuxedu.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • USB: Fix SS endpoint companion descriptor parsing. · 59664c0b
      Sarah Sharp authored
      commit 6682bb39 upstream.
      
      When there's a descriptor after the SuperSpeed endpoint companion
      descriptor, the previous code would have skipped over twice the length it
      was supposed to.  This code fixes crashes seen with UASP devices (which
      have a UASP descriptor after the SS endpoint companion descriptor).
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
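The parsing bug described above is a cursor that moves too far past the companion descriptor. The sketch below walks a descriptor buffer by each descriptor's own bLength and advances exactly once per descriptor. The function name is hypothetical; 0x30 is the SuperSpeed endpoint companion descriptor type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DT_SS_ENDPOINT_COMP 0x30

/*
 * Returns the offset of the descriptor that follows the first SuperSpeed
 * endpoint companion descriptor, or -1 if there is none.
 */
long next_after_ss_companion(const uint8_t *buf, size_t len)
{
    size_t off = 0;

    assert(buf != NULL);
    while (off + 2 <= len) {
        uint8_t blen = buf[off];       /* bLength */
        uint8_t btype = buf[off + 1];  /* bDescriptorType */

        if (blen < 2 || off + blen > len)
            return -1;                 /* malformed descriptor */
        off += blen;                   /* step over it exactly once */
        if (btype == DEMO_DT_SS_ENDPOINT_COMP)
            return off < len ? (long)off : -1;
    }
    return -1;
}
```

Skipping the companion's length twice would land the cursor in the middle of, or beyond, the descriptor that follows it, which is the situation a device with an extra descriptor after the companion exposes.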
    • USB: xhci: Support interrupt transfers. · 1e7198c0
      Sarah Sharp authored
      commit 624defa1 upstream.
      
      Interrupt transfers are submitted to the xHCI hardware using the same TRB
      type as bulk transfers.  Re-use the bulk transfer enqueueing code to
      enqueue interrupt transfers.
      
      Interrupt transfers are a bit different than bulk transfers.  When the
      interrupt endpoint is to be serviced, the xHC will consume (at most) one
      TD.  A TD (comprised of sg list entries) can take several service
      intervals to transmit.  The important thing for device drivers to note is
      that if they use the scatter gather interface to submit interrupt
      requests, they will not get data sent from two different scatter gather
      lists in the same service interval.
      
      For now, the xHCI driver will use the service interval from the endpoint's
      descriptor (bInterval).  Drivers will need a hook to poll at a more
      frequent interval.  Set urb->interval to the interval that the xHCI
      hardware will use.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • USB: xhci: Set -EREMOTEIO when xHC gives bad transfer length. · 6b65d1c6
      Sarah Sharp authored
      commit 2f697f6c upstream.
      
      The xHCI hardware reports the number of bytes untransferred for a given
      transfer buffer.  If the hardware reports a bytes untransferred value
      greater than the submitted buffer size, we want to play it safe and say no
      data was transferred.  If the driver considers a short packet to be an
      error, remember to set -EREMOTEIO.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • USB: xhci: Check URB_SHORT_NOT_OK before setting short packet status. · ff17e475
      Sarah Sharp authored
      commit 204970a4 upstream.
      
      Make sure that the driver that submitted the URB considers a short packet
      an error before setting -EREMOTEIO during a short control transfer.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      ff17e475
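The check described in this commit might look roughly like the sketch below. The struct is a cut-down stand-in for the kernel's struct urb, and SKETCH_URB_SHORT_NOT_OK is a placeholder for URB_SHORT_NOT_OK; neither is copied from the driver.

```c
#include <errno.h>
#include <stddef.h>

#ifndef EREMOTEIO
#define EREMOTEIO 121 /* Linux value, used as a fallback on other systems */
#endif

#define SKETCH_URB_SHORT_NOT_OK 0x0001u /* stand-in for URB_SHORT_NOT_OK */

/* Minimal stand-in for the fields of struct urb used here. */
struct sketch_urb {
    unsigned int transfer_flags;
    size_t transfer_buffer_length;
    size_t actual_length;
};

/*
 * A short transfer is only an error if the submitting driver asked for
 * that by setting the short-not-ok flag; otherwise it counts as success.
 */
static int short_packet_status(const struct sketch_urb *urb)
{
    if (urb->actual_length < urb->transfer_buffer_length &&
        (urb->transfer_flags & SKETCH_URB_SHORT_NOT_OK))
        return -EREMOTEIO;
    return 0;
}
```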
    • Sarah Sharp's avatar
      USB: xhci: Check URB's actual transfer buffer size. · 1eb1b69f
      Sarah Sharp authored
      commit 99eb32db upstream.
      
      Make sure that the amount of data the xHC says was transmitted is less
      than or equal to the size of the requested transfer buffer.  Before, if
      the host controller erroneously reported that the number of bytes
      untransferred was bigger than the buffer in the URB, urb->actual_length
      could be set to a very large size.
      
      Make sure urb->actual_length <= urb->transfer_buffer_length.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      1eb1b69f
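The reason an oversized "bytes untransferred" report leads to a very large actual_length is unsigned wrap-around in the subtraction. A minimal sketch of the guard, with a hypothetical helper name:

```c
#include <stddef.h>

/*
 * Bytes actually transferred, given the requested buffer size and the
 * controller's "bytes untransferred" report.  Without the check, an
 * untransferred value larger than requested would wrap around to a
 * huge unsigned result.
 */
static size_t actual_length_from_residue(size_t requested, size_t untransferred)
{
    if (untransferred > requested)
        return 0; /* bogus report: play it safe, claim no data moved */
    return requested - untransferred;
}
```

With this guard the result can never exceed the requested length, which is the invariant urb->actual_length <= urb->transfer_buffer_length stated in the commit message.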
    • Sarah Sharp's avatar
      USB: xhci: Don't touch xhci_td after it's freed. · 851834e4
      Sarah Sharp authored
      commit 9191eee7 upstream.
      
      On a successful transfer, urb->td is freed before the URB is ready to be
      given back to the driver.  Don't touch urb->td after it's freed.  This bug
      would have only shown up when xHCI debugging was turned on, and the freed
      memory was quickly reused for something else.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      851834e4
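The usual way to avoid this kind of use-after-free is to copy whatever is still needed out of the structure before freeing it. The following is a generic userspace sketch of that pattern, with hypothetical names; it is not the driver's code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical transfer descriptor, for illustration only. */
struct sketch_td {
    int urb_id;
};

/*
 * Frees the TD and returns the id of the URB it belonged to.
 * The id is copied out first so that the debug message never
 * dereferences td after free().
 */
static int finish_td(struct sketch_td *td, int debug)
{
    int urb_id = td->urb_id; /* copy before freeing */

    free(td);
    if (debug)
        printf("giving back URB %d\n", urb_id); /* uses the copy, not td */
    return urb_id;
}
```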
    • Sarah Sharp's avatar
      USB: xhci: Handle babbling endpoints correctly. · 9fa7825a
      Sarah Sharp authored
      commit 83fbcdcc upstream.
      
      The 0.95 xHCI spec says that non-control endpoints will be halted if a
      babble is detected on a transfer.  The 0.96 xHCI spec says all types of
      endpoints will be halted when a babble is detected.  Some hardware that
      claims to be 0.95 compliant halts the control endpoint anyway.
      
      When a babble is detected on a control endpoint, check the hardware's
      output endpoint context to see if the endpoint is marked as halted.  If
      the control endpoint is halted, a reset endpoint command must be issued
      and the transfer ring dequeue pointer needs to be moved past the stopped
      transfer.  Basically, we treat it as if the control endpoint had stalled.
      
      Handle bulk babbling endpoints as if we got a completion event with a
      stall completion code.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      9fa7825a
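The decision described above can be sketched as a small function. The enums and the function name are hypothetical; the sketch covers only the two endpoint types the commit message mentions.

```c
#include <stdbool.h>

enum sketch_ep_type { EP_CONTROL, EP_BULK };
enum sketch_action { ACT_NONE, ACT_HANDLE_AS_STALL };

/*
 * What to do when a babble error is reported.
 * ctx_says_halted is the halted state read from the hardware's output
 * endpoint context; it only matters for control endpoints, because
 * 0.95 hardware may or may not halt them.
 */
static enum sketch_action babble_action(enum sketch_ep_type type,
                                        bool ctx_says_halted)
{
    if (type == EP_CONTROL)
        return ctx_says_halted ? ACT_HANDLE_AS_STALL : ACT_NONE;

    /* Bulk: treat the babble like a stall completion code. */
    return ACT_HANDLE_AS_STALL;
}
```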
    • Sarah Sharp's avatar
      USB: xhci: Make TRB completion code comparison readable. · fed392f6
      Sarah Sharp authored
      commit 66d1eebc upstream.
      
      Use trb_comp_code instead of getting the completion code from the transfer
      event every time.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      fed392f6
    • Sarah Sharp's avatar
      USB: xhci: Add quirk for Fresco Logic xHCI hardware. · f9907731
      Sarah Sharp authored
      commit ac9d8fe7 upstream.
      
      This Fresco Logic xHCI host controller chip revision puts bad data into
      the output endpoint context after a Reset Endpoint command.  It needs a
      Configure Endpoint command (instead of a Set TR Dequeue Pointer command)
      after the reset endpoint command.
      
      Set up the input context before issuing the Reset Endpoint command so we
      don't copy bad data from the output endpoint context.  The HW also can't
      handle two commands queued at once, so submit the TRB for the Configure
      Endpoint command in the event handler for the Reset Endpoint command.
      
      Devices that stall on control endpoints before a configuration is selected
      will not work under this Fresco Logic xHCI host controller revision.
      
      This patch is for prototype hardware that will be given to other companies
      for evaluation purposes only, and should not reach consumer hands.  Fresco
      Logic's next chip rev should have this bug fixed.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      f9907731
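A minimal sketch of the command sequencing this quirk implies: the follow-up command is chosen by a quirk flag and is only queued once the Reset Endpoint command has completed, so two commands are never outstanding at once. The flag, struct, and function names are hypothetical.

```c
enum sketch_cmd {
    CMD_RESET_ENDPOINT,
    CMD_SET_TR_DEQUEUE,
    CMD_CONFIGURE_ENDPOINT
};

#define SKETCH_RESET_EP_QUIRK (1u << 0) /* hypothetical quirk flag */

struct sketch_host {
    unsigned int quirks;
    enum sketch_cmd in_flight; /* at most one command at a time */
};

/* Which command moves the dequeue pointer after a Reset Endpoint. */
static enum sketch_cmd follow_up_cmd(unsigned int quirks)
{
    return (quirks & SKETCH_RESET_EP_QUIRK) ? CMD_CONFIGURE_ENDPOINT
                                            : CMD_SET_TR_DEQUEUE;
}

/*
 * Called from the (sketched) event handler when Reset Endpoint
 * completes.  Queuing the follow-up here, rather than together with
 * the reset, keeps only one command outstanding.
 */
static void on_reset_endpoint_done(struct sketch_host *host)
{
    host->in_flight = follow_up_cmd(host->quirks);
}
```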
    • Sarah Sharp's avatar
      USB: xhci: Handle stalled control endpoints. · e1edf432
      Sarah Sharp authored
      commit 82d1009f upstream.
      
      When a control endpoint stalls, the next control transfer will clear the
      stall.  The USB core doesn't call down to the host controller driver's
      endpoint_reset() method when control endpoints stall, so the xHCI driver
      has to do all its stall handling for internal state in its interrupt handler.
      
      When the host stalls on a control endpoint, it may stop on the data phase
      or status phase of the control transfer.  Like other stalled endpoints,
      the xHCI driver needs to queue a Reset Endpoint command and move the
      hardware's control endpoint ring dequeue pointer past the failed control
      transfer (with a Set TR Dequeue Pointer or a Configure Endpoint command).
      
      Since the USB core doesn't call usb_hcd_reset_endpoint() for control
      endpoints, we need to do this in interrupt context when we get notified of
      the stalled transfer.  URBs may be queued to the hardware before these two
      commands complete.  The endpoint queue will be restarted once both
      commands complete.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      e1edf432
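The "restart once both commands complete" rule can be modelled with a simple pending-command counter. This is an illustrative sketch with hypothetical names, not the driver's actual bookkeeping.

```c
#include <stdbool.h>

/* Hypothetical per-endpoint state, for illustration only. */
struct sketch_ep {
    unsigned int cmds_pending; /* recovery commands still outstanding */
    bool running;              /* whether the endpoint queue is active */
};

/* A stall was reported: stop the queue and issue the two commands. */
static void on_stall(struct sketch_ep *ep)
{
    ep->running = false;
    ep->cmds_pending = 2; /* Reset Endpoint + dequeue pointer update */
}

/* One of the recovery commands completed. */
static void on_cmd_complete(struct sketch_ep *ep)
{
    if (ep->cmds_pending > 0 && --ep->cmds_pending == 0)
        ep->running = true; /* both done: restart the endpoint queue */
}
```

URBs submitted while cmds_pending is non-zero would simply sit on the ring until the queue is restarted.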
    • Sarah Sharp's avatar
      USB: xhci: Support full speed devices. · 2f670d46
      Sarah Sharp authored
      commit 2d3f1fac upstream.
      
      Full speed devices have varying max packet sizes (8, 16, 32, or 64) for
      endpoint 0.  The xHCI hardware needs to know the real max packet size
      that the USB core discovers after it fetches the first 8 bytes of the
      device descriptor.
      
      In order to fix this without adding a new hook to host controller drivers,
      the xHCI driver looks for an updated max packet size for control
      endpoints.  If it finds an updated size, it issues an evaluate context
      command and waits for that command to finish.  This should only happen in
      the initialization and device descriptor fetching steps in the khubd
      thread, so blocking should be fine.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      2f670d46
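A sketch of the "look for an updated max packet size" step. It relies on the standard device descriptor layout, in which bMaxPacketSize0 is the byte at offset 7 of the first 8 bytes; the function name and return convention are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Inspect the first 8 bytes of a device descriptor and decide whether
 * the controller's idea of the endpoint 0 max packet size is stale.
 * Returns true (and stores the new size) when an update, such as an
 * evaluate context command, would be needed.
 */
static bool ep0_needs_update(const uint8_t desc8[8], unsigned int current_mps,
                             unsigned int *new_mps)
{
    unsigned int reported = desc8[7]; /* bMaxPacketSize0 */

    /* Full speed devices may only report 8, 16, 32, or 64. */
    if (reported != 8 && reported != 16 && reported != 32 && reported != 64)
        return false;

    *new_mps = reported;
    return reported != current_mps;
}
```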
    • Sarah Sharp's avatar
      USB: xhci: Set correct max packet size for HS/FS control endpoints. · af7a3a38
      Sarah Sharp authored
      commit 47aded8a upstream.
      
      Set the max packet size for the default control endpoint on high speed
      devices to be 64 bytes.  High speed devices always have a max packet size
      of 64 bytes.  There's no use setting it to eight for the initial 8 byte
      descriptor fetch and then issuing (and waiting for) an evaluate context
      command to update it to 64 bytes for the subsequent control transfers.
      
      The USB core guesses that the max packet size on a full speed control
      endpoint is 64 bytes, and then updates it after the first 8-byte
      descriptor fetch.  Change the initial setup for the xHCI internal
      representation of the full speed device to have a 64 byte max packet size.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      af7a3a38
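The initial values described above amount to a small lookup by device speed. In this sketch the high and full speed entries follow the commit message; the low speed entry (8 bytes, per the USB 2.0 specification) is added for completeness and is not mentioned in the commit. All names are hypothetical.

```c
enum sketch_speed { SKETCH_LOW, SKETCH_FULL, SKETCH_HIGH };

/* Initial max packet size to program for the default control endpoint. */
static unsigned int initial_ep0_mps(enum sketch_speed speed)
{
    switch (speed) {
    case SKETCH_HIGH:
        return 64; /* fixed for high speed, never needs correcting */
    case SKETCH_FULL:
        return 64; /* a guess; may be corrected after the 8-byte fetch */
    case SKETCH_LOW:
        return 8;  /* assumption from the USB 2.0 spec, not the commit */
    }
    return 8; /* conservative fallback for unknown speeds */
}
```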
    • Sarah Sharp's avatar
      USB: xhci: Configure endpoint code refactoring. · e0060ce0
      Sarah Sharp authored
      commit f2217e8e upstream.
      
      Refactor out the code to issue, wait for, and parse the event completion code
      for a configure endpoint command.  Modify it to support the evaluate
      context command, which has a very similar submission process.  Add
      functions to copy parts of the output context into the input context
      (which will be used in the evaluate context command).
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      e0060ce0
    • Sarah Sharp's avatar
      USB: xhci: Fix slot and endpoint context debugging. · 66bf28ee
      Sarah Sharp authored
      commit 018218d1 upstream.
      
      Use the virtual address of the memory the hardware uses, not the address for
      the container of that memory.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      66bf28ee
    • Sarah Sharp's avatar
      USB: xhci: Work around for chain bit in link TRBs. · 15113e82
      Sarah Sharp authored
      commit b0567b3f upstream.
      
      Different sections of the xHCI 0.95 specification had opposing
      requirements for the chain bit in a link transfer request block (TRB).
      The chain bit is used to designate that adjacent TRBs are all part of the
      same scatter gather list that should be sent to the device.  Link TRBs can
      be in the middle, or at the beginning or end of these chained TRBs.
      
      Sections 4.11.5.1 and 6.4.4.1 both stated the link TRB "shall have the
      chain bit set to 1", meaning it is always chained to the next TRB.
      However, section 4.6.9 on the stop endpoint command has specific cases for
      what the hardware must do for a link TRB with the chain bit set to 0.  The
      0.96 specification errata later cleared up this issue by fixing the
      4.11.5.1 and 6.4.4.1 sections to state that a link TRB can have the chain
      bit set to 1 or 0.
      
      The problem is that the xHCI cancellation code depends on the chain bit of
      the link TRB being cleared when it's at the end of a TD, and some 0.95
      xHCI hardware simply stops processing the ring when it encounters a link
      TRB with the chain bit cleared.
      
      Allow users who are testing 0.95 xHCI prototypes to set a module parameter
      (link_quirk) to turn on this link TRB workaround.  Cancellation may not
      work if the ring is stopped exactly on a link TRB with chain bit set, but
      cancellation should be a relatively uncommon case.
      Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      15113e82
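One plausible reading of the workaround is that, with link_quirk enabled, the chain bit on a link TRB is always set, whereas otherwise it reflects whether the TD continues past the link. The sketch below illustrates that reading only; the bit position and all names are assumptions and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_TRB_CHAIN (1u << 4) /* assumed position of the chain bit */

/*
 * Compute the control word for a link TRB.
 * td_continues: the current TD has more TRBs after the link TRB.
 * link_quirk:   hardware stops on link TRBs whose chain bit is clear,
 *               so the bit is forced on regardless of td_continues.
 */
static uint32_t link_trb_control(uint32_t control, bool td_continues,
                                 bool link_quirk)
{
    if (link_quirk || td_continues)
        return control | SKETCH_TRB_CHAIN;
    return control & ~SKETCH_TRB_CHAIN;
}
```

Forcing the bit on is what costs the cancellation code its end-of-TD marker, which matches the caveat about cancellation in the commit message.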
    • Alan Stern's avatar
      USB serial: update the console driver · 222a7a98
      Alan Stern authored
      commit 7bd032dc upstream.
      
      This patch (as1292) modifies the USB serial console driver, to make it
      compatible with the recent changes to the USB serial core.  The most
      important change is that serial->disc_mutex now has to be unlocked
      following a successful call to usb_serial_get_by_index().
      
      Other less notable changes include:
      
      	Use the requested port number instead of port 0 always.
      
      	Prevent the serial device from being autosuspended.
      
      	Use the ASYNCB_INITIALIZED flag bit to indicate when the
      	port hardware has been initialized.
      
      In spite of these changes, there's no question that the USB serial
      console code is still a big hack.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      222a7a98
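The locking contract mentioned above (a successful lookup returns with the mutex held, so the caller must release it) can be illustrated with a userspace mock. The flag below stands in for serial->disc_mutex, and every name is hypothetical; the real usb_serial_get_by_index() has a different signature.

```c
#include <stddef.h>

/* Mock device; disc_mutex_held stands in for serial->disc_mutex. */
struct sketch_serial {
    int disc_mutex_held;
};

/* On success, returns the device with its "mutex" held; NULL otherwise. */
static struct sketch_serial *get_by_index(struct sketch_serial *table[],
                                          size_t n, size_t idx)
{
    if (idx >= n || table[idx] == NULL)
        return NULL;
    table[idx]->disc_mutex_held = 1;
    return table[idx];
}

/* Console setup path: must drop the lock only after a successful lookup. */
static int console_setup(struct sketch_serial *table[], size_t n, size_t idx)
{
    struct sketch_serial *serial = get_by_index(table, n, idx);

    if (serial == NULL)
        return -1; /* nothing was locked, so nothing to unlock */

    /* ... port configuration would go here ... */

    serial->disc_mutex_held = 0; /* the caller's responsibility */
    return 0;
}
```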