1. 08 Sep, 2008 25 commits
  2. 20 Aug, 2008 15 commits
    • Linux 2.6.26.3 · c373e9c5
      Greg Kroah-Hartman authored
    • crypto: padlock - fix VIA PadLock instruction usage with irq_ts_save/restore() · faf996d6
      Suresh Siddha authored
      [ Upstream commit: e4914012 ]
      
      Wolfgang Walter reported this oops on his VIA C3 using padlock for
      AES-encryption:
      
      ##################################################################
      
      BUG: unable to handle kernel NULL pointer dereference at 000001f0
      IP: [<c01028c5>] __switch_to+0x30/0x117
      *pde = 00000000
      Oops: 0002 [#1] PREEMPT
      Modules linked in:
      
      Pid: 2071, comm: sleep Not tainted (2.6.26 #11)
      EIP: 0060:[<c01028c5>] EFLAGS: 00010002 CPU: 0
      EIP is at __switch_to+0x30/0x117
      EAX: 00000000 EBX: c0493300 ECX: dc48dd00 EDX: c0493300
      ESI: dc48dd00 EDI: c0493530 EBP: c04cff8c ESP: c04cff7c
      DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
      Process sleep (pid: 2071, ti=c04ce000 task=dc48dd00 task.ti=d2fe6000)
      Stack: dc48df30 c0493300 00000000 00000000 d2fe7f44 c03b5b43 c04cffc8 00000046
         c0131856 0000005a dc472d3c c0493300 c0493470 d983ae00 00002696 00000000
         c0239f54 00000000 c04c4000 c04cffd8 c01025fe c04f3740 00049800 c04cffe0
      Call Trace:
      [<c03b5b43>] ? schedule+0x285/0x2ff
      [<c0131856>] ? pm_qos_requirement+0x3c/0x53
      [<c0239f54>] ? acpi_processor_idle+0x0/0x434
      [<c01025fe>] ? cpu_idle+0x73/0x7f
      [<c03a4dcd>] ? rest_init+0x61/0x63
      =======================
      
      Wolfgang also found out that adding kernel_fpu_begin() and kernel_fpu_end()
      around the padlock instructions fixes the oops.
      
      Suresh wrote:
      
      Although these padlock instructions don't use/touch SSE registers, they behave
      similarly to other SSE instructions. For example, they might cause DNA faults
      when cr0.ts is set. While this is a spurious DNA trap, it might cause an
      oops with the recent fpu code changes.
      
      This is the code sequence that is probably causing this problem:
      
      a) new app is getting exec'd and it is somewhere in between
      start_thread() and flush_old_exec() in the load_xyz_binary()
      
      b) At point "a", task's fpu state (like TS_USEDFPU, used_math() etc) is
      cleared.
      
      c) Now we get an interrupt/softirq which starts using these encrypt/decrypt
      routines in the network stack. This generates a math fault (as
      cr0.ts is '1') which sets TS_USEDFPU and restores the math that is
      in the task's xstate.
      
      d) Return to exec code path, which does start_thread() which does
      free_thread_xstate() and sets xstate pointer to NULL while
      the TS_USEDFPU is still set.
      
      e) At the next context switch from the new exec'd task to another task,
      we have a scenario where TS_USEDFPU is set but xstate pointer is NULL.
      This can cause an oops during unlazy_fpu() in __switch_to().
      
      Now:
      
      1) This should happen with or without preemption. Viro also encountered
      a similar problem without CONFIG_PREEMPT.
      
      2) kernel_fpu_begin() and kernel_fpu_end() will fix this problem, because
      kernel_fpu_begin() will manually do a clts() and won't run into the
      situation of setting TS_USEDFPU in step "c" above.
      
      3) This was working before the fpu changes, because it's a spurious
      math fault which doesn't corrupt any fpu/sse registers and the task's
      math state was always in an allocated state.
      
      Without the recent lazy fpu allocation changes, while we don't see an oops,
      there is a possible race still present in older kernels (for example,
      while the kernel is using kernel_fpu_begin() in some optimized clear/copy
      page and an interrupt/softirq happens which uses these padlock
      instructions, generating a DNA fault).
      
      This is the failing scenario that existed even before the lazy fpu allocation
      changes:
      
      0. CPU's TS flag is set
      
      1. The kernel uses the FPU in some optimized copy routine and, while in
      kernel_fpu_begin(), takes an interrupt just before doing clts()
      
      2. In that interrupt, ipsec uses a padlock instruction. And we
      take a DNA fault as the TS flag is still set.
      
      3. We handle the DNA fault and set TS_USEDFPU and clear cr0.ts
      
      4. We complete the padlock routine
      
      5. Go back to step-1, which resumes clts() in kernel_fpu_begin(), finishes
      the optimized copy routine and does kernel_fpu_end(). At this point,
      we have cr0.ts again set to '1' but the task's TS_USEDFPU is still
      set and not cleared.
      
      6. Now the kernel resumes its user operation. And at the next context
      switch, the kernel sees it has to do an FP save as TS_USEDFPU is still set
      and then will do an unlazy_fpu() in __switch_to(). unlazy_fpu()
      will take a DNA fault, as cr0.ts is '1' and now, because we are
      in __switch_to(), math_state_restore() will get confused and will
      restore the next task's FP state and will save it in the prev task's FP state.
      Remember, in __switch_to() we are already on the stack of the next task
      but take a DNA fault for the prev task.
      
      This causes the fpu leakage.
      
      Fix the padlock instruction usage by calling the instructions inside the
      context of new routines irq_ts_save/restore(), which clear/restore cr0.ts
      manually in the interrupt context. This will not generate a spurious DNA
      fault in the context of the interrupt, which will fix the oops encountered
      and the possible FPU leakage issue.
      Reported-and-bisected-by: Wolfgang Walter <wolfgang.walter@stwm.de>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
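The exec-time sequence a) through e) and the effect of wrapping the padlock instructions in irq_ts_save/restore() can be traced with a small userspace model. This is only a sketch under stated assumptions: `struct task_model`, `struct cpu_model`, and every `model_*` function are hypothetical stand-ins invented for illustration, not kernel code. They mimic only how cr0.ts, TS_USEDFPU, and the xstate pointer interact as described in the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; none of these types exist in the kernel. */
struct task_model {
    bool ts_usedfpu;   /* stands in for the task's TS_USEDFPU flag */
    void *xstate;      /* stands in for the lazily allocated FPU state */
};

struct cpu_model {
    bool cr0_ts;       /* stands in for the cr0.ts bit */
};

/* A padlock instruction takes a DNA fault when cr0.ts is set; the fault
 * handler marks the current task as an FPU user and clears cr0.ts. */
static void model_padlock_insn(struct cpu_model *cpu, struct task_model *cur)
{
    if (cpu->cr0_ts) {
        cur->ts_usedfpu = true;
        cpu->cr0_ts = false;
    }
}

/* Modelled on the description of irq_ts_save(): clear cr0.ts by hand
 * and remember whether it was set. */
static bool model_irq_ts_save(struct cpu_model *cpu)
{
    bool was_set = cpu->cr0_ts;
    cpu->cr0_ts = false;
    return was_set;
}

/* Modelled on the description of irq_ts_restore(): put cr0.ts back. */
static void model_irq_ts_restore(struct cpu_model *cpu, bool was_set)
{
    cpu->cr0_ts = was_set;
}

/* Steps a) to e): returns true if the task ends up in the state that
 * oopses at the next context switch (TS_USEDFPU set, xstate NULL). */
static bool model_exec_race(bool use_irq_ts_save)
{
    static char fpu_area[64];
    struct cpu_model cpu = { .cr0_ts = true };
    /* a), b) mid-exec: the task's fpu flags have already been cleared */
    struct task_model task = { .ts_usedfpu = false, .xstate = fpu_area };

    /* c) an interrupt/softirq runs padlock-based encryption */
    if (use_irq_ts_save) {
        bool ts = model_irq_ts_save(&cpu);
        model_padlock_insn(&cpu, &task);
        model_irq_ts_restore(&cpu, ts);
    } else {
        model_padlock_insn(&cpu, &task);
    }

    /* d) the exec path frees the xstate without touching TS_USEDFPU */
    task.xstate = NULL;

    /* e) inconsistent state seen by the next context switch */
    return task.ts_usedfpu && task.xstate == NULL;
}
```

In this model, `model_exec_race(false)` reaches the inconsistent state from step e), while `model_exec_race(true)` does not, because the padlock instruction never sees cr0.ts set and so never triggers the fault path.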
    • PCI: Limit VPD length for Broadcom 5708S · f8e6fcea
      Dean Hildebrand authored
      commit 35405f25 upstream
      
      BCM5706S won't work correctly unless the VPD length is truncated to 128
      Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • CIFS: properly account for new user= field in SPNEGO upcall string allocation · bd69c736
      Jeff Layton authored
      commit 66b8bd3c upstream
      
      [CIFS] properly account for new user= field in SPNEGO upcall string allocation
      
      ...it doesn't look like it's being accounted for at the moment. Also
      try to reorganize the calculation to make it a little more evident
      what each piece means.
      
      This should probably go to the stable series as well...
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • usb-storage: automatically recognize bad residues · 978e8c75
      Alan Stern authored
      commit 59f4ff2e upstream
      
      This patch (as1119b) will help to reduce the clutter of usb-storage's
      unusual_devs file by automatically detecting some devices that need
      the IGNORE_RESIDUE flag.  The idea is that devices should never return
      a non-zero residue for an INQUIRY or a READ CAPACITY command unless
      they failed to transfer all the requested data.  So if one of these
      commands transfers a standard amount of data but there is a positive
      residue, we know that the residue is bogus and we can set the flag.
      
      This fixes the problems reported in Bugzilla #11125.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Tested-by: Matthew Frost <artusemrys@sbcglobal.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
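The residue heuristic described in this commit can be sketched as a single predicate. This is an illustrative sketch, not the usb-storage code: `residue_is_bogus()` and its parameters are hypothetical names, and reading "a standard amount of data" as 36 bytes for INQUIRY and 8 bytes for READ CAPACITY is an assumption based on the usual SCSI response sizes rather than something the commit message states.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCSI_INQUIRY        0x12   /* SCSI INQUIRY opcode */
#define SCSI_READ_CAPACITY  0x25   /* SCSI READ CAPACITY (10) opcode */

/* Hypothetical helper: decide whether a reported residue is bogus.
 * Assumption: the "standard" lengths are 36 bytes for INQUIRY and
 * 8 bytes for READ CAPACITY. */
static bool residue_is_bogus(uint8_t opcode, unsigned int requested,
                             unsigned int transferred, unsigned int residue)
{
    bool standard_len =
        (opcode == SCSI_INQUIRY && requested == 36) ||
        (opcode == SCSI_READ_CAPACITY && requested == 8);

    /* All requested data arrived, yet the device claims a shortfall. */
    return standard_len && transferred == requested && residue > 0;
}
```

A caller would set its equivalent of the IGNORE_RESIDUE flag for the device once this predicate returns true, and leave genuinely short transfers alone.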
    • usb-storage: revert DMA-alignment change for Wireless USB · b0d87b93
      Alan Stern authored
      commit f756cbd4 upstream.
      
      This patch (as1110) reverts an earlier patch meant to help with
      Wireless USB host controllers.  These controllers can have bulk
      maxpacket values larger than 512, which puts unusual constraints on
      the sizes of scatter-gather list elements.  However it turns out that
      the block layer does not provide the support we need to enforce these
      constraints; merely changing the DMA alignment mask doesn't help.
      Hence there's no reason to keep the original patch.  The Wireless USB
      problem will have to be solved a different way.
      
      In addition, there is a reason to get rid of the earlier patch.  By
      dereferencing a pointer stored in the ep_in array of struct
      usb_device, the current code risks an invalid memory access when it
      runs concurrently with device removal.  The members of that array are
      cleared before the driver's disconnect method is called, so it should
      not try to use them.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • qla2xxx: Set an rport's dev_loss_tmo value in a consistent manner. · ac2ef120
      Andrew Vasquez authored
      [ Upstream commit 85821c90 ]
      
      As there's no point in adding a fixed-fudge value (originally 5
      seconds), honor the user settings only.  We also remove the
      driver's dead-callback get_rport_dev_loss_tmo function
      (qla2x00_get_rport_loss_tmo()).
      Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
      Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • Seokmann Ju's avatar
    • x86: fix setup code crashes on my old 486 box · 66b53f4d
      Joerg Roedel authored
      commit 7b27718b upstream
      
      Yesterday I tried to reactivate my old 486 box and wanted to install a
      current Linux with the latest kernel on it. But it turned out that the
      latest kernel does not boot because the machine crashes early in the
      setup code.
      
      After some debugging it turned out that the problem is the query_ist()
      function. If the interrupt is called with that function, the machine
      simply locks up. It looks like a BIOS bug. Looking for a workaround for
      this problem I wrote the attached patch. It checks for the CPUID
      instruction and if it is not implemented it does not call the speedstep
      BIOS function. As far as I know speedstep should be available since some
      Pentium at the earliest.
      
      Alan Cox observed that it's available since the Pentium II, so cpuid
      levels 4 and 5 can be excluded altogether.
      
      H. Peter Anvin cleaned up the code some more:
      
      > Right in concept, but I dislike the implementation (duplication of the
      > CPU detect code we already have).  Could you try this patch and see if
      > it works for you?
      
      With that patch and a small modification to fix a build error, the
      resulting kernel boots on my machine.
      Signed-off-by: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
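The guard this commit describes amounts to skipping the speedstep BIOS call on CPUs that are too old to support it. The following is a minimal sketch, not the actual setup code: `should_query_ist()` and its `cpu_level` parameter are hypothetical, and the cutoff of 6 is an assumption drawn from the remark that cpuid levels 4 and 5 can be excluded.

```c
#include <stdbool.h>

/* Hypothetical guard: cpu_level stands in for the CPU level detected
 * by the boot code (4 = 486, 5 = Pentium, 6 = Pentium II class).
 * Assumption: anything below level 6 cannot have speedstep, so the
 * BIOS call that locks up the old machine is never attempted there. */
static bool should_query_ist(int cpu_level)
{
    return cpu_level >= 6;
}
```

Reusing a CPU level that the boot code has already detected, rather than probing for CPUID again, matches the objection quoted above about duplicating the CPU detect code.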
    • x86: fix spin_is_contended() · a6582aa4
      Jan Beulich authored
      commit 7bc069c6 upstream
      
      The masked difference is what needs to be compared against 1, rather
      than the difference of masked values (which can be negative).
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Acked-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
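The difference between "the difference of masked values" and "the masked difference" shows up when the 8-bit ticket counter wraps. The sketch below assumes a ticket-lock word whose low byte is the ticket currently being served and whose next byte is the next ticket to hand out; the function names `contended_buggy()` and `contended_fixed()` are hypothetical and the code is a userspace illustration, not the kernel's implementation.

```c
#include <stdbool.h>

/* Assumed layout of the lock word: bits 0-7 = ticket being served,
 * bits 8-15 = next ticket to hand out. */

/* Difference of masked values: goes negative once "next" has wrapped
 * past 0xff while "serving" has not. */
static bool contended_buggy(unsigned int slock)
{
    int diff = (int)((slock >> 8) & 0xff) - (int)(slock & 0xff);
    return diff > 1;
}

/* Masked difference: subtract first, then reduce modulo 256, so the
 * wrap-around cancels out. */
static bool contended_fixed(unsigned int slock)
{
    return (((slock >> 8) - slock) & 0xff) > 1;
}
```

With serving = 0xff and next = 0x01 (lock word 0x01ff), two tickets are outstanding, so the lock is held with one waiter. The first form computes 1 - 255 = -254 and reports no contention, while the second computes 2 and reports contention.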
    • sparc64: Handle stack trace attempts before irqstacks are setup. · f04a00a3
      David S. Miller authored
      [ Upstream commit 6f63e781 ]
      
      Things like lockdep can try to do stack backtraces before
      the irqstack blocks have been set up.  So don't try to match
      their ranges so early on.
      
      Also, remove unused variable in save_stack_trace().
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • sparc64: Implement IRQ stacks. · 0c509d5a
      David S. Miller authored
      [ Upstream commit 4f70f7a9 ]
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • sparc64: Make global reg dumping even more useful. · 664d412a
      David S. Miller authored
      [ Upstream commit 5afe2738 ]
      
      Record one more level of stack frame program counter.
      
      Particularly when lockdep and all sorts of spinlock debugging are
      enabled, figuring out the caller of spin_lock() is difficult when the
      cpu is stuck on the lock.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • sparc64: Fix recursion in stack overflow detection handling. · fbcd513d
      David S. Miller authored
      [ Upstream commit c7498081 ]
      
      The calls down into prom_printf() when we detect an overflowed stack
      can recurse again since the overflow stack will be "below" the current
      kernel stack limit.
      
      Prevent this by just returning straight if we are on the stack
      overflow safe stack already.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • sparc64: Fix end-of-stack checking in save_stack_trace(). · f8bb164b
      David S. Miller authored
      [ Upstream commit 433c5f70 ]
      
      Bug reported by Alexander Beregalov.
      
      Before we dereference the stack frame or try to peek at the
      pt_regs magic value, make sure the entire object is within
      the kernel stack bounds.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
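The rule stated in this commit, checking that the entire object lies within the stack before touching it, can be sketched as a range check on the whole object rather than on its start address alone. `object_in_stack()` and its parameters are hypothetical names for illustration; this is a userspace sketch, not the sparc64 save_stack_trace() code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true only if [addr, addr + size) lies entirely
 * inside the stack region [base, base + stack_size). Written so that
 * no intermediate sum can overflow. */
static bool object_in_stack(uintptr_t base, size_t stack_size,
                            uintptr_t addr, size_t size)
{
    if (addr < base)
        return false;          /* starts below the stack */
    if (size > stack_size)
        return false;          /* cannot fit at all */
    return addr - base <= stack_size - size;
}
```

A stack walker following this rule would call the check with the size of the stack frame, and again with the size of the pt_regs structure, before reading any field from either.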