1. 16 Nov, 2007 34 commits
    • revert "x86_64: allocate sparsemem memmap above 4G" · edc0636c
      Linus Torvalds authored
      Reverted upstream by commit 6a22c57b
      
      Revert this commit:
      
      	commit 2e1c49db
      	Author: Zou Nan hai <nanhai.zou@intel.com>
      	Date:   Fri Jun 1 00:46:28 2007 -0700
      	
      	x86_64: allocate sparsemem memmap above 4G
      
      This reverts commit 2e1c49db.
      
      First off, testing in Fedora has shown it to cause boot failures,
      bisected down by Martin Ebourne, and reported by Dave Jones.  So the
      commit will likely be reverted in the 2.6.23 stable kernels.
      
      Secondly, in the 2.6.24 model, x86-64 has now grown support for
      SPARSEMEM_VMEMMAP, which disables the relevant code anyway, so while the
      bug is not visible any more, it's become invisible due to the code just
      being irrelevant and no longer enabled on the only architecture that
      this ever affected.
      Reported-by: Dave Jones <davej@redhat.com>
      Tested-by: Martin Ebourne <fedora@ebourne.me.uk>
      Cc: Zou Nan hai <nanhai.zou@intel.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Andy Whitcroft <apw@shadowen.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Chuck Ebbert <cebbert@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • x86: fix TSC clock source calibration error · 963bbb3b
      Dave Johnson authored
      patch edaf420f in mainline.
      
      I ran into this problem on a system that was unable to obtain NTP sync
      because the clock was running very slow (over 10000ppm slow). ntpd had
      declared all of its peers 'reject' with 'peer_dist' reason.
      
      On investigation, the tsc_khz variable was significantly incorrect
      causing xtime to run slow.  After a reboot tsc_khz was correct so I
      did a reboot test to see how often the problem occurred:
      
      Test was done on a 2000 MHz Xeon system.  Of 689 reboots, 8 of them
      had unacceptable tsc_khz values (>500ppm):
      
       range of tsc_khz  # of boots  % of boots
       ----------------  ----------  ----------
              < 1999750           0      0.000%
      1999750 - 1999800          21      3.048%
      1999800 - 1999850         166     24.128%
      1999850 - 1999900         241     35.029%
      1999900 - 1999950         211     30.669%
      1999950 - 2000000          42      6.105%
      2000000 - 2000050           0      0.000%
      2000050 - 2000100           0      0.000%
                         [...]
      2000100 - 2015000           1      0.145%  << BAD
      2015000 - 2030000           6      0.872%  << BAD
      2030000 - 2045000           1      0.145%  << BAD
      2045000 <                   0      0.000%
      
      The worst boot was 2032.577 MHz, over 1.5% off!
      
      It appears that on rare occasions, mach_countup() is taking longer to
      complete than necessary.
      
      I suspect that this is caused by the CPU taking a periodic SMI
      interrupt right at the end of the 30ms calibration loop.  This would
      cause the loop to delay while the SMI BIOS handler runs. The resulting
      TSC value is beyond what it actually should be resulting in a higher
      tsc_khz.
      
      The below patch makes native_calculate_cpu_khz() take the best
      (shortest duration, lowest khz) run of its 3 calibration loops.  If a
      SMI goes off causing a bad result (long duration, higher khz) it will
      be discarded.
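
      The selection rule can be sketched in plain C.  This is only an
      illustration, not the code from the patch: best_tsc_delta() and
      tsc_delta_to_khz() are hypothetical helper names, and the 30000 us
      window is taken from the 30ms loop mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the TSC deltas measured over several
 * fixed-length calibration windows, return the smallest one.  An SMI
 * that lands at the end of a window can only make the delta larger,
 * so the minimum is the least-disturbed sample.
 */
static uint64_t best_tsc_delta(const uint64_t *delta, size_t n)
{
	uint64_t best = UINT64_MAX;

	for (size_t i = 0; i < n; i++)
		if (delta[i] < best)
			best = delta[i];
	return best;
}

/* Convert a TSC delta measured over window_us microseconds to kHz. */
static uint64_t tsc_delta_to_khz(uint64_t delta, uint64_t window_us)
{
	return delta * 1000 / window_us;
}
```

      With illustrative deltas of 59997000, 60977310 and 59996500 cycles,
      the second run alone would give 2032577 kHz, while the minimum gives
      1999883 kHz.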
      
      With the patch applied, 300 boots of the same system produce good
      results:
      
       range of tsc_khz  # of boots  % of boots
       ----------------  ----------  ----------
              < 1999750           0      0.000%
      1999750 - 1999800          30     10.000%
      1999800 - 1999850         166     55.333%
      1999850 - 1999900          89     29.667%
      1999900 - 1999950          15      5.000%
      1999950 <                   0      0.000%
      
      Problem was found and tested against 2.6.18.  Patch is against 2.6.22.
      Signed-off-by: Dave Johnson <djohnson@sw.starentnetworks.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • x86 setup: sizeof() is unsigned, unbreak comparisons · 2d49e888
      H. Peter Anvin authored
      patch e6e1ace9 in mainline.
      
      
      We use signed values for limit checking since the values can go
      negative under certain circumstances.  However, sizeof() is unsigned
      and forces the comparison to be unsigned, so move the comparison into
      the heap_free() macros so we can ensure it is a signed comparison.
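
      The pitfall is easy to reproduce outside the kernel.  A minimal
      sketch, with a hypothetical struct rec and helper names that are not
      taken from the setup code:

```c
#include <stdbool.h>
#include <stddef.h>

struct rec { char data[16]; };	/* hypothetical 16-byte record */

/* Buggy: avail is converted to size_t, so -1 becomes a huge value. */
static bool fits_unsigned(int avail)
{
	return avail >= sizeof(struct rec);
}

/* Signed: cast sizeof() so that a negative avail stays negative. */
static bool fits_signed(int avail)
{
	return avail >= (int)sizeof(struct rec);
}
```

      fits_unsigned(-1) reports that the record fits, while fits_signed(-1)
      does not.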
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • x86 setup: handle boot loaders which set up the stack incorrectly · 430bb2ee
      H. Peter Anvin authored
      patch 6b6815c6 in mainline.
      
      Apparently some specific versions of LILO enter the kernel with a
      stack pointer that doesn't match the rest of the segments.  Make our
      best attempt at untangling the resulting mess.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • x86: fix global_flush_tlb() bug · 4b69ffe3
      Ingo Molnar authored
      patch 9a24d04a upstream
      
      While we were reviewing pageattr_32/64.c for unification,
      Thomas Gleixner noticed the following serious SMP bug in
      global_flush_tlb():
      
      	down_read(&init_mm.mmap_sem);
      	list_replace_init(&deferred_pages, &l);
      	up_read(&init_mm.mmap_sem);
      
      this is SMP-unsafe because list_replace_init() done on two CPUs in
      parallel can corrupt the list.
      
      This bug has been introduced about a year ago in the 64-bit tree:
      
             commit ea7322de
             Author: Andi Kleen <ak@suse.de>
             Date:   Thu Dec 7 02:14:05 2006 +0100
      
             [PATCH] x86-64: Speed and clean up cache flushing in change_page_attr
      
                      down_read(&init_mm.mmap_sem);
              -       dpage = xchg(&deferred_pages, NULL);
              +       list_replace_init(&deferred_pages, &l);
                      up_read(&init_mm.mmap_sem);
      
      the xchg() based version was SMP-safe, but list_replace_init() is not.
      So this "cleanup" introduced a nasty bug.
      
      why this bug never became prominent is a mystery - it can probably be
      explained with the (still) relative obscurity of the x86_64 architecture.
      
      the safe fix for now is to write-lock init_mm.mmap_sem.
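
      To see why a read lock is not enough, here is a userspace
      re-implementation of the list helpers, written for illustration only
      and not copied from include/linux/list.h.  Unlike a single xchg(),
      list_replace_init() is a series of separate pointer stores, and two
      CPUs holding the read lock can interleave them.

```c
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Move every entry from old to repl, then reset old to empty. */
static void list_replace_init(struct list_head *old, struct list_head *repl)
{
	repl->next = old->next;		/* store 1 */
	repl->next->prev = repl;	/* store 2 */
	repl->prev = old->prev;		/* store 3 */
	repl->prev->next = repl;	/* store 4 */
	init_list_head(old);		/* stores 5 and 6 */
}
```

      Run single-threaded, the helper moves the whole list to the private
      head; the sketch does not model the race itself.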
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • xfs: eagerly remove vmap mappings to avoid upsetting Xen · ba4312eb
      Jeremy Fitzhardinge authored
      patch ace2e92e in mainline.
      
      XFS leaves stray mappings around when it vmaps memory to make it
      virtually contiguous.  This upsets Xen if one of those pages is being
      recycled into a pagetable, since it finds an extra writable mapping of
      the page.
      
      This patch solves the problem in a brute force way, by making XFS
      always eagerly unmap its mappings.
      
      [ Stable: This works around a bug in 2.6.23.  We may come up with a
      better solution for mainline, but this seems like a low-impact fix for
      the stable kernel. ]
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: XFS masters <xfs-masters@oss.sgi.com>
      Cc: Morten Bøgeskov <xen-users@morten.bogeskov.dk>
      Cc: Mark Williamson <mark.williamson@cl.cam.ac.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • xen: fix incorrect vcpu_register_vcpu_info hypercall argument · 418db154
      Jeremy Fitzhardinge authored
      patch e3d26976 in mainline.
      
      The kernel's copy of struct vcpu_register_vcpu_info was out of date,
      at best causing the hypercall to fail and the guest kernel to fall
      back to the old mechanism, or worse, causing random memory corruption.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Stable Kernel <stable@kernel.org>
      Cc: Morten Bøgeskov <xen-users@morten.bogeskov.dk>
      Cc: Mark Williamson <mark.williamson@cl.cam.ac.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • xen: deal with stale cr3 values when unpinning pagetables · edf06ad7
      Jeremy Fitzhardinge authored
      patch 9f79991d in mainline.
      
      When a pagetable is no longer in use, it must be unpinned so that its
      pages can be freed.  However, this is only possible if there are no
      stray uses of the pagetable.  The code currently deals with all the
      usual cases, but there's a rare case where a vcpu is changing cr3, but
      is doing so lazily, and the change hasn't actually happened by the time
      the pagetable is unpinned, even though it appears to have been completed.
      
      This change adds a second per-cpu cr3 variable - xen_current_cr3 -
      which tracks the actual state of the vcpu cr3.  It is only updated once
      the actual hypercall to set cr3 has been completed.  Other processors
      wishing to unpin a pagetable can check other vcpu's xen_current_cr3
      values to see if any cross-cpu IPIs are needed to clean things up.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • xen: add batch completion callbacks · 4fc04833
      Jeremy Fitzhardinge authored
      patch 91e0c5f3 in mainline.
      
      This adds a mechanism to register a callback function to be called once
      a batch of hypercalls has been issued.  This is typically used to unlock
      things which must remain locked until the hypercall has taken place.
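
      The general shape of such a mechanism might look like the sketch
      below.  All names (struct batch, batch_add_callback(), batch_flush(),
      MAX_CALLBACKS) are hypothetical and do not reflect the Xen multicall
      code; the point is only that callbacks run after the batch is issued.

```c
#include <stddef.h>

#define MAX_CALLBACKS 8		/* hypothetical limit */

struct batch_callback {
	void (*fn)(void *);
	void *data;
};

struct batch {
	struct batch_callback cb[MAX_CALLBACKS];
	size_t ncb;
};

/* Register fn(data) to run once the current batch has been issued. */
static int batch_add_callback(struct batch *b, void (*fn)(void *), void *data)
{
	if (b->ncb == MAX_CALLBACKS)
		return -1;
	b->cb[b->ncb].fn = fn;
	b->cb[b->ncb].data = data;
	b->ncb++;
	return 0;
}

/* Issue the batch (elided here), then run and clear the callbacks. */
static void batch_flush(struct batch *b)
{
	for (size_t i = 0; i < b->ncb; i++)
		b->cb[i].fn(b->cb[i].data);
	b->ncb = 0;
}

/* Example callback: clear a toy "locked" flag. */
static void unlock_flag(void *data)
{
	*(int *)data = 0;
}
```

      A caller would set its flag, register unlock_flag() with a pointer to
      it, and find the flag still set until batch_flush() has run.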
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • UML - kill subprocesses on exit · 6f8a6ffc
      Lepton Wu authored
      commit a24864a1
      
      uml: definitively kill subprocesses on panic
      
      In a stock 2.6.22.6 kernel, powering off a user mode linux guest (2.6.22.6
      running in skas0 mode) will halt the host linux.  I think the reason is that
      the kernel thread aborts because of a bug.  Then the sys_reboot in a process
      of the user mode linux guest is not trapped by the user mode linux kernel and
      is executed by the host.  I think it is better to make sure all of our child
      processes quit when the user mode linux kernel aborts.
      
      [ jdike - the kernel process needs to ignore SIGTERM, plus the waitpid/kill
      loop is needed to make sure that all of our children are dead before the
      kernel exits ]
      Signed-off-by: Lepton Wu <ytht.net@gmail.com>
      Signed-off-by: Jeff Dike <jdike@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • UML - stop using libc asm/user.h · 9e6707f3
      Jeff Dike authored
      commit 189872f9 in mainline.
      
      uml: don't use glibc asm/user.h
      
      Stop including asm/user.h from libc - it seems to be disappearing from
      distros.  It's replaced with sys/user.h which defines user_fpregs_struct and
      user_fpxregs_struct instead of user_i387_struct and struct user_fxsr_struct on
      i386.
      
      As a bonus, on x86_64, I get to dump some stupid typedefs which were needed in
      order to get asm/user.h to compile.
      Signed-off-by: Jeff Dike <jdike@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • UML - Fix kernel vs libc symbols clash · 63140953
      Jeff Dike authored
      commit 818f6ef4 in mainline.
      
      uml: fix an IPV6 libc vs kernel symbol clash
      
      On some systems, with IPV6 configured, there is a clash between the kernel's
      in6addr_any and the one in libc.
      
      This is handled in the usual (gross) way of defining the kernel symbol out of
      the way on the gcc command line.
      Signed-off-by: Jeff Dike <jdike@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • UML - Stop using libc asm/page.h · 90118e75
      Jeff Dike authored
      commit 71f926f2 in mainline.
      
      uml: stop using libc asm/page.h
      
      Remove includes of asm/page.h from libc code.  This header seems to be
      disappearing, and UML doesn't make much use of it anyway.
      
      The one use, PAGE_SHIFT in stub.h, is handled by copying the constant from the
      kernel side of the house in common_offsets.h.
      
      [ jdike - added arch/um/kernel/skas/clone.c for -stable ]
      Signed-off-by: Jeff Dike <jdike@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • POWERPC: Make sure to of_node_get() the result of pci_device_to_OF_node() · 5361fb20
      Michael Ellerman authored
      patch db220b23 in mainline.
      
      pci_device_to_OF_node() returns the device node attached to a PCI device,
      but doesn't actually grab a reference - we need to do it ourselves.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • POWERPC: Fix handling of stfiwx math emulation · 1d841b4f
      Kumar Gala authored
      patch ba02946a in mainline
      
      It's legal for the stfiwx instruction to have RA = 0 as part of its
      effective address calculation.  This is illegal for all other XE
      form instructions.
      
      Add code to compute the proper effective address for stfiwx if
      RA = 0 rather than treating it as illegal.
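
      For reference, the indexed effective address with the RA = 0 special
      case can be written as a small helper.  This is a sketch with a
      hypothetical name (xform_ea) and a plain array standing in for the
      register file, not the math emulation code itself:

```c
#include <stdint.h>

/*
 * Effective address for an indexed access: (RA|0) + (RB).
 * When the RA field is 0, the base is the constant 0, not GPR0.
 */
static uint64_t xform_ea(const uint64_t gpr[32], unsigned int ra,
			 unsigned int rb)
{
	uint64_t base = (ra == 0) ? 0 : gpr[ra];

	return base + gpr[rb];
}
```

      With GPR0 = 0x1000 and GPR4 = 0x10, an RA field of 0 and RB = 4
      yields 0x10 rather than 0x1010.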
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • MIPS: R1: Fix hazard barriers to make kernels work on R2 also. · 2f51141b
      Ralf Baechle authored
      patch 572afc24 in mainline.
      
      Tested with Malta; inflates malta_defconfig by 3932 bytes.  Ideally there
      should be additional configuration to allow getting rid of this overhead
      but that would be too much complexity at this stage of the release cycle.
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • MIPS: MT: Fix bug in multithreaded kernels. · e989c61a
      Ralf Baechle authored
      patch a76ab5c1 in mainline.
      
      When GDB writes a breakpoint into the address space of the inferior process,
      the kernel needs to invalidate the modified memory in the inferior, which
      is done by calling flush_cache_page, which in turn calls
      r4k_flush_cache_page and local_r4k_flush_cache_page for VSMP or SMTC
      kernels via r4k_on_each_cpu().

      As the VSMP and SMTC SMP kernels for 34K are running on a single shared
      caches it is possible to get away without interprocessor function calls.
      This optimization is implemented in r4k_on_each_cpu, so
      local_r4k_flush_cache_page is only ever called on the local CPU.
      
      This is where the following code in local_r4k_flush_cache_page() strikes:
      
              /*
               * If ownes no valid ASID yet, cannot possibly have gotten
               * this page into the cache.
               */
              if (cpu_context(smp_processor_id(), mm) == 0)
                      return;
      
      On VSMP and SMTC kernels, cpu_context() holds a separate value for each
      CPU (TC).

      So if the CPU executing local_r4k_flush_cache_page has not accessed the mm
      but one of the other CPUs has, there may be data to be flushed in the cache,
      yet local_r4k_flush_cache_page will falsely return, leaving the I-cache
      inconsistent for the breakpoint.
      
      While the issue was discovered with GDB it also exists in
      local_r4k_flush_cache_range() and local_r4k_flush_cache().
      
      Fixed by introducing a new function has_valid_asid which on MT kernels
      returns true if a mm is active on any processor in the system.
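
      The difference between the two checks can be sketched as follows.
      struct mm, NR_CPUS and both helper names are simplified stand-ins
      chosen for this illustration, not the kernel's definitions:

```c
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU (TC) count */

/* Simplified mm: one ASID slot per CPU, 0 meaning "no valid ASID". */
struct mm {
	unsigned long context[NR_CPUS];
};

/* Old check: only looks at the CPU running the flush. */
static bool has_valid_asid_local(const struct mm *mm, int this_cpu)
{
	return mm->context[this_cpu] != 0;
}

/* MT-style check: true if the mm is active on any CPU. */
static bool has_valid_asid_any(const struct mm *mm)
{
	for (int i = 0; i < NR_CPUS; i++)
		if (mm->context[i] != 0)
			return true;
	return false;
}
```

      For an mm that has only run on CPU 1, the local check on CPU 0 says
      there is nothing to flush, while the any-CPU check does not.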
      
      This is relatively expensive since for memory accesses in that loop
      cache misses have to be assumed but it seems the most viable solution
      for 2.6.23 and older -stable kernels.
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • Fix sparc64 MAP_FIXED handling of framebuffer mmaps · efd233a4
      Chris Wright authored
      patch d58aa8c7 in mainline.
      
      From: Chris Wright <chrisw@sous-sol.org>
      Date: Tue, 23 Oct 2007 20:36:14 -0700
      Subject: [PATCH] [SPARC64]: pass correct addr in get_fb_unmapped_area(MAP_FIXED)
      
      Looks like the MAP_FIXED case is using the wrong address hint.  I'd
      expect the comment "don't mess with it" means pass the request
      straight on through, not change the address requested to -ENOMEM.
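
      The intended behaviour, sketched with hypothetical helpers
      (get_area(), find_aligned_area() and the 4 MB FB_ALIGN are
      illustrative, not the sparc64 code): a MAP_FIXED request returns the
      caller's address untouched, and only other requests go through a
      search.

```c
#include <sys/mman.h>	/* MAP_FIXED */

#define FB_ALIGN 0x400000UL	/* hypothetical 4 MB alignment */

/* Stand-in for a real free-area search: round the hint up. */
static unsigned long find_aligned_area(unsigned long hint)
{
	return (hint + FB_ALIGN - 1) & ~(FB_ALIGN - 1);
}

static unsigned long get_area(unsigned long addr, unsigned long flags)
{
	if (flags & MAP_FIXED)
		return addr;	/* don't mess with it */
	return find_aligned_area(addr);
}
```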
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • Fix sparc64 niagara optimized RAID xor asm · 71ec6448
      David Miller authored
      patch d060db63 in mainline.
      
      [SPARC64]: Fix register usage in xor_raid_4().
      
      Some typos led to using %i6/%i7 instead of %l6/%l7 in loads which is
      really really bad because those are the frame pointer and return PC.
      
      Based upon a raid5 crash report by Bertrand Joel.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • Linux 2.6.23.2 · 553e6a1a
      Greg Kroah-Hartman authored
    • BLOCK: Fix bad sharing of tag busy list on queues with shared tag maps · 0520fb16
      Jens Axboe authored
      patch 6eca9004 in mainline.
      
      For the locking to work, only the tag map and tag bit map may be shared
      (incidentally, I was just explaining this to Nick yesterday, but I
      apparently didn't review the code well enough myself). But we also share
      the busy list!  The busy_list must be queue private, or we need a
      block_queue_tag covering lock as well.
      
      So we have to move the busy_list to the queue. This'll work fine, and
      it'll actually also fix a problem with blk_queue_invalidate_tags() which
      will invalidate tags across all shared queues. This is a bit confusing,
      the low level driver should call it for each queue separately since
      otherwise you cannot kill tags on just a single queue for eg a hard
      drive that stops responding. Since the function has no callers
      currently, it's not an issue.
      
      This is fixed with commit 6eca9004 in
      Linus' tree.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • fix tmpfs BUG and AOP_WRITEPAGE_ACTIVATE · bba9d994
      Hugh Dickins authored
      patch 487e9bf2 in mainline.
      
      It's possible to provoke unionfs (not yet in mainline, though in mm and
      some distros) to hit shmem_writepage's BUG_ON(page_mapped(page)).  I expect
      it's possible to provoke the 2.6.23 ecryptfs in the same way (but the
      2.6.24 ecryptfs no longer calls lower level's ->writepage).
      
      This came to light with the recent find that AOP_WRITEPAGE_ACTIVATE could
      leak from tmpfs via write_cache_pages and unionfs to userspace.  There's
      already a fix (e4230030 - writeback: don't
      propagate AOP_WRITEPAGE_ACTIVATE) in the tree for that, and it's okay so
      far as it goes; but insufficient because it doesn't address the underlying
      issue, that shmem_writepage expects to be called only by vmscan (relying on
      backing_dev_info capabilities to prevent the normal writeback path from
      ever approaching it).
      
      That's an increasingly fragile assumption, and ramdisk_writepage (the other
      source of AOP_WRITEPAGE_ACTIVATEs) is already careful to check
      wbc->for_reclaim before returning it.  Make the same check in
      shmem_writepage, thereby sidestepping the page_mapped BUG also.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Cc: Erez Zadok <ezk@cs.sunysb.edu>
      Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • Fix compat futex hangs. · 59ddd460
      David Miller authored
      [FUTEX]: Fix address computation in compat code.
      
      [ Upstream commit: 3c5fd9c7 ]
      
      compat_exit_robust_list() computes a pointer to the
      futex entry in userspace as follows:
      
      	(void __user *)entry + futex_offset
      
      'entry' is a 'struct robust_list __user *', and
      'futex_offset' is a 'compat_long_t' (typically a 's32').
      
      Things explode if the 32-bit sign bit is set in futex_offset.
      
      Type promotion sign extends futex_offset to a 64-bit value before
      adding it to 'entry'.
      
      This triggered a problem on sparc64 running 32-bit applications which
      would lock up a cpu looping forever in the fault handling for the
      userspace load in handle_futex_death().
      
      Compat userspace runs with address masking (wherein the cpu zeros out
      the top 32-bits of every effective address given to a memory operation
      instruction) so the sparc64 fault handler accounts for this by
      zero'ing out the top 32-bits of the fault address too.
      
      Since the kernel properly uses the compat_uptr interfaces, kernel side
      accesses to compat userspace work too since they will only use
      addresses with the top 32-bit clear.
      
      Because of this compat futex layer bug we get into the following loop
      when executing the get_user() load near the top of handle_futex_death():
      
      1) load from address '0xfffffffff7f16bd8', FAULT
      2) fault handler clears upper 32-bits, processes fault
         for address '0xf7f16bd8' which succeeds
      3) goto #1
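
      The faulting address in step 1 can be reproduced with plain integer
      arithmetic.  In this sketch the helper names are hypothetical and
      user pointers are modelled as integers; the 32-bit variant is one way
      to avoid the sign extension, not necessarily the exact upstream fix.

```c
#include <stdint.h>

/* Buggy form: the s32 offset is sign-extended to 64 bits before the add. */
static uint64_t futex_uaddr_buggy(uint64_t entry, int32_t futex_offset)
{
	return entry + (int64_t)futex_offset;
}

/* Compat form: add in 32 bits, then zero-extend the result. */
static uint64_t futex_uaddr_compat(uint32_t entry, int32_t futex_offset)
{
	return (uint32_t)(entry + (uint32_t)futex_offset);
}
```

      With entry 0x10 and futex_offset -0x080e9438 (bit pattern
      0xf7f16bc8), the buggy form yields 0xfffffffff7f16bd8 and the compat
      form yields 0xf7f16bd8.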
      
      I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
      for their tireless efforts helping me track down this bug.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • sched: keep utime/stime monotonic · e823c33c
      Frans Pop authored
      sched: keep utime/stime monotonic
      
      cpustats use utime/stime as a ratio against sum_exec_runtime, as a
      consequence it can happen - when the ratio changes faster than time
      accumulates - that either can appear to go backwards.
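
      A minimal sketch of the idea, assuming a per-task field that
      remembers the last reported value (struct task_times and
      report_utime() are hypothetical names, not the scheduler code):

```c
#include <stdint.h>

struct task_times {
	uint64_t prev_utime;	/* largest utime reported so far */
};

/*
 * Scale sum_exec_runtime by the utime/(utime + stime) ratio, then
 * clamp so the reported value never goes backwards.
 */
static uint64_t report_utime(struct task_times *t, uint64_t utime,
			     uint64_t stime, uint64_t sum_exec_runtime)
{
	uint64_t total = utime + stime;
	uint64_t scaled = total ? sum_exec_runtime * utime / total
				: sum_exec_runtime;

	if (scaled > t->prev_utime)
		t->prev_utime = scaled;
	return t->prev_utime;
}
```

      If the ratio drops from 50/100 to 50/200 while the runtime only grows
      from 1000 to 1100, the raw scaled value falls from 500 to 275, but
      the reported value stays at 500.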
      
      Combined backport for 2.6.23 of the following patches from mainline:
      commit 73a2bcb0
      Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
        sched: keep utime/stime monotonic
      
      commit 9301899b
      Author: Balbir Singh <balbir@linux.vnet.ibm.com>
        sched: fix /proc/<PID>/stat stime/utime monotonicity, part 2
      Signed-off-by: Frans Pop <elendil@planet.nl>
      CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
      CC: Balbir Singh <balbir@linux.vnet.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • fix the softlockup watchdog to actually work · 436e61d9
      Ingo Molnar authored
      patch a115d5ca in mainline.
      
      this Xen related commit:
      
         commit 966812dc
         Author: Jeremy Fitzhardinge <jeremy@goop.org>
         Date:   Tue May 8 00:28:02 2007 -0700
      
             Ignore stolen time in the softlockup watchdog
      
      broke the softlockup watchdog to never report any lockups. (!)
      
      print_timestamp defaults to 0, this makes the following condition
      always true:
      
      	if (print_timestamp < (touch_timestamp + 1) ||
      
      and we'll in essence never report soft lockups.
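
      The effect of the zero default is easy to check in isolation.  In
      this sketch, skip_report_buggy() mirrors the first half of the
      condition quoted above; skip_report_windowed() is an assumed repair
      shown only for contrast and may differ from the actual fix.

```c
#include <stdbool.h>

/* Mirrors the quoted test: true means "do not report". */
static bool skip_report_buggy(unsigned long print_ts, unsigned long touch_ts)
{
	return print_ts < touch_ts + 1;
}

/*
 * Assumed repair: only skip when a report was already printed for
 * this touch_ts, i.e. print_ts lies in [touch_ts, touch_ts + 1).
 */
static bool skip_report_windowed(unsigned long print_ts,
				 unsigned long touch_ts)
{
	return print_ts >= touch_ts && print_ts < touch_ts + 1;
}
```

      With print_ts still 0 and touch_ts at 100, the buggy test skips the
      report while the windowed test lets it through.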
      
      apparently the functionality of the soft lockup watchdog was never
      actually tested with that patch applied ...
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • splice: fix double kunmap() in vmsplice copy path · 4d03fda8
      Jens Axboe authored
      patch 6866bef4 in mainline.
      
      The out label should not include the unmap, the only way to jump
      there already has unmapped the source.
      
      00002000
             f7c21a00 00000000 00000000 c0489036 00018e32 00000002 00000000
      00001000
      Call Trace:
       [<c0487dd9>] pipe_to_user+0xca/0xd3
       [<c0488233>] __splice_from_pipe+0x53/0x1bd
       [<c0454947>] ------------[ cut here ]------------
      filemap_fault+0x221/0x380
       [<c0487d0f>] pipe_to_user+0x0/0xd3
       [<c0489036>] sys_vmsplice+0x3b7/0x422
       [<c045ec3f>] kernel BUG at mm/highmem.c:206!
      handle_mm_fault+0x4d5/0x8eb
       [<c041ed5b>] kmap_atomic+0x1c/0x20
       [<c045d33d>] unmap_vmas+0x3d1/0x584
       [<c045f717>] free_pgtables+0x90/0xa0
       [<c041d84b>] pgd_dtor+0x0/0x1
       [<c044d665>] audit_syscall_exit+0x2aa/0x2c6
       [<c0407817>] do_syscall_trace+0x124/0x169
       [<c0404df2>] syscall_call+0x7/0xb
       =======================
      Code: 2d 00 d0 5b 00 25 00 00 e0 ff 29 invalid opcode: 0000 [#1]
      c2 89 d0 c1 e8 0c 8b 14 85 a0 6c 7c c0 4a 85 d2 89 14 85 a0 6c 7c c0 74 07
      31 c9 4a 75 15 eb 04 <0f> 0b eb fe 31 c9 81 3d 78 38 6d c0 78 38 6d c0 0f
      95 c1 b0 01
      EIP: [<c045bbc3>] kunmap_high+0x51/0x8e SS:ESP 0068:f5960df0
      SMP
      Modules linked in: netconsole autofs4 hidp nfs lockd nfs_acl rfcomm l2cap
      bluetooth sunrpc ipv6 ib_iser rdma_cm ib_cm iw_cmib_sa ib_mad ib_core
      ib_addr iscsi_tcp libiscsi scsi_transport_iscsi dm_mirror dm_multipath
      dm_mod video output sbs batteryac parport_pc lp parport sg i2c_piix4
      i2c_core floppy cfi_probe gen_probe scb2_flash mtd chipreg tg3 e1000 button
      ide_cd serio_raw cdrom aic7xxx scsi_transport_spi sd_mod scsi_mod ext3 jbd
      ehci_hcd ohci_hcd uhci_hcd
      CPU:    3
      EIP:    0060:[<c045bbc3>]    Not tainted VLI
      EFLAGS: 00010246   (2.6.23 #1)
      EIP is at kunmap_high+0x51/0x8e
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • writeback: don't propagate AOP_WRITEPAGE_ACTIVATE · 2e25e433
      Andrew Morton authored
      patch e4230030 in mainline.
      
      This is a writeback-internal marker but we're propagating it all the way back
      to userspace!
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • SLUB: Fix memory leak by not reusing cpu_slab · f8b98ff9
      Christoph Lameter authored
      patch 05aa3450 in mainline.
      
      SLUB: Fix memory leak by not reusing cpu_slab
      
      Fix the memory leak that may occur when we attempt to reuse a cpu_slab
      that was allocated while we reenabled interrupts in order to be able to
      grow a slab cache. The per cpu freelist may contain objects and in that
      situation we may overwrite the per cpu freelist pointer, losing objects.
      This only occurs if we find that the concurrently allocated slab fits
      our allocation needs.
      
      If we simply always deactivate the slab then the freelist will be properly
      reintegrated and the memory leak will go away.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • HOWTO: update ja_JP/HOWTO with latest changes · 0ebc8ca8
      Tsugikazu Shibata authored
      patch 3b6662f1 upstream.
      
      Here is another sync patch of Documentation/ja_JP/HOWTO
      
      Japanese developers sent me some cosmetic changes, and this patch also
      follows these changes to HOWTO:
          Cross reference URL (sosdg.org/qiyong/lxr)
          known_regression explanations on kernel dev. process
      Signed-off-by: Tsugikazu Shibata <tshibata@ab.jp.nec.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • fix param_sysfs_builtin name length check · e8af293b
      Jan Kiszka authored
      patch 22800a28 in mainline.
      
      Commit faf8c714 caused a regression:
      parameter names longer than MAX_KBUILD_MODNAME will now be rejected,
      although we just need to keep the module name part that short.  This patch
      restores the old behaviour while still ensuring that memchr is never called
      with a length parameter larger than the total string length.
      Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
      Cc: Dave Young <hidave.darkstar@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Chuck Ebbert <cebbert@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      e8af293b
    • param_sysfs_builtin memchr argument fix · 2b5ee286
      Dave Young authored
      patch faf8c714 in mainline.
      
      If the memchr length argument is larger than strlen(kp->name), the result
      is unpredictable.
      
      It causes duplicate filenames in sysfs for "nousb".  The kernel
      warning messages are below:
      
      sysfs: duplicate filename 'usbcore' can not be created
      WARNING: at fs/sysfs/dir.c:416 sysfs_add_one()
       [<c01c4750>] sysfs_add_one+0xa0/0xe0
       [<c01c4ab8>] create_dir+0x48/0xb0
       [<c01c4b69>] sysfs_create_dir+0x29/0x50
       [<c024e0fb>] create_dir+0x1b/0x50
       [<c024e3b6>] kobject_add+0x46/0x150
       [<c024e2da>] kobject_init+0x3a/0x80
       [<c053b880>] kernel_param_sysfs_setup+0x50/0xb0
       [<c053b9ce>] param_sysfs_builtin+0xee/0x130
       [<c053ba33>] param_sysfs_init+0x23/0x60
       [<c024d062>] __next_cpu+0x12/0x20
       [<c052aa30>] kernel_init+0x0/0xb0
       [<c052aa30>] kernel_init+0x0/0xb0
       [<c052a856>] do_initcalls+0x46/0x1e0
       [<c01bdb12>] create_proc_entry+0x52/0x90
       [<c0158d4c>] register_irq_proc+0x9c/0xc0
       [<c01bda94>] proc_mkdir_mode+0x34/0x50
       [<c052aa30>] kernel_init+0x0/0xb0
       [<c052aa92>] kernel_init+0x62/0xb0
       [<c0104f83>] kernel_thread_helper+0x7/0x14
       =======================
      kobject_add failed for usbcore with -EEXIST, don't try to register things with the same name in the same directory.
       [<c024e466>] kobject_add+0xf6/0x150
       [<c053b880>] kernel_param_sysfs_setup+0x50/0xb0
       [<c053b9ce>] param_sysfs_builtin+0xee/0x130
       [<c053ba33>] param_sysfs_init+0x23/0x60
       [<c024d062>] __next_cpu+0x12/0x20
       [<c052aa30>] kernel_init+0x0/0xb0
       [<c052aa30>] kernel_init+0x0/0xb0
       [<c052a856>] do_initcalls+0x46/0x1e0
       [<c01bdb12>] create_proc_entry+0x52/0x90
       [<c0158d4c>] register_irq_proc+0x9c/0xc0
       [<c01bda94>] proc_mkdir_mode+0x34/0x50
       [<c052aa30>] kernel_init+0x0/0xb0
       [<c052aa92>] kernel_init+0x62/0xb0
       [<c0104f83>] kernel_thread_helper+0x7/0x14
       =======================
      Module 'usbcore' failed to be added to sysfs, error number -17
      The system will be unstable now.
      Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Chuck Ebbert <cebbert@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      2b5ee286
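The two param_sysfs_builtin fixes above both come down to bounding the memchr search by the real string length, while limiting only the module-name part. A minimal userspace sketch of that idea follows. It is not the kernel's code: the helper module_prefix_len() and the limit SKETCH_MAX_MODNAME (standing in for MAX_KBUILD_MODNAME) are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical limit standing in for the kernel's MAX_KBUILD_MODNAME. */
#define SKETCH_MAX_MODNAME 8

/*
 * Return the length of the module-name prefix in "module.param",
 * or 0 if no '.' is found inside the bounded search window.
 */
static size_t module_prefix_len(const char *name)
{
	size_t len = strlen(name);
	/* Never let memchr scan past the end of the string. */
	size_t window = len < SKETCH_MAX_MODNAME ? len : SKETCH_MAX_MODNAME;
	const char *dot = memchr(name, '.', window);

	return dot ? (size_t)(dot - name) : 0;
}
```

With this limit, module_prefix_len("usbcore.nousb") yields 7, a bare "nousb" yields 0 without reading beyond its terminator, and a long parameter name is still accepted as long as the module part before the dot is short.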
    • Remove broken ptrace() special-case code from file mapping · aead196b
      Linus Torvalds authored
      The kernel has for random historical reasons allowed ptrace() accesses
      to access (and insert) pages into the page cache above the size of the
      file.
      
      However, Nick broke that by mistake when doing the new fault handling in
      commit 54cb8821 ("mm: merge populate and
      nopage into fault (fixes nonlinear)").  The breakage caused a hang with
      gdb when trying to access the invalid page.
      
      The ptrace "feature" really isn't worth resurrecting, since it really is
      wrong both from a portability _and_ from an internal page cache validity
      standpoint.  So this removes those old broken remnants, and fixes the
      ptrace() hang in the process.
      
      Noticed and bisected by Duane Griffin, who also supplied a test-case
      (quoth Nick: "Well that's probably the best bug report I've ever had,
      thanks Duane!").
      
      Cc: Duane Griffin <duaneg@dghda.com>
      Acked-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      aead196b
    • locks: fix possible infinite loop in posix deadlock detection · f153577e
      J. Bruce Fields authored
      patch 97855b49 in mainline.
      
      It's currently possible to send posix_locks_deadlock() into an infinite
      loop (under the BKL).
      
      For now, fix this just by bailing out after a few iterations.  We may
      want to fix this in a way that better clarifies the semantics of
      deadlock detection.  But that will take more time, and this minimal fix
      is probably adequate for any realistic scenario, and is simple enough to
      be appropriate for applying to stable kernels now.
      
      Thanks to George Davis for reporting the problem.
      
      Cc: "George G. Davis" <gdavis@mvista.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      Acked-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      f153577e
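The "bail out after a few iterations" approach described in the locks fix above can be sketched as a bounded walk over a waits-for chain. This is an illustrative userspace model only, not the fs/locks.c code: struct owner, would_deadlock() and the cap SKETCH_MAX_ITERATIONS are hypothetical, and the cap value is an assumption since the commit message only says "a few iterations".

```c
#include <stddef.h>

/* Hypothetical cap; the commit message only says "a few iterations". */
#define SKETCH_MAX_ITERATIONS 10

/* Hypothetical model: each owner waits on at most one other owner. */
struct owner {
	struct owner *waits_for;	/* NULL when not blocked */
};

/*
 * Return 1 if "blocker" transitively waits on "caller", meaning that
 * letting "caller" block on "blocker" would deadlock.  Return 0 otherwise,
 * including when the walk is abandoned after the iteration cap.
 */
static int would_deadlock(const struct owner *caller,
			  const struct owner *blocker)
{
	int i = 0;

	while (blocker) {
		/* Give up rather than spin forever on an unrelated cycle. */
		if (i++ > SKETCH_MAX_ITERATIONS)
			return 0;
		if (blocker == caller)
			return 1;
		blocker = blocker->waits_for;
	}
	return 0;
}
```

The trade-off in this sketch is that a very long but genuine chain would go unreported once the cap is hit; in exchange, a cycle that does not include the caller can no longer trap the walk.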
    • lockdep: fix mismatched lockdep_depth/curr_chain_hash · e354b801
      Gregory Haskins authored
      patch 3aa416b0 in mainline.
      
      It is possible for the current->curr_chain_key to become inconsistent with the
      current index if the chain fails to validate.  The end result is that future
      lock_acquire() operations may inadvertently fail to find a hit in the cache
      resulting in a new node being added to the graph for every acquire.
      Signed-off-by: Gregory Haskins <ghaskins@novell.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: Chuck Ebbert <cebbert@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      e354b801
  2. 12 Oct, 2007 2 commits
  3. 09 Oct, 2007 4 commits