  1. 24 Aug, 2009 1 commit
    • Neil Horman authored · 9512a48b
      The user mode helper code has a race in it. call_usermodehelper_exec()
      takes an allocated subprocess_info structure and passes it to a
      workqueue; the workqueue handler then passes it to a kernel thread
      that it creates, after which it calls complete() to signal the caller
      of call_usermodehelper_exec() that the subprocess_info struct can be
      freed.
      
      But since the created thread still uses that structure, we can't call
      complete() from __call_usermodehelper(), which is where the kernel
      thread is created.  In the UMH_WAIT_EXEC case we need to call
      complete() from within the kernel thread instead, and must not touch
      subprocess_info afterward.  Tested successfully by me.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
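      A minimal user-space sketch of the ownership rule described above
      (a pthread analogy with invented names such as struct subinfo and
      helper_thread(); not the kernel kmod code): whichever thread uses the
      shared structure last is the one that signals completion, and the
      caller frees the structure only after the wait returns.

      /*
       * Sketch only.  Mirrors the rule in the fix above: signal the
       * completion from the thread that uses the structure last, and
       * never touch it again after signalling.
       */
      #include <pthread.h>
      #include <stdio.h>
      #include <stdlib.h>

      struct subinfo {                    /* stand-in for subprocess_info */
              char path[64];
              int retval;
              pthread_mutex_t lock;
              pthread_cond_t done;
              int completed;
      };

      static void *helper_thread(void *arg)
      {
              struct subinfo *info = arg;

              /* ... all work that needs `info` happens here ... */
              info->retval = 0;

              /* Last use of `info`: signal, then never touch it again. */
              pthread_mutex_lock(&info->lock);
              info->completed = 1;
              pthread_cond_signal(&info->done);
              pthread_mutex_unlock(&info->lock);
              return NULL;
      }

      int main(void)
      {
              struct subinfo *info = calloc(1, sizeof(*info));
              pthread_t tid;

              if (!info)
                      return 1;
              pthread_mutex_init(&info->lock, NULL);
              pthread_cond_init(&info->done, NULL);
              snprintf(info->path, sizeof(info->path), "/sbin/modprobe");

              pthread_create(&tid, NULL, helper_thread, info);

              /* The caller may free `info` only after the wait returns. */
              pthread_mutex_lock(&info->lock);
              while (!info->completed)
                      pthread_cond_wait(&info->done, &info->lock);
              pthread_mutex_unlock(&info->lock);

              pthread_join(tid, NULL);
              free(info);
              return 0;
      }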
  2. 14 Sep, 2009 1 commit
    • Wu Fengguang authored · 7f61d18b
      > @@ -547,20 +541,20 @@ static ssize_t write_kmem(struct file *
      >  		if (!kbuf)
      >  			return wrote ? wrote : -ENOMEM;
      >  		while (count > 0) {
      > -			int len = size_inside_page(p, count);
      > +			unsigned long sz = size_inside_page(p, count);
      >
      > -			written = copy_from_user(kbuf, buf, len);
      > -			if (written) {
      > +			sz = copy_from_user(kbuf, buf, sz);
      
      Sorry, it introduced a bug: "sz" will be zero in the normal case
      (copy_from_user() returns the number of bytes it failed to copy, so
      0 on success),
      
      > +			if (sz) {
      >  				if (wrote + virtr)
      >  					break;
      >  				free_page((unsigned long)kbuf);
      >  				return -EFAULT;
      >  			}
      > -			len = vwrite(kbuf, (char *)p, len);
      > +			sz = vwrite(kbuf, (char *)p, sz);
      
      and that zero gets passed to vwrite() here.
      
      This patch fixes it; the new variable "n" will also be used by
      another bug-fixing patch that follows this one.
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
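      A stand-alone sketch of the shape of the fix (user-space stand-ins
      with invented names such as fake_copy_from_user(); not the actual
      drivers/char/mem.c code): the chunk size stays in "sz" while the
      "bytes not copied" result goes into a separate variable "n", so a
      successful copy (n == 0) no longer clobbers the length handed to the
      writer.

      #include <stdio.h>
      #include <string.h>

      /* Stand-in for copy_from_user(): returns the number of bytes it
       * could NOT copy, i.e. 0 on complete success. */
      static unsigned long fake_copy_from_user(void *dst, const void *src,
                                               unsigned long sz)
      {
              memcpy(dst, src, sz);
              return 0;
      }

      /* Stand-in for vwrite(): returns how many bytes it wrote. */
      static long fake_vwrite(const char *buf, unsigned long sz)
      {
              printf("writing %lu bytes: %.*s\n", sz, (int)sz, buf);
              return (long)sz;
      }

      int main(void)
      {
              const char *user_buf = "hello, kmem";
              unsigned long count = strlen(user_buf);
              char kbuf[8];
              long wrote = 0;

              while (count > 0) {
                      unsigned long sz = count < sizeof(kbuf) ?
                                         count : sizeof(kbuf);
                      unsigned long n;

                      /* The buggy version did sz = copy_from_user(...),
                       * which leaves sz == 0 on success.  Keep the chunk
                       * size and the error count in separate variables. */
                      n = fake_copy_from_user(kbuf, user_buf + wrote, sz);
                      if (n)
                              return 1;       /* -EFAULT in the kernel */

                      /* sz still holds the chunk size here. */
                      wrote += fake_vwrite(kbuf, sz);
                      count -= sz;
              }
              printf("total: %ld bytes\n", wrote);
              return 0;
      }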
  3. 11 Sep, 2009 1 commit
    • Lee Schermerhorn authored · 87e1a47d
      We noticed very erratic behavior [throughput] with the AIM7 shared
      workload running on recent distro [SLES11] and mainline kernels on an
      8-socket, 32-core, 256GB x86_64 platform.  On the SLES11 kernel
      [2.6.27.19+] with Barcelona processors, as we increased the load [10s of
      thousands of tasks], the throughput would vary between two "plateaus"--one
      at ~65K jobs per minute and one at ~130K jpm.  The simple patch below
      causes the results to smooth out at the ~130k plateau.
      
      But wait, there's more:
      
      We do not see this behavior on smaller platforms--e.g., 4 socket/8 core. 
      This could be the result of the larger number of cpus on the larger
      platform--a scalability issue--or it could be the result of the larger
      number of interconnect "hops" between some nodes in this platform and how
      the tasks for a given load end up distributed over the nodes' cpus and
      memories--a stochastic NUMA effect.
      
      The variability in the results is less pronounced [on the same
      platform] with Shanghai processors and with mainline kernels.  With
      2.6.31-rc6 on
      Shanghai processors and 288 file systems on 288 fibre attached storage
      volumes, the curves [jpm vs load] are both quite flat with the patched
      kernel consistently producing ~3.9% better throughput [~80K jpm vs ~77K
      jpm] than the unpatched kernel.
      
      Profiling indicated that the "slow" runs were incurring high[er]
      contention on an anon_vma lock in vma_adjust(), apparently called from the
      sbrk() system call.
      
      The patch:
      
      A comment in mm/mmap.c:vma_adjust() suggests that we don't really need the
      anon_vma lock when we're only adjusting the end of a vma, as is the case
      for brk().  The comment questions whether it's worthwhile to optimize
      for this case.  Apparently, on the newer, larger x86_64 platforms with
      interesting NUMA topologies, it is worthwhile--especially considering
      that the patch [if correct!] is quite simple.
      
      We can detect this condition--no overlap with next vma--by noting a NULL
      "importer".  The anon_vma pointer will also be NULL in this case, so
      simply avoid loading vma->anon_vma to avoid the lock.  However, we
      apparently DO need to take the anon_vma lock when we're inserting a vma
      ['insert' non-NULL] even when we have no overlap [NULL "importer"], so we
      need to check for 'insert', as well.
      
      I have tested with and without the 'file || ' test in the patch.  It
      does not seem to matter for either stability or performance.  I left
      the check/filter in, so we only optimize away the anon_vma lock
      acquisition when adjusting the end of a non-importing, non-inserting,
      anon vma.
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Eric Whitney <eric.whitney@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
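      A toy user-space illustration of the locking condition the message
      describes (simplified, with invented types such as struct vma; not
      the real mm/mmap.c code, and the exact condition in the merged patch
      may differ): the anon_vma lock is taken only when something is being
      inserted, an "importer" is involved, or vm_start changes; a pure
      brk-style extension of vm_end skips it.

      #include <pthread.h>
      #include <stdbool.h>
      #include <stdio.h>

      /* Toy stand-ins; the real vm_area_struct / anon_vma are far richer. */
      struct anon_vma { pthread_mutex_t lock; };

      struct vma {
              unsigned long vm_start, vm_end;
              struct anon_vma *anon_vma;
      };

      /*
       * Condition sketched from the description above: skip the anon_vma
       * lock when only vm_end moves (brk-style growth), nothing is being
       * inserted, and no importer is taking pages from a neighbouring vma.
       */
      static bool need_anon_vma_lock(const struct vma *vma,
                                     unsigned long start,
                                     const struct vma *insert,
                                     const struct vma *importer)
      {
              if (!vma->anon_vma)
                      return false;
              return insert || importer || start != vma->vm_start;
      }

      static void toy_vma_adjust(struct vma *vma, unsigned long start,
                                 unsigned long end, struct vma *insert,
                                 struct vma *importer)
      {
              bool locked = need_anon_vma_lock(vma, start, insert, importer);

              if (locked)
                      pthread_mutex_lock(&vma->anon_vma->lock);

              vma->vm_start = start;
              vma->vm_end = end;      /* the brk() case changes only this */

              if (locked)
                      pthread_mutex_unlock(&vma->anon_vma->lock);

              printf("adjusted to [0x%lx, 0x%lx), lock %s\n",
                     vma->vm_start, vma->vm_end,
                     locked ? "taken" : "skipped");
      }

      int main(void)
      {
              struct anon_vma av = { PTHREAD_MUTEX_INITIALIZER };
              struct vma v = { 0x1000, 0x2000, &av };

              /* brk-style end adjustment: no insert, no importer. */
              toy_vma_adjust(&v, 0x1000, 0x3000, NULL, NULL);

              /* Moving the start as well: the lock is taken. */
              toy_vma_adjust(&v, 0x0800, 0x3000, NULL, NULL);
              return 0;
      }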
  4. 10 Sep, 2009 1 commit
    • David Miller authored · 9ba02e11
      This is necessary to make the mmap ring buffer work properly on platforms
      where D-cache aliasing is an issue.
      
      vmalloc_user() ensures that the kernel side mapping is SHMLBA aligned,
      and on platforms where D-cache aliasing matters, the presence of
      VM_SHARED will similarly SHMLBA-align the user side mapping.
      
      Thus the kernel and the user will be writing to the same D-cache aliases
      and we'll avoid inconsistencies and corruption.
      
      The only trick with this change is that vfree() cannot be invoked from
      interrupt context, and thus it's not allowed from RCU callbacks.
      
      We deal with this by using schedule_work().
      
      Since the ring buffer is now completely linear even on the kernel side,
      several simplifications are probably now possible in the code where we add
      entries to the ring.
      
      With help from Peter Zijlstra.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
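      A minimal kernel-style sketch of the deferred-free trick mentioned
      above (hypothetical ring_buf type and helper names; not the perf code
      itself): since vfree() must not run in interrupt context, a context
      such as an RCU callback hands the actual free off to a workqueue via
      schedule_work().

      #include <linux/kernel.h>
      #include <linux/slab.h>
      #include <linux/vmalloc.h>
      #include <linux/workqueue.h>

      /* Hypothetical buffer wrapper, for illustration only. */
      struct ring_buf {
              void *data;                 /* allocated with vmalloc_user() */
              struct work_struct free_work;
      };

      static void ring_buf_free_work(struct work_struct *work)
      {
              struct ring_buf *rb =
                      container_of(work, struct ring_buf, free_work);

              vfree(rb->data);            /* safe: runs in process context */
              kfree(rb);
      }

      /*
       * May be called from contexts where vfree() is forbidden (e.g. an
       * RCU callback): defer the real work to the shared workqueue.
       */
      static void ring_buf_free(struct ring_buf *rb)
      {
              INIT_WORK(&rb->free_work, ring_buf_free_work);
              schedule_work(&rb->free_work);
      }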