1. 24 Aug, 2009 1 commit
      The user mode helper code has a race in it. call_usermodehelper_exec() · 9512a48b
      Neil Horman authored
      takes an allocated subprocess_info structure, which it passes to a
      workqueue, and then passes it to a kernel thread which it creates, after
      which it calls complete() to signal to the caller of
      call_usermodehelper_exec() that it can free the subprocess_info struct.
      
      But since we use that structure in the created thread, we can't call
      complete() from __call_usermodehelper(), which is where we create the kernel
      thread.  We need to call complete() from within the kernel thread and then
      not use subprocess_info afterward in the case of UMH_WAIT_EXEC.  Tested
      successfully by me.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
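
      Below is a minimal user-space sketch of the race pattern described above,
      using POSIX threads rather than the kernel's workqueue/kthread machinery;
      the names (worker_info, worker, ...) are illustrative, not the kernel's.
      It shows the shape of the fix: completion is signalled from inside the
      thread that last uses the shared structure, so the caller cannot free it
      too early.

      /* build with: gcc -pthread race_sketch.c */
      #include <pthread.h>
      #include <stdio.h>
      #include <stdlib.h>

      struct worker_info {              /* stand-in for struct subprocess_info */
              const char *path;
              pthread_mutex_t lock;
              pthread_cond_t done;
              int completed;
      };

      static void *worker(void *arg)
      {
              struct worker_info *info = arg;

              printf("worker using %s\n", info->path);  /* last use of info */

              /* Signal completion only after the final use of 'info'. */
              pthread_mutex_lock(&info->lock);
              info->completed = 1;
              pthread_cond_signal(&info->done);
              pthread_mutex_unlock(&info->lock);
              return NULL;
      }

      int main(void)
      {
              struct worker_info *info = malloc(sizeof(*info));
              pthread_t tid;

              if (!info)
                      return 1;
              info->path = "/sbin/hotplug";
              pthread_mutex_init(&info->lock, NULL);
              pthread_cond_init(&info->done, NULL);
              info->completed = 0;

              pthread_create(&tid, NULL, worker, info);

              /*
               * The buggy pattern would signal 'done' here, right after creating
               * the thread; the caller could then free 'info' while the worker is
               * still dereferencing it.  Waiting for the worker's own signal is
               * what keeps the free below safe.
               */
              pthread_mutex_lock(&info->lock);
              while (!info->completed)
                      pthread_cond_wait(&info->done, &info->lock);
              pthread_mutex_unlock(&info->lock);

              pthread_join(tid, NULL);
              free(info);               /* safe: the worker has finished with it */
              return 0;
      }
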
  2. 14 Sep, 2009 1 commit
      > @@ -547,20 +541,20 @@ static ssize_t write_kmem(struct file * · 7f61d18b
      Wu Fengguang authored
      >  		if (!kbuf)
      >  			return wrote ? wrote : -ENOMEM;
      >  		while (count > 0) {
      > -			int len = size_inside_page(p, count);
      > +			unsigned long sz = size_inside_page(p, count);
      >
      > -			written = copy_from_user(kbuf, buf, len);
      > -			if (written) {
      > +			sz = copy_from_user(kbuf, buf, sz);
      
      Sorry, it introduced a bug: "sz" will be zero in the normal case,
      
      > +			if (sz) {
      >  				if (wrote + virtr)
      >  					break;
      >  				free_page((unsigned long)kbuf);
      >  				return -EFAULT;
      >  			}
      > -			len = vwrite(kbuf, (char *)p, len);
      > +			sz = vwrite(kbuf, (char *)p, sz);
      
      and gets passed to vwrite() here.
      
      This patch fixes it; the new variable "n" will also be used by another
      bug-fix patch that follows this one.
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
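
      Below is a self-contained sketch of the bug pattern being described, not
      the kernel's drivers/char/mem.c; the helper fake_copy_from_user() is made
      up and only mimics the kernel convention of returning the number of bytes
      NOT copied (0 on success).  Reusing the length variable "sz" for that
      return value clobbers the length, so the follow-on write stage would be
      asked to write 0 bytes; keeping the length in "sz" and the failure count
      in a separate "n" is the shape of the fix.

      #include <stdio.h>
      #include <string.h>

      /* Mimics copy_from_user(): copies len bytes, returns bytes NOT copied. */
      static unsigned long fake_copy_from_user(void *dst, const void *src,
                                               unsigned long len)
      {
              memcpy(dst, src, len);
              return 0;               /* success: nothing left uncopied */
      }

      int main(void)
      {
              char kbuf[64];
              const char *ubuf = "hello, kmem";
              unsigned long sz = strlen(ubuf) + 1;
              unsigned long n;

              /* Buggy pattern: sz = fake_copy_from_user(...) would leave sz == 0. */
              n = fake_copy_from_user(kbuf, ubuf, sz);
              if (n) {
                      fprintf(stderr, "copy failed, %lu bytes uncopied\n", n);
                      return 1;
              }

              /* "sz" still holds the real length for the next stage. */
              printf("would now vwrite %lu bytes: %s\n", sz, kbuf);
              return 0;
      }
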
  3. 11 Sep, 2009 1 commit
      We noticed very erratic behavior [throughput] with the AIM7 shared · 87e1a47d
      Lee Schermerhorn authored
      workload running on recent distro [SLES11] and mainline kernels on an
      8-socket, 32-core, 256GB x86_64 platform.  On the SLES11 kernel
      [2.6.27.19+] with Barcelona processors, as we increased the load [10s of
      thousands of tasks], the throughput would vary between two "plateaus"--one
      at ~65K jobs per minute and one at ~130K jpm.  The simple patch below
      causes the results to smooth out at the ~130K plateau.
      
      But wait, there's more:
      
      We do not see this behavior on smaller platforms--e.g., 4 socket/8 core. 
      This could be the result of the larger number of cpus on the larger
      platform--a scalability issue--or it could be the result of the larger
      number of interconnect "hops" between some nodes in this platform and how
      the tasks for a given load end up distributed over the nodes' cpus and
      memories--a stochastic NUMA effect.
      
      The variability in the results is less pronounced [on the same platform]
      with Shanghai processors and with mainline kernels.  With 2.6.31-rc6 on
      Shanghai processors and 288 file systems on 288 fibre attached storage
      volumes, the curves [jpm vs load] are both quite flat with the patched
      kernel consistently producing ~3.9% better throughput [~80K jpm vs ~77K
      jpm] than the unpatched kernel.
      
      Profiling indicated that the "slow" runs were incurring high[er]
      contention on an anon_vma lock in vma_adjust(), apparently called from the
      sbrk() system call.
      
      The patch:
      
      A comment in mm/mmap.c:vma_adjust() suggests that we don't really need the
      anon_vma lock when we're only adjusting the end of a vma, as is the case
      for brk().  The comment questions whether it's worthwhile to optimize for
      this case.  Apparently, on the newer, larger x86_64 platforms, with
      interesting NUMA topologies, it is worthwhile--especially considering
      that the patch [if correct!] is quite simple.
      
      We can detect this condition--no overlap with next vma--by noting a NULL
      "importer".  The anon_vma pointer will also be NULL in this case, so
      simply avoid loading vma->anon_vma to avoid the lock.  However, we
      apparently DO need to take the anon_vma lock when we're inserting a vma
      ['insert' non-NULL] even when we have no overlap [NULL "importer"], so we
      need to check for 'insert', as well.
      
      I have tested with and without the 'file || ' test in the patch.  This
      does not seem to matter for stability or performance.  I left this
      check/filter in, so we only optimize away the anon_vma lock acquisition
      when adjusting the end of a non-importing, non-inserting, anon vma.
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Eric Whitney <eric.whitney@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
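
      Below is a small, self-contained sketch of the locking rule as described
      above, not the actual mm/mmap.c change; the struct and helper names
      (struct vma, need_anon_vma_lock) are simplified stand-ins for the kernel's
      types.  It encodes only the condition from the message: skip the anon_vma
      lock when merely adjusting the end of an anon vma with no "importer" and
      no "insert", and take it otherwise.

      #include <stdbool.h>
      #include <stdio.h>

      struct anon_vma { int lock; };          /* stand-in */
      struct vma {
              struct anon_vma *anon_vma;
              void *file;                     /* NULL for anonymous mappings */
      };

      /* Does this vma_adjust()-style call need the anon_vma lock? */
      static bool need_anon_vma_lock(const struct vma *vma,
                                     const struct vma *importer,
                                     const struct vma *insert)
      {
              if (!vma->anon_vma)
                      return false;           /* nothing to lock */
              /*
               * Pure end-adjustment of an anon vma (the brk() case): no file,
               * no importer, no insert -> the lock can be skipped.
               */
              if (!vma->file && !importer && !insert)
                      return false;
              return true;
      }

      int main(void)
      {
              struct anon_vma av = { 0 };
              struct vma brk_vma = { &av, NULL };
              struct vma other = { &av, NULL };

              printf("brk()-style end adjustment: lock? %s\n",
                     need_anon_vma_lock(&brk_vma, NULL, NULL) ? "yes" : "no");
              printf("same vma with an insert:    lock? %s\n",
                     need_anon_vma_lock(&brk_vma, NULL, &other) ? "yes" : "no");
              return 0;
      }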