  1. 23 Oct, 2009 1 commit
      sched: Strengthen buddies and mitigate buddy induced latencies · f685ceac
      Mike Galbraith authored
      This patch restores the effectiveness of LAST_BUDDY in preventing
      pgsql+oltp from collapsing due to wakeup preemption. It also
      switches LAST_BUDDY to exclusively do what it does best, namely
      mitigate the effects of aggressive wakeup preemption, which
      improves vmark throughput markedly, and restores mysql+oltp
      scalability.
      
      Since buddies are about scalability, enable them beginning at the
      point where we begin expanding sched_latency, namely
      sched_nr_latency. Previously, buddies were cleared aggressively,
      which seriously reduced their effectiveness. Not clearing
      aggressively however, produces a small drop in mysql+oltp
      throughput immediately after peak, indicating that LAST_BUDDY is
      actually doing some harm. This is right at the point where X on the
      desktop in competition with another load wants low latency service.
      Ergo, do not enable until we need to scale.
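The gating rule in the paragraph above can be sketched as a small stand-alone C model. This is an illustration, not the patch itself: `buddies_enabled()` and its parameter names are hypothetical, and the threshold of 5 used in the note below is only an assumed example value.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not a kernel function.
 * Buddies are only switched on once the runqueue is long enough that
 * the scheduler has started stretching its latency target, i.e. once
 * nr_running has reached the sched_nr_latency threshold.
 */
static bool buddies_enabled(unsigned int nr_running, unsigned int nr_latency)
{
    return nr_running >= nr_latency;
}
```

With an assumed threshold of 5, a desktop-like runqueue of two or three tasks leaves buddies off, while a heavily loaded database runqueue turns them on.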
      
      To mitigate latency induced by buddies, or by a task just missing
      wakeup preemption, check latency at tick time.
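A minimal user-space sketch of what a tick-time latency check could look like, assuming the check compares how far the running task's vruntime has moved ahead of the leftmost waiter. The struct, the function `tick_should_preempt()`, and the parameters `ideal_ns` and `min_gran_ns` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a scheduling entity. */
struct entity {
    int64_t  vruntime; /* virtual runtime, in ns */
    uint64_t ran_ns;   /* CPU time used since the task was last picked */
};

/*
 * Decide at tick time whether the running task should be preempted.
 * leftmost is the waiter with the smallest vruntime, or NULL if none.
 */
static bool tick_should_preempt(const struct entity *curr,
                                const struct entity *leftmost,
                                uint64_t ideal_ns, uint64_t min_gran_ns)
{
    /* The task has used up its whole slice. */
    if (curr->ran_ns > ideal_ns)
        return true;

    /* Do not preempt a task that has barely started running. */
    if (curr->ran_ns < min_gran_ns)
        return false;

    if (leftmost == NULL)
        return false;

    /* A waiter has fallen more than one slice behind: let it run. */
    return curr->vruntime - leftmost->vruntime > (int64_t)ideal_ns;
}
```

The last test is what would bound the delay seen by a task that narrowly missed wakeup preemption or was held back by a buddy: it no longer has to wait for the running task's full slice to expire.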
      
      Last hunk prevents buddies from stymieing BALANCE_NEWIDLE via
      CACHE_HOT_BUDDY.
      
      Supporting performance tests:
      
       tip   = v2.6.32-rc5-1497-ga525b32
       tipx  = NO_GENTLE_FAIR_SLEEPERS NEXT_BUDDY granularity knobs = 31 knobs + 31 buddies
       tip+x = NO_GENTLE_FAIR_SLEEPERS granularity knobs = 31 knobs
      
      (Three run averages except where noted.)
      
       vmark:
       ------
       tip           108466 messages per second
       tip+          125307 messages per second
       tip+x         125335 messages per second
       tipx          117781 messages per second
       2.6.31.3      122729 messages per second
      
       mysql+oltp:
       -----------
       clients          1        2        4        8       16       32       64      128      256
       ..........................................................................................
       tip        9949.89 18690.20 34801.24 34460.04 32682.88 30765.97 28305.27 25059.64 19548.08
       tip+      10013.90 18526.84 34900.38 34420.14 33069.83 32083.40 30578.30 28010.71 25605.47
       tipx       9698.71 18002.70 34477.56 33420.01 32634.30 31657.27 29932.67 26827.52 21487.18
       2.6.31.3   8243.11 18784.20 34404.83 33148.38 31900.32 31161.90 29663.81 25995.94 18058.86
      
       pgsql+oltp:
       -----------
       clients          1        2        4        8       16       32       64      128      256
       ..........................................................................................
       tip       13686.37 26609.25 51934.28 51347.81 49479.51 45312.65 36691.91 26851.57 24145.35
       tip+ (1x) 13907.85 27135.87 52951.98 52514.04 51742.52 50705.43 49947.97 48374.19 46227.94
       tip+x     13906.78 27065.81 52951.19 52542.59 52176.11 51815.94 50838.90 49439.46 46891.00
       tipx      13742.46 26769.81 52351.99 51891.73 51320.79 50938.98 50248.65 48908.70 46553.84
       2.6.31.3  13815.35 26906.46 52683.34 52061.31 51937.10 51376.80 50474.28 49394.47 47003.25
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 14 Oct, 2009 1 commit
      sched: Do less agressive buddy clearing · 92f6a5e3
      Peter Zijlstra authored
      Yanmin reported a hackbench regression due to:
      
       > commit de69a80b
       > Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
       > Date:   Thu Sep 17 09:01:20 2009 +0200
       >
       >     sched: Stop buddies from hogging the system
      
      I really liked de69a80b, and it affecting hackbench shows I wasn't
      crazy ;-)
      
      So hackbench is a multi-cast, with one sender spraying multiple
      receivers, who in their turn don't spray back.
      
      This would be exactly the scenario that patch 'cures'. Previously
      we would not clear the last buddy after running the next task,
      allowing the sender to get back to work sooner than it otherwise
      ought to have been, increasing latencies for other tasks.
      
      Now, since those receivers don't poke back, they don't enforce the
      buddy relation, which means there's nothing to re-elect the sender.
      
      Cure this by less aggressively clearing the buddy stats. Only clear
      buddies when they were not chosen. It should still avoid a buddy
      sticking around long after it has served its time.
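One reading of the rule "only clear buddies when they were not chosen" can be modelled in stand-alone C as follows. This is a sketch of the behaviour the message describes, not the actual diff: the `runqueue` layout, `close_enough()`, `pick_next()`, and the `gran` threshold are all hypothetical, and the runqueue is assumed to be non-empty.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct entity {
    int64_t vruntime;
};

/* Hypothetical, simplified runqueue: leftmost must not be NULL. */
struct runqueue {
    struct entity *leftmost; /* waiter with the smallest vruntime */
    struct entity *next;     /* next buddy, or NULL */
    struct entity *last;     /* last buddy, or NULL */
    int64_t gran;            /* how far ahead a buddy may be and still run */
};

static bool close_enough(const struct entity *buddy, const struct runqueue *rq)
{
    return buddy->vruntime - rq->leftmost->vruntime <= rq->gran;
}

static struct entity *pick_next(struct runqueue *rq)
{
    if (rq->next) {
        if (close_enough(rq->next, rq))
            return rq->next;  /* chosen: the buddy relation survives */
        rq->next = NULL;      /* rejected: do not retry it next time */
    }
    if (rq->last) {
        if (close_enough(rq->last, rq))
            return rq->last;  /* chosen: the buddy relation survives */
        rq->last = NULL;      /* rejected: do not retry it next time */
    }
    return rq->leftmost;
}
```

In the hackbench case, the sender would stay recorded as the last buddy while it keeps being picked, so it can be re-elected even though the receivers never wake it. A buddy that has run too far ahead is dropped the first time it is passed over.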
      Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      CC: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1255084986.8802.46.camel@laptop>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 24 Sep, 2009 1 commit
  4. 21 Sep, 2009 1 commit
  5. 19 Sep, 2009 1 commit
  6. 18 Sep, 2009 1 commit
  7. 17 Sep, 2009 3 commits
      sched: Fix SD_POWERSAVING_BALANCE|SD_PREFER_LOCAL vs SD_WAKE_AFFINE · 29cd8bae
      Peter Zijlstra authored
      The SD_POWERSAVING_BALANCE|SD_PREFER_LOCAL code can break out of
      the domain iteration early, making us miss the SD_WAKE_AFFINE bits.
      
      Fix this by continuing iteration until there is no need for a
      larger domain.
      
      This also cleans up the cgroup stuff a bit, by not having two
      update_shares() invocations.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      sched: Stop buddies from hogging the system · de69a80b
      Peter Zijlstra authored
      Clear buddies more aggressively.
      
      The (theoretical, haven't actually observed any of this) problem is
      that when we do not select either buddy in pick_next_entity()
      because they are too far ahead of the left-most task, we do not
      clear the buddies.
      
      This means that as soon as we service the left-most task, these
      same buddies will be tried again on the next schedule. Now if the
      left-most task was a pure hog, it wouldn't have done any wakeups
      and it wouldn't have set buddies of its own. That leads to the old
      buddies dominating, which would lead to bad latencies.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      sched: Add new wakeup preemption mode: WAKEUP_RUNNING · ad4b78bb
      Peter Zijlstra authored
      Create a new wakeup preemption mode, preempt towards tasks that run
      shorter on avg. It sets next buddy to be sure we actually run the task
      we preempted for.
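The decision described above can be sketched in stand-alone C, assuming each task carries a running average of how long it typically runs. The `task` and `runqueue` structs, the `avg_running` field as modelled here, and `wakeup_running_preempt()` are simplified, hypothetical stand-ins for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified task: only the average run length matters here. */
struct task {
    uint64_t avg_running; /* average time spent on the CPU per run, in ns */
};

struct runqueue {
    struct task *curr;       /* task currently on the CPU */
    struct task *next_buddy; /* task to favour at the next pick, or NULL */
};

/*
 * Return true if the freshly woken task should preempt the current one.
 * When it does, mark it as next buddy so that the task we preempted for
 * is the one that actually gets to run.
 */
static bool wakeup_running_preempt(struct runqueue *rq, struct task *woken)
{
    if (woken->avg_running < rq->curr->avg_running) {
        rq->next_buddy = woken;
        return true;
    }
    return false;
}
```

A short-running task waking up against a CPU hog preempts it under this rule, while a hog waking up against a short-running task does not.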
      
      Test results:
      
       root@twins:~# while :; do :; done &
       [1] 6537
       root@twins:~# while :; do :; done &
       [2] 6538
       root@twins:~# while :; do :; done &
       [3] 6539
       root@twins:~# while :; do :; done &
       [4] 6540
      
       root@twins:/home/peter# ./latt -c4 sleep 4
       Entries: 48 (clients=4)
      
       Averages:
       ------------------------------
              Max          4750 usec
              Avg           497 usec
              Stdev         737 usec
      
       root@twins:/home/peter# echo WAKEUP_RUNNING > /debug/sched_features
      
       root@twins:/home/peter# ./latt -c4 sleep 4
       Entries: 48 (clients=4)
      
       Averages:
       ------------------------------
              Max            14 usec
              Avg             5 usec
              Stdev           3 usec
      
      Disabled by default - needs more testing.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <new-submission>
  8. 16 Sep, 2009 5 commits
  9. 15 Sep, 2009 13 commits
  10. 13 Sep, 2009 1 commit
      perf_counter, sched: Add sched_stat_runtime tracepoint · f977bb49
      Ingo Molnar authored
      This allows more precise tracking of how the scheduler accounts
      (and acts upon) a task having spent N nanoseconds of CPU time.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  11. 10 Sep, 2009 1 commit
      sched: Fix sched::sched_stat_wait tracepoint field · e1f84508
      Ingo Molnar authored
      This weird perf trace output:
      
        cc1-9943  [001]  2802.059479616: sched_stat_wait: task: as:9944 wait: 2801938766276 [ns]
      
      Is caused by setting one component field of the delta to zero
      a bit too early. Move it to later.
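The ordering problem can be reproduced in a few lines of stand-alone C. The struct and function names below are hypothetical; the sketch only assumes that the reported wait time is computed as the current clock minus a recorded start timestamp.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-entity wait bookkeeping. */
struct wait_stats {
    uint64_t wait_start; /* clock value when the task started waiting */
};

/* Buggy order: the start timestamp is zeroed before the delta is taken. */
static uint64_t wait_end_buggy(struct wait_stats *ws, uint64_t now)
{
    ws->wait_start = 0;
    return now - ws->wait_start; /* reports the raw clock, not the wait */
}

/* Fixed order: compute the delta first, then reset the timestamp. */
static uint64_t wait_end_fixed(struct wait_stats *ws, uint64_t now)
{
    uint64_t delta = now - ws->wait_start;

    ws->wait_start = 0;
    return delta;
}
```

In the buggy order the reported value is simply the clock itself, which would explain a "wait" of roughly 2802 seconds at a timestamp of about 2802 seconds in the trace above.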
      
      ( Note, this does not affect the NEW_FAIR_SLEEPERS interactivity bug,
        it's just a reporting bug in essence. )
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Nikos Chantziaras <realnc@arcor.de>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <4AA93D34.8040500@arcor.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  12. 09 Sep, 2009 2 commits
  13. 08 Sep, 2009 1 commit
      sched: Ensure that a child can't gain time over it's parent after fork() · b5d9d734
      Mike Galbraith authored
      A fork/exec load is usually "pass the baton", so the child
      should never be placed behind the parent.  With START_DEBIT we
      make room for the new task, but with child_runs_first, that
      room comes out of the _parent's_ hide. There's nothing to say
      that the parent wasn't ahead of min_vruntime at fork() time,
      which means that the "baton carrier", who is essentially the
      parent in drag, can gain time and increase scheduling latencies
      for waiters.
      
      With NEW_FAIR_SLEEPERS + START_DEBIT + child_runs_first
      enabled, we essentially pass the sleeper fairness off to the
      child, which is fine, but if we don't base placement on the
      parent's updated vruntime, we can end up compounding latency
      woes if the child itself then does fork/exec.  The debit
      incurred at fork doesn't hurt the parent who is then going to
      sleep and maybe exit, but the child who acquires the error
      harms all comers.
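A simplified model of the placement rule argued for above, in stand-alone C. It assumes the child starts from the parent's up-to-date vruntime, is never placed behind that point, and swaps positions with the parent when child_runs_first is set. `place_child()`, the `debit` parameter, and the struct are hypothetical, and vruntime wrap-around is ignored for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

struct entity {
    uint64_t vruntime;
};

static uint64_t max_u64(uint64_t a, uint64_t b)
{
    return a > b ? a : b;
}

/*
 * Place a newly forked child. Returns true if the parent should be
 * rescheduled so that the child runs first.
 */
static bool place_child(struct entity *parent, struct entity *child,
                        uint64_t min_vruntime, uint64_t debit,
                        bool child_runs_first)
{
    /* Start from the parent's updated vruntime ... */
    child->vruntime = parent->vruntime;

    /* ... and never gain time by being placed backwards from there. */
    child->vruntime = max_u64(child->vruntime, min_vruntime + debit);

    /* Pass the baton: the child takes the parent's earlier position. */
    if (child_runs_first && parent->vruntime < child->vruntime) {
        uint64_t tmp = parent->vruntime;

        parent->vruntime = child->vruntime;
        child->vruntime = tmp;
        return true;
    }
    return false;
}
```

When the parent is already ahead of `min_vruntime + debit`, the child inherits the parent's position instead of a smaller value, so the "baton carrier" cannot gain time at fork.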
      
      This improves latencies of make -j<n> kernel build workloads.
      Reported-by: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  14. 07 Sep, 2009 2 commits
  15. 02 Sep, 2009 2 commits
  16. 02 Aug, 2009 3 commits
  17. 18 Jul, 2009 1 commit