Commit ac08c264, authored by Thomas Gleixner, committed by Linus Torvalds

[PATCH] posix-cpu-timers: prevent signal delivery starvation

The integer divisions in the timer accounting code can round the result
down to 0.  Adding 0 leaves the expiry time unchanged, so signal delivery
stops.

Clamp the division result to a minimum of 1 to avoid this.
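
As a rough illustration of the rounding issue (a user-space sketch, not the
kernel code): ticks_t, share_plain() and share_non_zero() are hypothetical
names, with ticks_t standing in for a tick-granularity cputime_t.

```c
/* Hypothetical stand-in for a tick-granularity cputime_t. */
typedef unsigned long ticks_t;

/* Plain integer division truncates, so fewer ticks than threads gives 0. */
static inline ticks_t share_plain(ticks_t remaining, unsigned long nthreads)
{
	return remaining / nthreads;
}

/* Same division, clamped to at least 1 tick so the share is never 0. */
static inline ticks_t share_non_zero(ticks_t remaining, unsigned long nthreads)
{
	ticks_t res = remaining / nthreads;

	return res < 1 ? 1 : res;
}
```

With 3 ticks left and 8 threads, share_plain() yields 0 while
share_non_zero() yields 1; larger quotients are unaffected by the clamp.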

The problem was reported by Seongbae Park <spark@google.com>, who also
provided an initial patch.

Roland sayeth:

  I have had some more time to think about the problem, and to reproduce it
  using Toyo's test case.  For the record, if my understanding of the problem
  is correct, this happens only in one very particular case.  First, the
  expiry time has to be so soon that in cputime_t units (usually 1s/HZ ticks)
  it's < nthreads so the division yields zero.  Second, it only affects each
  thread that is so new that its CPU time accumulation is zero so now+0 is
  still zero and ->it_*_expires winds up staying zero.  For the VIRT and PROF
  clocks when cputime_t is tick granularity (or the SCHED clock on
  configurations where sched_clock's value only advances on clock ticks), this
  is not hard to arrange with new threads starting up and blocking before they
  accumulate a whole tick of CPU time.  That's what happens in Toyo's test
  case.

  Note that in general it is fine for that division to round down to zero,
  and set each thread's expiry time to its "now" time.  The problem only
  arises with threads whose "now" value is still zero, so that now+0 winds up
  0 and is interpreted as "not set" instead of ">= now".  So it would be a
  sufficient and more precise fix to just use max(ticks, 1) inside the loop
  when setting each it_*_expires value.

  But, it does no harm to round the division up to one and always advance
  every thread's expiry time.  If the thread didn't already fire timers for
  the expiry time of "now", there is no expectation that it will do so before
  the next tick anyway.  So I followed Thomas's patch in lifting the max out
  of the loops.

  This patch also covers the reload cases, which are harder to write a test
  for (and I didn't try).  I've tested it with Toyo's case and it fixes that.
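
The case Roland describes can be modelled in user space roughly as follows.
This is only a sketch under stated assumptions: struct thread_sample and
rebalance() are hypothetical names, an expiry of 0 is taken to mean "not set"
as in the description above, and n must be non-zero.

```c
#include <stddef.h>

typedef unsigned long ticks_t;

/* Hypothetical per-thread state: CPU time so far and expiry (0 = not set). */
struct thread_sample {
	ticks_t now;
	ticks_t expires;
};

/*
 * Spread `remaining` ticks over n threads (n != 0).  With clamp == 0 the
 * share may round down to 0; with clamp != 0 it is raised to at least 1.
 * Returns how many threads ended up with expires == 0, i.e. "not set".
 */
static inline size_t rebalance(struct thread_sample *t, size_t n,
			       ticks_t remaining, int clamp)
{
	ticks_t left = remaining / n;
	size_t unarmed = 0;
	size_t i;

	if (clamp && left < 1)
		left = 1;

	for (i = 0; i < n; i++) {
		t[i].expires = t[i].now + left;
		if (t[i].expires == 0)
			unarmed++;
	}
	return unarmed;
}
```

With four threads, three of which have accumulated no CPU time, and 3 ticks
remaining, the unclamped call leaves those three threads unarmed; the clamped
call arms every thread at now+1.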

[toyoa@mvista.com: fix: min_t -> max_t]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Daniel Walker <dwalker@mvista.com>
Cc: Toyo Abe <toyoa@mvista.com>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Seongbae Park <spark@google.com>
Cc: Peter Mattis <pmattis@google.com>
Cc: Rohit Seth <rohitseth@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
parent e24650c2
@@ -87,6 +87,19 @@ static inline union cpu_time_count cpu_time_sub(const clockid_t which_clock,
 	return a;
 }
 
+/*
+ * Divide and limit the result to res >= 1
+ *
+ * This is necessary to prevent signal delivery starvation, when the result of
+ * the division would be rounded down to 0.
+ */
+static inline cputime_t cputime_div_non_zero(cputime_t time, unsigned long div)
+{
+	cputime_t res = cputime_div(time, div);
+
+	return max_t(cputime_t, res, 1);
+}
+
 /*
  * Update expiry time from increment, and increase overrun count,
  * given the current clock sample.
@@ -483,7 +496,7 @@ static void process_timer_rebalance(struct task_struct *p,
 		BUG();
 		break;
 	case CPUCLOCK_PROF:
-		left = cputime_div(cputime_sub(expires.cpu, val.cpu),
+		left = cputime_div_non_zero(cputime_sub(expires.cpu, val.cpu),
 				   nthreads);
 		do {
 			if (likely(!(t->flags & PF_EXITING))) {
@@ -498,7 +511,7 @@ static void process_timer_rebalance(struct task_struct *p,
 		} while (t != p);
 		break;
 	case CPUCLOCK_VIRT:
-		left = cputime_div(cputime_sub(expires.cpu, val.cpu),
+		left = cputime_div_non_zero(cputime_sub(expires.cpu, val.cpu),
 				   nthreads);
 		do {
 			if (likely(!(t->flags & PF_EXITING))) {
@@ -515,6 +528,7 @@ static void process_timer_rebalance(struct task_struct *p,
 	case CPUCLOCK_SCHED:
 		nsleft = expires.sched - val.sched;
 		do_div(nsleft, nthreads);
+		nsleft = max_t(unsigned long long, nsleft, 1);
 		do {
 			if (likely(!(t->flags & PF_EXITING))) {
 				ns = t->sched_time + nsleft;
@@ -1159,12 +1173,13 @@ static void check_process_timers(struct task_struct *tsk,
 
 		prof_left = cputime_sub(prof_expires, utime);
 		prof_left = cputime_sub(prof_left, stime);
-		prof_left = cputime_div(prof_left, nthreads);
+		prof_left = cputime_div_non_zero(prof_left, nthreads);
 		virt_left = cputime_sub(virt_expires, utime);
-		virt_left = cputime_div(virt_left, nthreads);
+		virt_left = cputime_div_non_zero(virt_left, nthreads);
 		if (sched_expires) {
 			sched_left = sched_expires - sched_time;
 			do_div(sched_left, nthreads);
+			sched_left = max_t(unsigned long long, sched_left, 1);
 		} else {
 			sched_left = 0;
 		}