1. 04 Dec, 2008 1 commit
    • john stultz's avatar
      time: catch xtime_nsec underflows and fix them · 6c9bacb4
      john stultz authored
      Impact: fix time warp bug
      
      Alex Shi, along with Yanmin Zhang have been noticing occasional time
      inconsistencies recently. Through their great diagnosis, they found that
      the xtime_nsec value used in update_wall_time was occasionally going
      negative. After looking through the code for awhile, I realized we have
      the possibility for an underflow when three conditions are met in
      update_wall_time():
      
        1) We have accumulated a second's worth of nanoseconds, so we
           incremented xtime.tv_sec and appropriately decrement xtime_nsec.
           (This doesn't cause xtime_nsec to go negative, but it can cause it
            to be small).
      
        2) The remaining offset value is large, but just slightly less then
           cycle_interval.
      
        3) clocksource_adjust() is speeding up the clock, causing a
           corrective amount (compensating for the increase in the multiplier
           being multiplied against the unaccumulated offset value) to be
           subtracted from xtime_nsec.
      
      This can cause xtime_nsec to underflow.
      
      Unfortunately, since we notify the NTP subsystem via second_overflow()
      whenever we accumulate a full second, and this effects the error
      accumulation that has already occured, we cannot simply revert the
      accumulated second from xtime nor move the second accumulation to after
      the clocksource_adjust call without a change in behavior.
      
      This leaves us with (at least) two options:
      
      1) Simply return from clocksource_adjust() without making a change if we
         notice the adjustment would cause xtime_nsec to go negative.
      
      This would work, but I'm concerned that if a large adjustment was needed
      (due to the error being large), it may be possible to get stuck with an
      ever increasing error that becomes too large to correct (since it may
      always force xtime_nsec negative). This may just be paranoia on my part.
      
      2) Catch xtime_nsec if it is negative, then add back the amount its
         negative to both xtime_nsec and the error.
      
      This second method is consistent with how we've handled earlier rounding
      issues, and also has the benefit that the error being added is always in
      the oposite direction also always equal or smaller then the correction
      being applied. So the risk of a corner case where things get out of
      control is lessened.
      
      This patch fixes bug 11970, as tested by Yanmin Zhang
      http://bugzilla.kernel.org/show_bug.cgi?id=11970
      
      Reported-by: alex.shi@intel.com
      Signed-off-by: default avatarJohn Stultz <johnstul@us.ibm.com>
      Acked-by: default avatar"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Tested-by: default avatar"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      6c9bacb4
  2. 24 Nov, 2008 1 commit
  3. 20 Nov, 2008 38 commits