1. 20 Jul, 2009 1 commit
    • Paul E. McKenney's avatar
      RCU: QRCU with lockless fastpath · 33a93eaa
      Paul E. McKenney authored
      Hello!
      
      This is an updated version of Oleg Nesterov's QRCU that avoids the
      earlier lock acquisition on the synchronize_qrcu() fastpath.  This passes
      rcutorture on x86 and the weakly ordered POWER.  A promela model of the
      code passes as noted before for 2 readers and 3 updaters and for 3 readers
      and 2 updaters.  3 readers and 3 updaters runs every machine that I have
      access to out of memory -- nothing like a little combinatorial explosion!
      However, after some thought, the proof ended up being simple enough:
      
      1.	If synchronize_qrcu() exits too soon, then by definition
      	there has been a reader present during synchronize_srcu()'s
      	full execution.
      
      2.	The counter corresponding to this reader will be at least
      	1 at all times.
      
      3.	The synchronize_qrcu() code forces at least one of the counters
      	to be at least one at all times -- if there is a reader, the
      	sum will be at least two.  (Unfortunately, we cannot fetch
      	the pair of counters atomically.)
      
      4.	Therefore, the only way that synchronize_qrcu()s fastpath can
      	see a sum of 1 is if it races with another synchronize_qrcu() --
      	the first synchronize_qrcu() must read one of the counters before
      	the second synchronize_qrcu() increments it, and must read the
      	other counter after the second synchronize_qrcu() decrements it.
      	There can be at most one reader present through this entire
      	operation -- otherwise, the first synchronize_qrcu() will see
      	a sum of 2 or greater.
      
      5.	But the second synchronize_qrcu() will not release the mutex
      	until after the reader is done.  During this time, the first
      	synchronize_qrcu() will always see a sum of at least 2, and
      	therefore cannot take the remainder of the fastpath until the
      	reader is done.
      
      6.	Because the second synchronize_qrcu() holds the mutex, no other
      	synchronize_qrcu() can manipulate the counters until the reader
      	is done.  A repeat of the race called out in #4 above therefore
      	cannot happen until after the reader is done, in which case it
      	is safe for the first synchronize_qrcu() to proceed.
      
      Therefore, two summations of the counter separated by a memory barrier
      suffices and the implementation shown below also suffices.
      
      (And, yes, the fastpath -could- check for a sum of zero and exit
      immediately, but this would help only in case of a three-way race
      between two synchronize_qrcu()s and a qrcu_read_unlock(), would add
      another compare, so is not worth it.)
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      33a93eaa
  2. 18 Jul, 2009 1 commit
  3. 17 Jul, 2009 17 commits
  4. 16 Jul, 2009 15 commits
  5. 15 Jul, 2009 6 commits