• Lai Jiangshan's avatar
    rcu classic: new algorithm for callbacks-processing(v2) · 5127bed5
    Lai Jiangshan authored
    This is v2, it's a little deference from v1 that I
    had send to lkml.
    use ACCESS_ONCE
    use rcu_batch_after/rcu_batch_before for batch # comparison.
    
    rcutorture test result:
    (hotplugs: do cpu-online/offline once per second)
    
    No CONFIG_NO_HZ:           OK, 12hours
    No CONFIG_NO_HZ, hotplugs: OK, 12hours
    CONFIG_NO_HZ=y:            OK, 24hours
    CONFIG_NO_HZ=y, hotplugs:  Failed.
    (Failed also without my patch applied, exactly the same bug occurred,
    http://lkml.org/lkml/2008/7/3/24)
    
    v1's email thread:
    http://lkml.org/lkml/2008/6/2/539
    
    v1's description:
    
    The code/algorithm of the implement of current callbacks-processing
    is very efficient and technical. But when I studied it and I found
    a disadvantage:
    
    In multi-CPU systems, when a new RCU callback is being
    queued(call_rcu[_bh]), this callback will be invoked after the grace
    period for the batch with batch number = rcp->cur+2 has completed
    very very likely in current implement. Actually, this callback can be
    invoked after the grace period for the batch with
    batch number = rcp->cur+1 has completed. The delay of invocation means
    that latency of synchronize_rcu() is extended. But more important thing
    is that the callbacks usually free memory, and these works are delayed
    too! it's necessary for reclaimer to free memory as soon as
    possible when left memory is few.
    
    A very simple way can solve this problem:
    a field(struct rcu_head::batch) is added to record the batch number for
    the RCU callback. And when a new RCU callback is being queued, we
    determine the batch number for this callback(head->batch = rcp->cur+1)
    and we move this callback to rdp->donelist if we find
    that head->batch <= rcp->completed when we process callbacks.
    This simple way reduces the wait time for invocation a lot. (about
    2.5Grace Period -> 1.5Grace Period in average in multi-CPU systems)
    
    This is my algorithm. But I do not add any field for struct rcu_head
    in my implement. We just need to memorize the last 2 batches and
    their batch number, because these 2 batches include all entries that
    for whom the grace period hasn't completed. So we use a special
    linked-list rather than add a field.
    Please see the comment of struct rcu_data.
    Signed-off-by: default avatarLai Jiangshan <laijs@cn.fujitsu.com>
    Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
    Cc: Dipankar Sarma <dipankar@in.ibm.com>
    Cc: Gautham Shenoy <ego@in.ibm.com>
    Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
    5127bed5
rcuclassic.c 18.3 KB