    x86-64: support native xadd rwsem implementation · bafaecd1
    Linus Torvalds authored
    This one is much faster than the spinlock based fallback rwsem code,
    with certain artificial benchmarks having shown 300%+ improvement on
    threaded page faults etc.
    
    Again, note the 32767-thread limit here. So this really does need that
    whole "make rwsem_count_t be 64-bit and fix the BIAS values to match"
    extension on top of it, but that is conceptually a totally independent
    issue.
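
To make the limit above concrete, here is a minimal user-space sketch of an xadd-style fast path. It is not the code from this patch: the `sketch_*` names are hypothetical, C11 atomics stand in for the `lock xadd` inline asm, and the BIAS values assume the 32-bit layout in which the low 16 bits count active holders and the signed high 16 bits track the writer/waiter side. Under that assumption each half is a 16-bit field, which is where a 32767 ceiling would come from.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed 32-bit count layout (not taken from this patch):
 * low 16 bits  = number of active holders,
 * high 16 bits = negative while a writer holds the lock or tasks wait. */
#define RWSEM_ACTIVE_BIAS        0x00000001
#define RWSEM_ACTIVE_MASK        0x0000ffff
#define RWSEM_WAITING_BIAS       (-0x00010000)
#define RWSEM_ACTIVE_READ_BIAS   RWSEM_ACTIVE_BIAS
#define RWSEM_ACTIVE_WRITE_BIAS  (RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)

/* A signed 16-bit field tops out at 32767. */
static_assert(INT16_MAX == 32767, "16-bit signed field limit");

/* Hypothetical stand-in for the kernel's rw_semaphore count. */
typedef struct { _Atomic int32_t count; } sketch_rwsem;

/* Reader fast path: one atomic add. It succeeds if the new count is not
 * negative; on failure the real code would branch to the slow path. */
static bool sketch_down_read_fast(sketch_rwsem *sem)
{
    int32_t old = atomic_fetch_add(&sem->count, RWSEM_ACTIVE_READ_BIAS);
    int32_t new_count = (int32_t)((uint32_t)old + (uint32_t)RWSEM_ACTIVE_READ_BIAS);
    return new_count >= 0;
}

/* Writer fast path: one atomic add of the write bias. It succeeds only
 * if the lock was completely idle (old count was zero). */
static bool sketch_down_write_fast(sketch_rwsem *sem)
{
    int32_t old = atomic_fetch_add(&sem->count, RWSEM_ACTIVE_WRITE_BIAS);
    return old == 0;
}

/* Number of active holders, read from the low 16 bits. */
static int32_t sketch_active_count(sketch_rwsem *sem)
{
    return atomic_load(&sem->count) & RWSEM_ACTIVE_MASK;
}
```

Widening the count to 64 bits and moving the BIAS values up accordingly would widen both fields, which is the independent extension described above.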
    
    NOT TESTED! The original patches that this was all based on were tested
    by KAMEZAWA Hiroyuki, but maybe I screwed up something when I created
    the cleaned-up series, so caveat emptor..
    
    Also note that it _may_ be a good idea to mark some more registers
    clobbered on x86-64 in the inline asms instead of saving/restoring them.
    They are inline functions, but they are only used in places where there
    are not a lot of live registers _anyway_, so doing for example the
    clobbers of %r8-%r11 in the asm wouldn't make the fast-path code any
    worse, and would make the slow-path code smaller.
    
    (Not that the slow-path really matters to that degree. Saving a few
    unnecessary registers is the _least_ of our problems when we hit the slow
    path. The instruction/cycle counting really only matters in the fast
    path).
    
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    LKML-Reference: <alpine.LFD.2.00.1001121810410.17145@localhost.localdomain>
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
rwsem_64.S 1.74 KB