    Intel reported a performance regression caused by the following commit: · b7f890ec
    Jeff Moyer authored
    commit 848c4dd5
    Author: Zach Brown <zach.brown@oracle.com>
    Date:   Mon Aug 20 17:12:01 2007 -0700
    
        dio: zero struct dio with kzalloc instead of manually
    
        This patch uses kzalloc to zero all of struct dio rather than
        manually trying to track which fields we rely on being zero.  It
        passed aio+dio stress testing and some bug regression testing on
        ext3.
    
        This patch was introduced by Linus in the conversation that led up
        to Badari's minimal fix to manually zero .map_bh.b_state in commit:
    
          6a648fa7
    
        It makes the code a bit smaller.  Maybe a couple fewer cachelines to
        load, if we're lucky:
    
           text    data     bss     dec     hex filename
        3285925  568506 1304616 5159047  4eb887 vmlinux
        3285797  568506 1304616 5158919  4eb807 vmlinux.patched
    
        I was unable to measure a stable difference in the number of cpu
        cycles spent in blockdev_direct_IO() when pushing aio+dio 256K reads
        at ~340MB/s.
    
        So the resulting intent of the patch isn't a performance gain but to
        avoid exposing ourselves to the risk of finding another field like
        .map_bh.b_state where we rely on zeroing but don't enforce it in the
        code.
    
    Zach surmised that zeroing out the page array was what caused most of
    the problem, and suggested the approach taken in the attached patch for
    resolving the issue.  Intel re-tested with this patch and saw a 0.6%
    performance gain (the original regression was 0.5%).
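
As an illustration of why skipping the page array matters, here is a minimal userspace sketch. It assumes the approach is to keep the large array as the last member and zero only the fields that precede it; the commit text above does not spell out the patch, so treat this as one plausible reading. The names `demo_dio`, `DEMO_MAX_PAGES`, `demo_dio_alloc` and `demo_dio_zeroed_bytes` are hypothetical and are not the kernel's definitions.

```c
#include <stddef.h>   /* offsetof, size_t */
#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* memset */

#define DEMO_MAX_PAGES 64   /* hypothetical array size, not the kernel's value */

/* Hypothetical stand-in for struct dio; this is not the kernel's layout. */
struct demo_dio {
    int flags;              /* small fields that must start out as zero */
    unsigned long head;     /* index of the next valid entry in pages[] */
    unsigned long tail;     /* index one past the last valid entry */
    /* Large array kept last so everything before it can be zeroed in one call. */
    void *pages[DEMO_MAX_PAGES];
};

/* Allocate without zeroing, then clear only the fields before pages[]. */
static struct demo_dio *demo_dio_alloc(void)
{
    struct demo_dio *dio = malloc(sizeof(*dio));

    if (!dio)
        return NULL;
    memset(dio, 0, offsetof(struct demo_dio, pages));
    return dio;
}

/* Number of bytes the partial memset clears. */
static size_t demo_dio_zeroed_bytes(void)
{
    return offsetof(struct demo_dio, pages);
}
```

In this sketch `pages[]` is left uninitialized, which is only safe if callers read entries strictly between `head` and `tail`. A caller would use `demo_dio_alloc()` where it previously used a zeroing allocator and release the result with `free()`.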
    Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
    Acked-by: Zach Brown <zach.brown@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
direct-io.c 34.6 KB