    [PATCH] dio: centralize completion in dio_complete() · 6d544bb4
    Zach Brown authored
    There have been a lot of bugs recently due to the way direct_io_worker() tries
    to decide how to finish direct IO operations.  In the worst examples it has
    failed to call aio_complete() at all (hang) or called it too many times
    (oops).
    
    This set of patches cleans up the completion phase with the goal of removing
    the complexity that led to these bugs.  We end up with one path that
    calculates the result of the operation after all of the bios have completed.
    We decide when to generate the result of the operation using that path based
    on the final release of a refcount on the dio structure.
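
    As a rough illustration of completing on the final refcount release, here
    is a minimal userspace sketch.  It is not the kernel code: struct
    dio_sketch, sketch_put() and sketch_run() are hypothetical names, and the
    scheme of one reference for the submitter plus one per in-flight bio is an
    assumption made for the example.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the dio structure; not the kernel's layout. */
struct dio_sketch {
	atomic_int refcount;	/* assumed: submitter ref + one per in-flight bio */
	int completions;	/* how many times the completion path ran */
};

/* The single path that would calculate and report the result. */
static void sketch_complete(struct dio_sketch *dio)
{
	dio->completions++;
}

/* Drop one reference; whoever drops the last one runs completion. */
static void sketch_put(struct dio_sketch *dio)
{
	/* atomic_fetch_sub() returns the value held before the decrement. */
	int old = atomic_fetch_sub(&dio->refcount, 1);

	assert(old > 0);
	if (old == 1)
		sketch_complete(dio);
}

/* Simulate nr_bios bios and return how many times completion ran. */
int sketch_run(int nr_bios)
{
	struct dio_sketch dio;
	int i;

	atomic_init(&dio.refcount, 1);	/* the submitter's reference */
	dio.completions = 0;

	for (i = 0; i < nr_bios; i++)
		atomic_fetch_add(&dio.refcount, 1);	/* bio submitted */
	for (i = 0; i < nr_bios; i++)
		sketch_put(&dio);			/* bio finished */
	sketch_put(&dio);		/* submitter drops its reference */

	return dio.completions;
}
```

    Whatever the number of bios, sketch_run() sees completion run exactly
    once, so there is no way to skip it or to run it twice.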
    
    I tried to progress towards the final state in steps that were relatively easy
    to understand.  Each step should compile but I only tested the final result of
    having all the patches applied.
    
    I've tested these on low end PC drives with aio-stress, the direct IO tests I
    could manage to get running in LTP, orasim, and some home-brew functional
    tests.
    
    In http://lkml.org/lkml/2006/9/21/103 IBM reports success with ext2 and ext3
    running DIO LTP tests.  They found an XFS bug, which has since been addressed
    in the patch series.
    
    This patch:
    
    The mechanics which decide the result of a direct IO operation were duplicated
    in the sync and async paths.
    
    The async path didn't check page_errors, which can manifest as silently
    returning success when the final pointer in an operation faults and its
    matching file region is filled with zeros.
    
    The sync path and async path differed in whether they passed errors to the
    caller's dio->end_io operation.  The async path was passing errors to it,
    which trips an assertion in XFS, though it is apparently harmless.
    
    This centralizes the completion phase of dio ops in one place.  AIO will now
    return EFAULT consistently and all paths fall back to the previously sync
    behaviour of passing the number of bytes 'transferred' to the dio->end_io
    callback, regardless of errors.
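
    The behaviour described above can be modelled in userspace roughly as
    follows.  This is a sketch under stated assumptions, not the patch itself:
    struct dio_result_sketch and the sketch_* functions are hypothetical, the
    end_io signature is simplified, and the order in which the error sources
    are consulted is an assumption of the example.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of the fields that decide the result. */
struct dio_result_sketch {
	long result;		/* bytes the operation covered */
	int page_errors;	/* e.g. -EFAULT from a faulting user pointer, else 0 */
	int io_error;		/* e.g. -EIO from a non-uptodate bio, else 0 */
	void (*end_io)(struct dio_result_sketch *dio, long bytes);
	long end_io_bytes;	/* what the callback was handed (for the demo) */
};

/* Demo callback: remember the byte count it was given. */
static void record_end_io(struct dio_result_sketch *dio, long bytes)
{
	dio->end_io_bytes = bytes;
}

/* One place that decides the result for both sync and async callers. */
long sketch_dio_complete(struct dio_result_sketch *dio, long ret)
{
	long transferred = dio->result;

	/* end_io is given the byte count, never an error code. */
	if (dio->end_io)
		dio->end_io(dio, transferred);

	/* Assumed precedence: caller's error, page error, IO error, bytes. */
	if (ret == 0)
		ret = dio->page_errors;
	if (ret == 0)
		ret = dio->io_error;
	if (ret == 0)
		ret = transferred;
	return ret;
}

/* Run one case; *seen receives the value passed to end_io. */
long sketch_case(long result, int page_errors, int io_error, long *seen)
{
	struct dio_result_sketch dio = {
		.result = result,
		.page_errors = page_errors,
		.io_error = io_error,
		.end_io = record_end_io,
		.end_io_bytes = -1,
	};
	long ret = sketch_dio_complete(&dio, 0);

	*seen = dio.end_io_bytes;
	return ret;
}
```

    In this model sketch_case(4096, -EFAULT, 0, &seen) returns -EFAULT while
    seen is still 4096: the caller gets the error and end_io gets the byte
    count.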
    
    dio_await_completion() doesn't have to propagate EIO from non-uptodate bios
    now that it's being propagated through dio_complete() via dio->io_error.  This
    lets it return void which simplifies its sole caller.
    Signed-off-by: Zach Brown <zach.brown@oracle.com>
    Cc: Badari Pulavarty <pbadari@us.ibm.com>
    Cc: Suparna Bhattacharya <suparna@in.ibm.com>
    Acked-by: Jeff Moyer <jmoyer@redhat.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
direct-io.c 35 KB