    Fix inotify watch removal/umount races · 8f7b0ba1
    Al Viro authored
    Inotify watch removals suck violently.
    
    To kick the watch out we need (in this order) inode->inotify_mutex and
    ih->mutex.  That's fine if we have a hold on inode; however, for all
    other cases we need to make damn sure we don't race with umount.  We can
    *NOT* just grab a reference to a watch - inotify_unmount_inodes() will
    happily sail past it and we'll end up with a reference to an inode
    potentially outliving its superblock.
    
    Ideally we just want to grab an active reference to superblock if we
    can; that will make sure we won't go into inotify_unmount_inodes() until
    we are done.  Cleanup is just deactivate_super().
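
    A rough userspace sketch of the "grab an active reference if we can"
    idea, for illustration only.  struct fake_sb, try_grab_active() and
    drop_active() are hypothetical names, not kernel interfaces, and the
    single counter here ignores the ->s_count / S_BIAS bookkeeping the real
    superblock does.  The only point is that the increment must fail once
    the count has reached zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a superblock; models only ->s_active. */
struct fake_sb {
    atomic_int s_active;
};

/*
 * Take an active reference only if the count has not already dropped
 * to zero.  Returns true on success; the caller then owns one reference.
 */
static bool try_grab_active(struct fake_sb *sb)
{
    int old = atomic_load(&sb->s_active);

    while (old > 0) {
        /* On failure 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&sb->s_active, &old, old + 1))
            return true;
    }
    return false;   /* past the point of no return */
}

/*
 * Models the cleanup side (deactivate_super() in the text above).
 * Returns true if this call dropped the last active reference.
 */
static bool drop_active(struct fake_sb *sb)
{
    return atomic_fetch_sub(&sb->s_active, 1) == 1;
}
```

    If try_grab_active() fails, we are in the messy case described next.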
    
    However, that leaves a messy case - what if we *are* racing with
    umount() and active references to superblock can't be acquired anymore?
    We can bump ->s_count, grab ->s_umount, which will almost certainly wait
    until the superblock is shut down and the watch in question is pining
    for fjords.  That's fine, but there is a problem - we might have hit the
    window between ->s_active getting to 0 / ->s_count dropping below
    S_BIAS (i.e. the moment when the superblock is past the point of no
    return and is heading for shutdown) and the moment when
    deactivate_super() acquires
    ->s_umount.
    
    We could just do drop_super(), yield() and retry, but that's rather
    antisocial and this stuff is luser-triggerable.  OTOH, having grabbed
    ->s_umount and having found that we'd got there first (i.e.  that
    ->s_root is non-NULL) we know that we won't race with
    inotify_unmount_inodes().
    
    So we could grab a reference to watch and do the rest as above, just
    with drop_super() instead of deactivate_super(), right? Wrong.  We had
    to drop ih->mutex before we could grab ->s_umount.  So the watch
    could've been gone already.
    
    That still can be dealt with - we need to save watch->wd, do idr_find()
    and compare its result with our pointer.  If they match, we either have
    the damn thing still alive or we'd lost not one but two races at once,
    the watch had been killed and a new one got created with the same ->wd
    at the same address.  That couldn't have happened in inotify_destroy(),
    but inotify_rm_wd() could run into that.  Still, "new one got created"
    is not a problem - we have every right to kill it or leave it alone,
    whatever's more convenient.
    
    So we can use idr_find(...) == watch && watch->inode->i_sb == sb as
    "grab it and kill it" check.  If it's been our original watch, we are
    fine, if it's a newcomer - nevermind, just pretend that we'd won the
    race and kill the fscker anyway; we are safe since we know that its
    superblock won't be going away.
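
    The check above, modelled in plain userspace C as a sketch.  All of the
    fake_* types, fake_idr_find() and watch_still_ours() are made up for
    illustration; a fixed array stands in for the handle's idr, and no
    locking is shown.  Note that the watch pointer is only compared, never
    dereferenced, until the lookup has confirmed it is still registered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WD 8

/* Hypothetical stand-ins for the kernel structures named in the text. */
struct fake_sb     { int id; };
struct fake_inode  { struct fake_sb *i_sb; };
struct fake_watch  { int wd; struct fake_inode *inode; };

/* Stand-in for the handle's idr: wd -> watch pointer, NULL if unused. */
struct fake_handle { struct fake_watch *slots[MAX_WD]; };

static struct fake_watch *fake_idr_find(const struct fake_handle *ih, int wd)
{
    if (wd < 0 || wd >= MAX_WD)
        return NULL;
    return ih->slots[wd];
}

/*
 * "Grab it and kill it" check: the wd saved before dropping the lock
 * must still map to the same address, and that watch must sit on the
 * superblock we are holding.
 */
static bool watch_still_ours(const struct fake_handle *ih,
                             const struct fake_watch *watch,
                             int saved_wd,
                             const struct fake_sb *sb)
{
    const struct fake_watch *found = fake_idr_find(ih, saved_wd);

    /* Dereference only what the table handed back, never a stale pointer. */
    return found != NULL && found == watch && found->inode->i_sb == sb;
}
```

    A newcomer that landed on the same ->wd and the same address passes
    this check just like the original would, which is the case the text
    declares harmless.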
    
    And yes, this is far beyond mere "not very pretty"; so's the entire
    concept of inotify to start with.
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    Acked-by: Greg KH <greg@kroah.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
audit_tree.c 20.9 KB