    fsnotify: unified filesystem notification backend · 90586523
    Eric Paris authored
    fsnotify is a backend for filesystem notification.  fsnotify does
    not provide any userspace interface but does provide the basis
    needed for other notification schemes such as dnotify.  fsnotify
    can be extended to be the backend for inotify or the upcoming
    fanotify.  fsnotify provides a mechanism for "groups" to register for
    some set of filesystem events and to then deliver those events to
    those groups for processing.
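
    The register-then-deliver idea can be sketched in userspace C.  This is
    only an illustration, not the kernel implementation: the EV_* bits,
    struct group, register_group(), notify() and count_event() are all
    hypothetical names, and the fixed-size table stands in for whatever list
    the real code keeps.

```c
#include <stdint.h>

/* Hypothetical event bits, for illustration only. */
#define EV_MODIFY 0x1u
#define EV_CREATE 0x2u
#define EV_DELETE 0x4u

/* A "group" is a listener: the events it wants and how it handles them. */
struct group {
	uint32_t mask;		/* events this group registered for */
	void (*handle_event)(struct group *g, uint32_t event);
	unsigned int delivered;	/* events seen so far, for the demo */
};

#define MAX_GROUPS 8
static struct group *groups[MAX_GROUPS];
static unsigned int nr_groups;

/* Register a group; returns 0 on success, -1 if the table is full. */
static int register_group(struct group *g)
{
	if (nr_groups == MAX_GROUPS)
		return -1;
	groups[nr_groups++] = g;
	return 0;
}

/* Deliver an event to every group whose mask includes it.
 * Returns the number of groups that received the event. */
static unsigned int notify(uint32_t event)
{
	unsigned int i, hits = 0;

	for (i = 0; i < nr_groups; i++) {
		if (groups[i]->mask & event) {
			groups[i]->handle_event(groups[i], event);
			hits++;
		}
	}
	return hits;
}

/* Example handler: just count what was delivered. */
static void count_event(struct group *g, uint32_t event)
{
	(void)event;
	g->delivered++;
}
```

    A caller would fill in a struct group with the mask it cares about and a
    handler, pass it to register_group(), and then each notify() call reaches
    only the groups whose mask matches the event.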
    
    fsnotify has a number of benefits, the first being that it actually shrinks
    the size of an inode.  Before fsnotify, to support both dnotify and inotify,
    an inode had
    
            unsigned long           i_dnotify_mask; /* Directory notify events */
            struct dnotify_struct   *i_dnotify; /* for directory notifications */
            struct list_head        inotify_watches; /* watches on this inode */
            struct mutex            inotify_mutex;  /* protects the watches list */
    
    But with fsnotify this same functionality (and more) is done with just
    
            __u32                   i_fsnotify_mask; /* all events for this inode */
            struct hlist_head       i_fsnotify_mark_entries; /* marks on this inode */
    
    That's right, inotify, dnotify, and fanotify all in 64 bits.  We used that
    much space just in inotify_watches alone, before this patch set.
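
    The comparison can be checked with userspace stand-ins for the two sets
    of fields.  This is a rough sketch under stated assumptions: struct
    old_fields, struct new_fields, struct mutex_stub and bytes_saved() are
    hypothetical names, the list types only mirror the layout of the kernel
    ones, and the mutex placeholder is a single pointer, so the saving it
    reports is an underestimate.  The 64-bit figure corresponds to 4-byte
    pointers; with 8-byte pointers the new fields are wider, but still no
    larger than the old list_head on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel list types (layout only). */
struct list_head  { struct list_head *next, *prev; };
struct hlist_node;
struct hlist_head { struct hlist_node *first; };

/* Placeholder (assumption): the real struct mutex is larger and
 * depends on the kernel configuration. */
struct mutex_stub { void *opaque; };

/* Per-inode notification fields before fsnotify. */
struct old_fields {
	unsigned long      i_dnotify_mask;
	void              *i_dnotify;	/* struct dnotify_struct * */
	struct list_head   inotify_watches;
	struct mutex_stub  inotify_mutex;
};

/* Per-inode notification fields with fsnotify. */
struct new_fields {
	uint32_t           i_fsnotify_mask;
	struct hlist_head  i_fsnotify_mark_entries;
};

/* Bytes saved per inode under the assumptions above. */
static size_t bytes_saved(void)
{
	return sizeof(struct old_fields) - sizeof(struct new_fields);
}
```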
    
    fsnotify object lifetime and locking are MUCH better than what we have today.
    inotify locking is incredibly complex.  See 8f7b0ba1 as an example of
    what's been busted since inception.  inotify needs to know internal semantics
    of superblock destruction and unmounting to function.  The inode pinning and
    vfs contortions are horrible.
    
    No fsnotify implementer does allocation under locks.  This means things like
    f04b30de, which (due to an overabundance of caution) changes GFP_KERNEL to
    GFP_NOFS, can be reverted.  There are no longer any allocation rules when using
    or implementing your own fsnotify listener.
    
    fsnotify paves the way for fanotify.  In brief, fanotify is a notification
    mechanism that delivers to the listener both an 'event' and an open file
    descriptor to the object in question.  This means that fanotify is pathname
    agnostic.
    Some on lkml may not care for the original companies or users that pushed for
    TALPA, but fanotify was designed with flexibility and input for other users in
    mind.  The readahead group expressed interest in fanotify as it could be used
    to profile disk access on boot without breaking the audit system.  The desktop
    search groups have also expressed interest in fanotify as it solves a number
    of the race conditions and problems present with managing inotify when more
    than a limited number of specific files are of interest.  fanotify can provide
    for a userspace access control system which makes it a clean interface for AV
    vendors to hook without trying to do binary patching on the syscall table,
    LSM, and everywhere else they do their things today.  With this patch series,
    fanotify can be implemented in less than 1200 lines of easy-to-review code,
    almost all of which is the socket-based user interface.
    
    This patch series builds fsnotify to the point that it can implement
    dnotify and inotify_user.  Patches exist and will be sent soon after
    acceptance to finish the in-kernel inotify conversion (audit) and to
    implement fanotify.
    Signed-off-by: Eric Paris <eparis@redhat.com>
    Acked-by: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Christoph Hellwig <hch@lst.de>