    ummunotify: Userspace support for MMU notifications · 16e36679
    Roland Dreier authored
    As discussed in <http://article.gmane.org/gmane.linux.drivers.openib/61925>
    and follow-up messages, libraries using RDMA would like to track
    precisely when application code changes memory mappings via free(),
    munmap(), etc.  Current pure-userspace solutions using malloc hooks
    and other tricks are not robust, and the feeling among experts is that
    the issue is unfixable without kernel help.
    
    We solve this not by implementing the full API proposed in the email
    linked above but rather with a simpler and more generic interface,
    which may be useful in other contexts.  Specifically, we implement a
    new character device driver, ummunotify, that creates a /dev/ummunotify
    node.  A userspace process can open this node read-only and use the fd
    as follows:
    
     1. ioctl() to register/unregister an address range to watch in the
        kernel (cf struct ummunotify_register_ioctl in <linux/ummunotify.h>).
    
     2. read() to retrieve events generated when a mapping in a watched
        address range is invalidated (cf struct ummunotify_event in
        <linux/ummunotify.h>).  select()/poll()/epoll() and SIGIO are
        handled for this IO.
    
     3. mmap() one page at offset 0 to map a kernel page that contains a
        generation counter that is incremented each time an event is
        generated.  This allows userspace to have a fast path that checks
        that no events have occurred without a system call.
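The fast path in item 3 can be sketched as below. This is only an illustration of the generation-counter pattern, not code from the patch: struct reg_cache and its helper functions are hypothetical names, and the counter is assumed to be a 64-bit value at the start of the mapped page. A real user would point the counter field at the page obtained by mmap()ing the /dev/ummunotify fd at offset 0; the sketch accepts any pointer so it can be exercised without the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace bookkeeping for a registration cache.
 * None of these names come from <linux/ummunotify.h>.
 */
struct reg_cache {
    /* Assumed: points at the generation counter in the mmap()ed kernel page. */
    const volatile uint64_t *counter;
    /* Counter value observed when the cache was last brought up to date. */
    uint64_t last_seen;
};

/*
 * Fast path: a single memory load, no system call.
 * Returns true if no event has been generated since the last sync.
 */
static bool reg_cache_is_current(const struct reg_cache *c)
{
    return *c->counter == c->last_seen;
}

/*
 * Slow path, step 1: snapshot the counter BEFORE draining events with
 * read(), so that an event arriving during the drain is not missed.
 */
static uint64_t reg_cache_begin_sync(const struct reg_cache *c)
{
    return *c->counter;
}

/*
 * Slow path, step 2: after the events have been read and the affected
 * cache entries dropped, record the snapshot taken in step 1.
 */
static void reg_cache_end_sync(struct reg_cache *c, uint64_t snapshot)
{
    c->last_seen = snapshot;
}
```

Recording the snapshot taken before the drain, rather than the counter value after it, is the conservative choice: an event that races with the drain leaves the counter ahead of last_seen, so the next fast-path check fails and triggers another sync.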
    
    Thanks to Jason Gunthorpe <jgunthorpe@obsidianresearch.com> for
    suggestions on the interface design.  Also thanks to Jeff Squyres
    <jsquyres@cisco.com> for prototyping support for this in Open MPI, which
    helped find several bugs during development.
    Signed-off-by: Roland Dreier <rolandd@cisco.com>
ummunotify.txt 6.42 KB