    RDMA/cma: Fix deadlock destroying listen requests · d02d1f53
    Sean Hefty authored
    Deadlock condition reported by Kanoj Sarcar <kanoj@netxen.com>.
    The deadlock occurs when a connection request arrives at the same
    time that a wildcard listen is being destroyed.
    
    A wildcard listen maintains a per-device listen request for each
    RDMA device in the system.  The per-device listens are automatically
    added and removed when RDMA devices are inserted into or removed
    from the system.
    
    When a wildcard listen is destroyed, rdma_destroy_id() acquires
    the rdma_cm's device mutex ('lock') to protect against hot-plug
    events adding or removing per-device listens.  It then tries to
    destroy the per-device listens by calling ib_destroy_cm_id() or
    iw_destroy_cm_id() while still holding the device mutex.
    
    However, if the underlying iw/ib CM reports a connection request
    while this is occurring, the rdma_cm callback function will try
    to acquire the same device mutex.  Because a callback is in
    progress, the ib_destroy_cm_id() or iw_destroy_cm_id() call blocks
    until the callback thread returns, but the callback is itself
    blocked waiting for the device mutex.
    
    Fix this by reworking how per-device listens are destroyed.  Use
    rdma_destroy_id(), which avoids the deadlock, in place of
    cma_destroy_listen().  Additional synchronization is added to handle
    device hot-plug events and ensure that the id is not destroyed twice.
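
The lock ordering described above can be modelled in user space. The following is a minimal sketch, not the code from cma.c: struct dev_listen, dev_lock, conn_req_callback() and the two destroy_* functions are hypothetical stand-ins. The callback runs inline on the calling thread, and pthread_mutex_trylock() stands in for a blocking lock so that the old ordering can be observed without actually hanging. An EBUSY result marks a point where a real callback would block and the destroy call would wait on it forever.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-device listen. */
struct dev_listen {
    struct dev_listen *next;
    int destroy_count;
};

/* Hypothetical stand-in for the rdma_cm device mutex ('lock'). */
static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;
static int blocked_callbacks;

/* Models a connection-request callback, which needs dev_lock. */
static void conn_req_callback(void)
{
    if (pthread_mutex_trylock(&dev_lock) == EBUSY) {
        blocked_callbacks++;    /* a real callback would block here */
        return;
    }
    pthread_mutex_unlock(&dev_lock);
}

/* Models a destroy call that waits for an in-flight callback to return. */
static void destroy_dev_listen(struct dev_listen *l)
{
    conn_req_callback();
    l->destroy_count++;
}

/* Old ordering: destroy each per-device listen while holding dev_lock. */
int destroy_holding_lock(struct dev_listen **head)
{
    blocked_callbacks = 0;
    pthread_mutex_lock(&dev_lock);
    while (*head) {
        struct dev_listen *l = *head;
        *head = l->next;
        destroy_dev_listen(l);
    }
    pthread_mutex_unlock(&dev_lock);
    return blocked_callbacks;
}

/* Reworked ordering: unlink under dev_lock, then drop it around destroy. */
int destroy_dropping_lock(struct dev_listen **head)
{
    blocked_callbacks = 0;
    pthread_mutex_lock(&dev_lock);
    while (*head) {
        struct dev_listen *l = *head;
        *head = l->next;        /* unlinked, so no other path can find it */
        l->next = NULL;
        pthread_mutex_unlock(&dev_lock);
        destroy_dev_listen(l);
        pthread_mutex_lock(&dev_lock);
    }
    pthread_mutex_unlock(&dev_lock);
    return blocked_callbacks;
}

/* Builds three listens, tears them down, returns the blocked-callback count. */
int run_demo(int drop_lock)
{
    struct dev_listen listens[3] = {0};
    struct dev_listen *head = NULL;
    int blocked;

    for (int i = 0; i < 3; i++) {
        listens[i].next = head;
        head = &listens[i];
    }
    blocked = drop_lock ? destroy_dropping_lock(&head)
                        : destroy_holding_lock(&head);
    for (int i = 0; i < 3; i++)
        assert(listens[i].destroy_count == 1);  /* never destroyed twice */
    return blocked;
}
```

In this model, run_demo(0) counts one blocked callback per listen, while run_demo(1) counts none. Unlinking each entry before the lock is released is what keeps a concurrent removal path from finding and destroying the same entry a second time.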

    Signed-off-by: Sean Hefty <sean.hefty@intel.com>
    Signed-off-by: Roland Dreier <rolandd@cisco.com>
cma.c 69.4 KB