    xen: fix hang on suspend. · c5cae661
    Ian Campbell authored
    In 65f63384 "xen: improve error handling in do_suspend" I said:
        - xs_suspend()/xs_resume() and dpm_suspend_noirq()/dpm_resume_noirq() were not
          nested in the obvious way.
    and changed the ordering of the calls as follows:
        BEFORE                  AFTER
        xs_suspend              dpm_suspend_noirq
        dpm_suspend_noirq       xs_suspend
        *SUSPEND*               *SUSPEND*
        dpm_resume_noirq        dpm_resume_noirq
        xs_resume               xs_resume
    Clearly this is not an improvement and I was talking rubbish.
    
    In particular the new ordering is susceptible to a hang if a xenstore write is
    in progress at the point at which the suspend kicks in. When the suspend
    process calls xs_suspend it tries to take the request_mutex, but if a write is
    in progress the writer could be looping in xenbus_xs.c:read_reply() waiting for
    something to arrive on &xs_state.reply_list while holding the request_mutex
    (taken in the caller of read_reply).
    
    However if we have done dpm_suspend_noirq before xs_suspend then we won't get
    any more xenstore interrupts and process_msg() will never be woken up to add
    anything to the reply_list.
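
    The hang condition described above can be captured in a toy model. This is
    only an illustrative userspace sketch, not kernel code: struct xs_model and
    xs_suspend_can_proceed() are hypothetical names, and the two flags stand in
    for "a writer holds request_mutex inside read_reply()" and "xenstore
    interrupts can still be delivered".

```c
#include <stdbool.h>

/* Hypothetical model of the state that matters for the hang. */
struct xs_model {
    bool request_mutex_held; /* a writer is waiting in read_reply() */
    bool irqs_enabled;       /* false once dpm_suspend_noirq has run */
};

/*
 * Returns true if a caller trying to take request_mutex (as xs_suspend
 * does) can eventually make progress in this model.
 */
static bool xs_suspend_can_proceed(const struct xs_model *m)
{
    /* Nobody holds the mutex: xs_suspend takes it immediately. */
    if (!m->request_mutex_held)
        return true;

    /*
     * The writer drops the mutex only after a reply lands on the reply
     * list, and that needs a xenstore interrupt to wake process_msg().
     */
    return m->irqs_enabled;
}
```

    In this model the only state that cannot make progress is "mutex held and
    interrupts off", which is the state the AFTER ordering can reach.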
    
    Fix this by calling xs_suspend before dpm_suspend_noirq. If dpm_suspend_noirq
    fails then make sure we go through the xs_suspend_cancel() code path.
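
    The restored ordering and its error path can be sketched as below. This is
    a userspace illustration under stated assumptions, not the actual manage.c
    change: the functions are stubs that only record the order in which they
    are called, their signatures are simplified, and do_suspend_sketch(),
    record(), trace and noirq_should_fail are hypothetical names.

```c
#include <string.h>

/* Call trace filled in by the stubs below (illustration only). */
static char trace[256];
/* Set to non-zero to make the dpm_suspend_noirq stub fail. */
static int noirq_should_fail;

static void record(const char *name)
{
    strcat(trace, name);
    strcat(trace, " ");
}

/* Stubs standing in for the kernel functions named in the text. */
static void xs_suspend(void)        { record("xs_suspend"); }
static void xs_resume(void)         { record("xs_resume"); }
static void xs_suspend_cancel(void) { record("xs_suspend_cancel"); }
static void dpm_resume_noirq(void)  { record("dpm_resume_noirq"); }
static void hypervisor_suspend(void) { record("*SUSPEND*"); }

static int dpm_suspend_noirq(void)
{
    record("dpm_suspend_noirq");
    return noirq_should_fail ? -1 : 0;
}

/* xs_suspend first, while xenstore interrupts can still arrive. */
static int do_suspend_sketch(void)
{
    int err;

    xs_suspend();

    err = dpm_suspend_noirq();
    if (err) {
        /* Undo xs_suspend if the noirq phase fails. */
        xs_suspend_cancel();
        return err;
    }

    hypervisor_suspend();
    dpm_resume_noirq();
    xs_resume();
    return 0;
}
```

    On success the recorded order matches the BEFORE column; when the
    dpm_suspend_noirq stub fails, the trace ends with xs_suspend_cancel.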

    Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
    Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
    Cc: Stable Kernel <stable@kernel.org>
manage.c 5.57 KB