Linus Torvalds authored
When I rewrote tty ldisc code to use proper reference counts (commits
65b77046 and cbe9352f) in order to avoid a race with hangup, the
test-program that Eric Biederman used to trigger the original problem
seems to have exposed another long-standing bug: the hangup code did the
'tty_ldisc_halt()' to stop any buffer flushing activity, but unlike the
other call sites it never actually flushed any pending work.

As a result, if you get just the right timing, the pending work may be
just about to execute (ie the timer has already triggered and thus
cancel_delayed_work() was a no-op), when we then re-initialize the ldisc
from under it.

That, in turn, results in various random problems, usually seen as a
NULL pointer dereference in run_timer_softirq() or a BUG() in
worker_thread (but it can be almost anything).

Fix it by adding the required 'flush_scheduled_work()' after doing the
tty_ldisc_halt() (this also requires us to move the ldisc halt to before
taking the ldisc mutex in order to avoid a deadlock with the workqueue
executing do_tty_hangup, which requires the mutex).

The locking should be cleaned up one day (the requirement to do this
outside the ldisc_mutex is very annoying, and weakens the lock), but
that's a larger and separate undertaking.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Tested-by: Xiaotian Feng <xtfeng@gmail.com>
Tested-by: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Tested-by: Dave Young <hidave.darkstar@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
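
The ordering described above (halt and flush first, take the mutex and
re-initialize second) can be sketched in userspace with pthreads. This is
only an illustration under loud assumptions, not the kernel code: the
names fake_ldisc, workqueue_thread, hangup and run_hangup_demo are
hypothetical, one thread stands in for the shared workqueue, and
pthread_join() stands in for 'flush_scheduled_work()'.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct fake_ldisc { int generation; };

static pthread_mutex_t ldisc_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct fake_ldisc *ldisc;
static pthread_t worker;
static int seen_generation;

/* Stand-in for the workqueue: work that has already started, so a
 * "cancel" would be a no-op. It uses the old ldisc, and then runs
 * another item that needs the mutex. */
static void *workqueue_thread(void *arg)
{
    struct fake_ldisc *ld = arg;          /* pointer taken before hangup */

    seen_generation = ld->generation;     /* must not be freed under us */
    pthread_mutex_lock(&ldisc_mutex);     /* queued work needing the mutex */
    pthread_mutex_unlock(&ldisc_mutex);
    return NULL;
}

/* Hangup path: flush outside the mutex, then re-initialize under it. */
static int hangup(void)
{
    struct fake_ldisc *fresh;

    /* "Flush": wait for in-flight work. Doing this while holding
     * ldisc_mutex would deadlock, since the worker takes it too. */
    if (pthread_join(worker, NULL) != 0)
        return -1;

    pthread_mutex_lock(&ldisc_mutex);
    fresh = calloc(1, sizeof(*fresh));
    if (fresh) {
        fresh->generation = ldisc->generation + 1;
        free(ldisc);                      /* safe: no work can still use it */
        ldisc = fresh;
    }
    pthread_mutex_unlock(&ldisc_mutex);
    return fresh ? 0 : -1;
}

/* Returns 0 if the work saw the old ldisc and the re-init succeeded. */
int run_hangup_demo(void)
{
    int ok;

    seen_generation = -1;
    ldisc = calloc(1, sizeof(*ldisc));
    if (!ldisc)
        return -1;
    ldisc->generation = 1;

    if (pthread_create(&worker, NULL, workqueue_thread, ldisc) != 0) {
        free(ldisc);
        ldisc = NULL;
        return -1;
    }

    ok = hangup() == 0 && seen_generation == 1 && ldisc->generation == 2;
    free(ldisc);
    ldisc = NULL;
    return ok ? 0 : -1;
}
```

Calling run_hangup_demo() (built with -pthread) should return 0. Moving
the pthread_join() call inside the locked region of hangup() is the
userspace analogue of the deadlock mentioned above, and dropping it
entirely is the analogue of the original bug.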
5c58ceff