x86: mce: Fix thermal throttling message storm
commit b417c9fd upstream. If a system switches back and forth between hot and cold mode, the MCE code will print a stream of critical kernel messages. Extend the throttling code to properly notice this, by only printing the first hot + cold transition and omitting the rest up to CHECK_INTERVAL (5 minutes). This way we'll only get a single incident of: [ 102.356584] CPU0: Temperature above threshold, cpu clock throttled (total events = 1) [ 102.357000] Disabling lock debugging due to kernel taint [ 102.369223] CPU0: Temperature/speed normal Every 5 minutes. The 'total events' count tells the number of cold/hot transitions detected, should overheating occur after 5 minutes again: [ 402.357580] CPU0: Temperature above threshold, cpu clock throttled (total events = 24891) [ 402.358001] CPU0: Temperature/speed normal [ 450.704142] Machine check events logged Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Showing
Please register or sign in to comment