    x86: fix cpu hotplug crash · fcb43042
    Zhang, Yanmin authored
    Vegard Nossum reported crashes during cpu hotplug tests:
    
      http://marc.info/?l=linux-kernel&m=121413950227884&w=4
    
    In _cpu_up, the panic happens on the second call to
    __raw_notifier_call_chain; the first call does not panic. So blaming
    nr_cpu_ids alone is not the right explanation.
    
    Reading the source code, I found that do_boot_cpu is the culprit.
    Consider the call chain below:
     _cpu_up=>__cpu_up=>smp_ops.cpu_up=>native_cpu_up=>do_boot_cpu.
    
    So do_boot_cpu is called at the end of the chain. In do_boot_cpu, if
    boot_error is true, cpu_clear(cpu, cpu_possible_map) is executed. Later,
    when _cpu_up calls __raw_notifier_call_chain a second time to report
    CPU_UP_CANCELED, get_cpu_sysdev returns NULL because this cpu has
    already been cleared from cpu_possible_map.
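
    The sequence above can be modelled in userspace. The sketch below is
    not the kernel code: every model_* name, the NR_CPUS value, the
    struct layout and the bitmask standing in for cpu_possible_map are
    assumptions made for illustration. It only mirrors the behaviour
    described here, namely that a failed boot clears the possible bit
    and that the device lookup then returns NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 8   /* assumed size, for illustration only */

/* Hypothetical stand-in for the per-cpu sysdev object. */
struct sys_device { int id; };

static struct sys_device cpu_devices[NR_CPUS];
static unsigned long possible_map;   /* models cpu_possible_map, one bit per cpu */

/* Reset the model: every cpu possible, one device per cpu. */
static void model_init(void)
{
    possible_map = (1UL << NR_CPUS) - 1;
    for (int i = 0; i < NR_CPUS; i++)
        cpu_devices[i].id = i;
}

/* Models get_cpu_sysdev(): NULL for a cpu that is not in the possible map. */
static struct sys_device *model_get_cpu_sysdev(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    if (!(possible_map & (1UL << cpu)))
        return NULL;
    return &cpu_devices[cpu];
}

/*
 * Models do_boot_cpu() on its error path (boot_error set). With
 * clear_possible_on_error the cpu is dropped from the possible map,
 * which is the step the commit message identifies as the culprit.
 */
static int model_do_boot_cpu(unsigned int cpu, bool clear_possible_on_error)
{
    assert(cpu < NR_CPUS);
    if (clear_possible_on_error)
        possible_map &= ~(1UL << cpu);
    return -1;   /* the boot attempt always fails in this model */
}

/*
 * Models the failure path of _cpu_up(). Returns true when the second
 * notifier call (CPU_UP_CANCELED) still finds a device, false when the
 * lookup yields NULL and a real notifier would dereference it.
 */
static bool model_cpu_up_cancel_is_safe(unsigned int cpu,
                                        bool clear_possible_on_error)
{
    model_init();

    /* First notifier call: the cpu is still in the possible map. */
    if (model_get_cpu_sysdev(cpu) == NULL)
        return false;

    if (model_do_boot_cpu(cpu, clear_possible_on_error) == 0)
        return true;

    /* Second notifier call, reporting CPU_UP_CANCELED. */
    return model_get_cpu_sysdev(cpu) != NULL;
}
```

    In this model, model_cpu_up_cancel_is_safe(1, true) returns false
    (the second lookup gets NULL), while
    model_cpu_up_cancel_is_safe(1, false) returns true, which corresponds
    to leaving cpu_possible_map untouched on the error path.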
    
    Many resources are related to cpu_possible_map, so it's better not to
    change it.
    
    The patch below, against 2.6.26-rc7, fixes it by removing the bit clearing
    in cpu_possible_map.
    Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
    Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
    Acked-by: Rusty Russell <rusty@rustcorp.com.au>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
smpboot.c 35.8 KB