- 12 Aug, 2008 14 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linusLinus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: fix spinlock recursion in hvc_console stop_machine: remove unused variable modules: extend initcall_debug functionality to the module loader export virtio_rng.h lguest: use get_user_pages_fast() instead of get_user_pages() mm: Make generic weak get_user_pages_fast and EXPORT_GPL it lguest: don't set MAC address for guest unless specified
-
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6Linus Torvalds authored
* 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6: agp: fix SIS 5591/5592 wrong PCI id intel/agp: rewrite GTT on resume agp: use dev_printk when possible amd64-agp: run fallback when no bridges found, not when driver registration fails intel_agp: official name for GM45 chipset
-
Christian Borntraeger authored
commit 611e097d Author: Christian Borntraeger <borntraeger@de.ibm.com> hvc_console: rework setup to replace irq functions with callbacks introduced a spinlock recursion problem. request_irq tries to call the handler if the IRQ is shared. The irq handler of hvc_console calls hvc_poll and hvc_kill which might take the hvc_struct spinlock. Therefore, we have to call request_irq outside the spinlock. We can move the notifier_add safely outside the spinlock as ->data must not be changed by the backend. Otherwise, tty_hangup would fail anyway. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Li Zefan authored
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Arjan van de Ven authored
The kernel has this really nice facility where if you put "initcall_debug" on the kernel commandline, it'll print which function it's going to execute just before calling an initcall, and then after the call completes it will 1) print if it had an error code 2) checks for a few simple bugs (like leaving irqs off) and 3) print how long the init call took in milliseconds. While trying to optimize the boot speed of my laptop, I have been loving number 3 to figure out what to optimize... ... and then I wished that the same thing was done for module loading. This patch makes the module loader use this exact same functionality; it's a logical extension in my view (since modules are just sort of late binding initcalls anyway) and so far I've found it quite useful in finding where things are too slow in my boot. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Christian Borntraeger authored
Hello Rusty, The entropy device was added after we exported all virtio headers. This patch adds virtio_rng.h to the exportable userspace headers. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Rusty Russell authored
Using a simple page table thrashing program I measure a slight improvement. The program creates five processes. Each touches 1000 pages then schedules the next process. We repeat this 1000 times. As lguest only caches 4 cr3 values, this rebuilds a lot of shadow page tables requiring virt->phys mappings. Before: 5.93 seconds After: 5.40 seconds (Counts of slow vs fastpath in this usage are 6092 and 2852462 respectively.) And more importantly for lguest, the code is simpler. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Rusty Russell authored
Out of line get_user_pages_fast fallback implementation, make it a weak symbol, get rid of CONFIG_HAVE_GET_USER_PAGES_FAST. Export the symbol to modules so lguest can use it. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
Rusty Russell authored
This shows up when trying to bridge: tap0: received packet with own address as source address As Max Krasnyansky points out, there's no reason to give the guest the same mac address as the TUN device. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Max Krasnyansky <maxk@qualcomm.com>
-
Krzysztof Helt authored
The correct id is the id of the main host (5591) not the id of the PCI-to-PCI bridge AGP (0001). Output from "lspci -nv" shows that only the former has AGP capabilities flag set: 00:00.0 0600: 1039:5591 (rev 02) Flags: bus master, medium devsel, latency 64 Memory at ec000000 (32-bit, non-prefetchable) [size=32M] Capabilities: [c0] AGP version 1.0 00:02.0 0604: 1039:0001 (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0 Bus: primary=00, secondary=01, subordinate=01, sec-latency=0 I/O behind bridge: 0000c000-0000cfff Memory behind bridge: eb500000-eb5fffff Prefetchable memory behind bridge: eb300000-eb3fffff Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Dave Airlie <airlied@redhat.com>
-
Keith Packard authored
On my Intel chipset (965GM), the GTT is entirely erased across suspend/resume. This patch simply re-plays the current mapping at resume time to restore the table.=20 I noticed this once I started relying on persistent GTT mappings across VT switch in our GEM work -- the old X server and DRM code carefully unbind all memory from the GTT on VT switch, but GEM does not bother. I placed the list management and rewrite code in the generic layer on the assumption that it will be needed on other hardware, but I did not add the rewrite call to anything other than the Intel resume function. Keep a list of current GATT mappings. At resume time, rewrite them into the GATT. This is needed on Intel (at least) as the entire GATT is cleared across suspend/resume. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Keith Packard <keithp@keithp.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
-
Bjorn Helgaas authored
Convert printks to use dev_printk(). Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
-
Bjorn Helgaas authored
I think the intent was that if no bridges matched agp_amd64_pci_table[], we would fall back to checking for any bridge with the AGP capability. But in the current code, we execute the fallback path only when pci_register_driver() itself fails, which is unrelated to whether any matching devices were found. This patch counts the AGP bridges found in the probe() method and executes the fallback path when none is found. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
-
Zhenyu Wang authored
Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
-
- 11 Aug, 2008 26 commits
-
-
Linus Torvalds authored
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched, cpu hotplug: fix set_cpus_allowed() use in hotplug callbacks sched: fix mysql+oltp regression sched_clock: delay using sched_clock() sched clock: couple local and remote clocks sched clock: simplify __update_sched_clock() sched: eliminate scd->prev_raw sched clock: clean up sched_clock_cpu() sched clock: revert various sched_clock() changes sched: move sched_clock before first use sched: test runtime rather than period in global_rt_runtime() sched: fix SCHED_HRTICK dependency sched: fix warning in hrtick_start_fair()
-
Linus Torvalds authored
Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: posix-timers: fix posix_timer_event() vs dequeue_signal() race posix-timers: do_schedule_next_timer: fix the setting of ->si_overrun
-
Linus Torvalds authored
Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: lockdep: fix debug_lock_alloc lockdep: increase MAX_LOCKDEP_KEYS generic-ipi: fix stack and rcu interaction bug in smp_call_function_mask() lockdep: fix overflow in the hlock shrinkage code lockdep: rename map_[acquire|release]() => lock_map_[acquire|release]() lockdep: handle chains involving classes defined in modules mm: fix mm_take_all_locks() locking order lockdep: annotate mm_take_all_locks() lockdep: spin_lock_nest_lock() lockdep: lock protection locks lockdep: map_acquire lockdep: shrink held_lock structure lockdep: re-annotate scheduler runqueues lockdep: lock_set_subclass - reset a held lock's subclass lockdep: change scheduler annotation debug_locks: set oops_in_progress if we will log messages. lockdep: fix combinatorial explosion in lock subgraph traversal
-
Linus Torvalds authored
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: fix 2.6.27rc1 cannot boot more than 8CPUs x86: make "apic" an early_param() on 32-bit, NULL check EFI, x86: fix function prototype x86, pci-calgary: fix function declaration x86: work around gcc 3.4.x bug x86: make "apic" an early_param() on 32-bit x86, debug: tone down arch/x86/kernel/mpparse.c debugging printk x86_64: restore the proper NR_IRQS define so larger systems work. x86: Restore proper vector locking during cpu hotplug x86: Fix broken VMI in 2.6.27-rc.. x86: fdiv bug detection fix
-
Ingo Molnar authored
-
Ingo Molnar authored
-
Peter Zijlstra authored
When we enable DEBUG_LOCK_ALLOC but do not enable PROVE_LOCKING and or LOCK_STAT, lock_alloc() and lock_release() turn into nops, even though we should be doing hlock checking (check=1). This causes a false warning and a lockdep self-disable. Rectify this. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Yinghai Lu authored
Jeff Chua reported that booting a !bigsmp kernel on a 16-way box hangs silently. this is a long-standing issue, smp start AP cpu could check the apic id >=8 etc before trying to start it. achieve this by moving the def_to_bigsmp check later and skip the apicid id > 8 [ mingo@elte.hu: clean up the message that is printed. ] Reported-by: "Jeff Chua" <jeff.chua.linux@gmail.com> Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> arch/x86/kernel/setup.c | 6 ------ arch/x86/kernel/smpboot.c | 10 ++++++++++ 2 files changed, 10 insertions(+), 6 deletions(-)
-
Adrian Bunk authored
This patch makes several needlessly global struct scsi_dh_devlist's static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.o-hand.com/linux-mfdLinus Torvalds authored
* 'for-linus' of git://git.o-hand.com/linux-mfd: mfd: tc6393 cleanup and update mfd: have TMIO drivers and subdevices depend on ARM mfd: TMIO MMC driver mfd: driver for the TMIO NAND controller mfd: t7l66 MMC platform data mfd: tc6387 MMC platform data mfd: Fix 7l66 and 6387 according to the new mfd-core API mfd: Fix tc6393 according to the new tmio.h mfd: driver for the TC6387XB TMIO controller. mfd: driver for the T7L66XB TMIO SoC mfd: TMIO MMC structures and accessors.
-
git://jdelvare.pck.nerim.net/jdelvare-2.6Linus Torvalds authored
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: hwmon: (lm75) Drop legacy i2c driver i2c: correct some size_t printk formats i2c: Check for address business before creating clients i2c: Let users select algorithm drivers manually again i2c: Fix NULL pointer dereference in i2c_new_probed_device i2c: Fix oops on bus multiplexer driver loading
-
git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdogLinus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog: [WATCHDOG] pcwd.c - fix open_allowed type. [WATCHDOG] fix watchdog/ixp4xx_wdt.c compilation [WATCHDOG] fix watchdog/wdt285.c compilation [WATCHDOG] fix watchdog/at91rm9200_wdt.c compilation [WATCHDOG] fix watchdog/shwdt.c compilation [WATCHDOG] fix watchdog/txx9wdt.c compilation [WATCHDOG] MAINTAINERS: remove ZF MACHZ WATCHDOG entry [WATCHDOG] Fix build with CONFIG_ITCO_VENDOR_SUPPORT=n
-
Rene Herman authored
Cyrill Gorcunov observed: > you turned it into early_param so now it's NULL injecting vulnerabled. > Could you please add checking for NULL str param? fix that. Also, change the name of 'str' into 'arg', to make it more apparent that this is an optional argument that can be NULL, not a string parameter that is empty when unset. Reported-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpcLinus Torvalds authored
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: powerpc: Remove include/linux/harrier_defs.h powerpc: Do not ignore arch/powerpc/include powerpc: Delete completed "ppc removal" task from feature removal file powerpc/mm: Fix attribute confusion with htab_bolt_mapping() powerpc/pci: Don't keep ISA memory hole resources in the tree powerpc: Zero fill the return values of rtas argument buffer powerpc/4xx: Update defconfig files for 2.6.27-rc1 powerpc/44x: Incorrect NOR offset in Warp DTS powerpc/44x: Warp DTS changes for board updates powerpc/4xx: Cleanup Warp for i2c driver changes. powerpc/44x: Adjust warp-nand resource end address
-
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6Linus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: Limit VPD length for Broadcom 5708S PCI PM: Export pci_pme_active to drivers PCI: remove duplicate symbol from pci_ids.h PCI: check the return value of device_create_bin_file() in pci_create_bus() PCI: fully restore MSI state at resume time DMA: make dma-coherent.c documentation kdoc-friendly PCI: make pci_register_driver() a macro PCI: add Broadcom 5708S to VPD length quirk
-
Christian Borntraeger authored
While testing our KVM code for s390 (starting and killall kvm in a loop) I can reproduce the following oops: Unable to handle kernel pointer dereference at virtual kernel address 6b6b6b6b6b6b6000 Oops: 0038 [#1] SMP Modules linked in: dm_multipath sunrpc qeth_l3 qeth_l2 dm_mod qeth ccwgroup CPU: 1 Not tainted 2.6.27-rc1 #54 Process kuli (pid: 4409, task: 00000000b6aa5940, ksp: 00000000b7343e10) Krnl PSW : 0704e00180000000 00000000002e0b8c (disassociate_ctty+0x1c0/0x288) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 EA:3 Krnl GPRS: 0000000000000000 6b6b6b6b6b6b6b6b 0000000000000001 00000000000003a6 00000000002e0a46 00000000004b4160 0000000000000001 00000000bbd79758 00000000b7343e58 00000000b8854148 00000000bd34dea0 00000000b7343c20 0000000000000001 00000000004b6d08 00000000002e0a46 00000000b7343c20 Krnl Code: 00000000002e0b7e: eb9fb0a00004 lmg %r9,%r15,160(%r11) 00000000002e0b84: 07f4 bcr 15,%r4 00000000002e0b86: e31090080004 lg %r1,8(%r9) >00000000002e0b8c: d501109cd000 clc 156(2,%r1),0(%r13) 00000000002e0b92: a784ff5d brc 8,2e0a4c 00000000002e0b96: b9040029 lgr %r2,%r9 00000000002e0b9a: c0e5fffff9c3 brasl %r14,2dff20 00000000002e0ba0: a7f4ff56 brc 15,2e0a4c Call Trace: ([<00000000002e0a46>] disassociate_ctty+0x7a/0x288) [<0000000000141fe6>] do_exit+0x212/0x8d4 [<0000000000142708>] do_group_exit+0x60/0xcc [<0000000000150660>] get_signal_to_deliver+0x270/0x3ac [<000000000010bfd6>] do_signal+0x8e/0x8dc [<0000000000113772>] sysc_sigpending+0xe/0x22 [<000001ff0000b134>] 0x1ff0000b134 INFO: lockdep is turned off. Last Breaking-Event-Address: [<00000000002e0a48>] disassociate_ctty+0x7c/0x288 Kernel panic - not syncing: Fatal exception: panic_on_oops It seems that tty was already free in disassocate_ctty when it tries to dereference tty->driver. After moving the lock_kernel before the mutex_unlock, I can no longer reproduce the problem. [ This is a temporary partial fix for the documented and long standing race in disassociate_tty. This stops most problem cases for now. For the next release the -next tree has an initial implementation of kref counting for tty structures and this quickfix will be dropped. - Alan ] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by; Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Geert Uytterhoeven authored
Wire up for m68k{,nommu} the system calls that were added in the last merge window: - 4006553b ("flag parameters: inotify_init") - ed8cae8b ("flag parameters: pipe") - 336dd1f7 ("flag parameters: dup2") - a0998b50 ("flag parameters: epoll_create") - 9fe5ad9c ("flag parameters add-on: remove epoll_create size param") - b087498e ("flag parameters: eventfd") - 9deb27ba ("flag parameters: signalfd") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
This reverts commit 2d04a4a7, which made it impossible to make the softcursor use the highlight colors. Yes, the fourth bit should be "blinking", but since we cannot reasonably blink in fbcon, highlighting it with a bright background is preferable. Reported-by: Pavel Machek <pavel@suse.cz> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: Antonino A. Daplas <adaplas@pol.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Randy Dunlap authored
Fix function prototype in header file to match source code: linux-next-20080807/arch/x86/kernel/efi_64.c:100:14: error: symbol 'efi_ioremap' redeclared with different type (originally declared at include2/asm/efi.h:89) - different address spaces Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Randy Dunlap authored
Fix function declaration: linux-next-20080807/arch/x86/kernel/pci-calgary_64.c:1353:36: warning: non-ANSI function declaration of function 'get_tce_space_from_tar' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Acked-by: Muli Ben-Yehuda <muli@il.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Jeremy Fitzhardinge authored
Simon Horman reported that gcc-3.4.x crashes when compiling pgd_prepopulate_pmd() when PREALLOCATED_PMDS == 0 and CONFIG_DEBUG_INFO is enabled. Adding an extra check for PREALLOCATED_PMDS == 0 [which is compiled out by gcc] seems to avoid the problem. Reported-by: Simon Horman <horms@verge.net.au> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Rene Herman authored
On 32-bit, "apic" is a __setup() param meaning it is parsed rather late in the game. Make it an early_param() for apic_printk() use by arch/x86/kernel/mpparse.c. On 64-bit, it already is an early_param(). Signed-off-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Rene Herman authored
commit 11a62a05 turns some formerly nopped debugging printks in arch/x86/kernel/mppparse.c into regular ones. The one at the top of smp_scan_config() in particular also prints on !CONFIG_SMP/CONFIG_X86_LOCAL_APIC kernels and UP machines without anything resembling MP tables which makes their lowly UP owners wonder... Turn the former Dprintk()s into apic_printk()s instead meaning that their printing is dependent on passing the apic=verbose (or =debug) command line param. On 32-bit, "apic" is a __setup() param which isn't early enough for this code and therefore needs a followup changing it into an early_param(). On 64-bit, it already is. Signed-off-by: Rene Herman <rene.herman@gmail.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Dmitry Adamushko authored
Mark Langsdorf reported: > One of my co-workers noticed that the powernow-k8 > driver no longer restarts when a CPU core is > hot-disabled and then hot-enabled on AMD quad-core > systems. > > The following comands work fine on 2.6.26 and fail > on 2.6.27-rc1: > > echo 0 > /sys/devices/system/cpu/cpu3/online > echo 1 > /sys/devices/system/cpu/cpu3/online > find /sys -name cpufreq > > For 2.6.26, the find will return a cpufreq > directory for each processor. In 2.6.27-rc1, > the cpu3 directory is missing. > > After digging through the code, the following > logic is failing when the core is hot-enabled > at runtime. The code works during the boot > sequence. > > cpumask_t = current->cpus_allowed; > set_cpus_allowed_ptr(current, &cpumask_of_cpu(cpu)); > if (smp_processor_id() != cpu) > return -ENODEV; So set the CPU active before calling the CPU_ONLINE notifier chain, there are a handful of notifiers that use set_cpus_allowed(). This fix also solves the problem with x86-microcode. I've sent alternative patches for microcode, but as this "rely on set_cpus_allowed_ptr() being workable in cpu-hotplug(CPU_ONLINE, ...)" assumption seems to be more broad than what we thought, perhaps this fix should be applied. With this patch we define that by the moment CPU_ONLINE is being sent, a 'cpu' is online and ready for tasks to be migrated onto it. Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Reported-by: Mark Langsdorf <mark.langsdorf@amd.com> Tested-by: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
certain configs produce: [ 70.076229] BUG: MAX_LOCKDEP_KEYS too low! [ 70.080230] turning off the locking correctness validator. tune them up. Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Nick Piggin authored
* Venki Pallipadi <venkatesh.pallipadi@intel.com> wrote: > Found a OOPS on a big SMP box during an overnight reboot test with > upstream git. > > Suresh and I looked at the oops and looks like the root cause is in > generic_smp_call_function_interrupt() and smp_call_function_mask() with > wait parameter. > > The actual oops looked like > > [ 11.277260] BUG: unable to handle kernel paging request at ffff8802ffffffff > [ 11.277815] IP: [<ffff8802ffffffff>] 0xffff8802ffffffff > [ 11.278155] PGD 202063 PUD 0 > [ 11.278576] Oops: 0010 [1] SMP > [ 11.279006] CPU 5 > [ 11.279336] Modules linked in: > [ 11.279752] Pid: 0, comm: swapper Not tainted 2.6.27-rc2-00020-g685d87f7 #290 > [ 11.280039] RIP: 0010:[<ffff8802ffffffff>] [<ffff8802ffffffff>] 0xffff8802ffffffff > [ 11.280692] RSP: 0018:ffff88027f1f7f70 EFLAGS: 00010086 > [ 11.280976] RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 0000000000000000 > [ 11.281264] RDX: 0000000000004f4e RSI: 0000000000000001 RDI: 0000000000000000 > [ 11.281624] RBP: ffff88027f1f7f98 R08: 0000000000000001 R09: ffffffff802509af > [ 11.281925] R10: ffff8800280c2780 R11: 0000000000000000 R12: ffff88027f097d48 > [ 11.282214] R13: ffff88027f097d70 R14: 0000000000000005 R15: ffff88027e571000 > [ 11.282502] FS: 0000000000000000(0000) GS:ffff88027f1c3340(0000) knlGS:0000000000000000 > [ 11.283096] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b > [ 11.283382] CR2: ffff8802ffffffff CR3: 0000000000201000 CR4: 00000000000006e0 > [ 11.283760] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 11.284048] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > [ 11.284337] Process swapper (pid: 0, threadinfo ffff88027f1f2000, task ffff88027f1f0640) > [ 11.284936] Stack: ffffffff80250963 0000000000000212 0000000000ee8c78 0000000000ee8a66 > [ 11.285802] ffff88027e571550 ffff88027f1f7fa8 ffffffff8021adb5 ffff88027f1f3e40 > [ 11.286599] ffffffff8020bdd6 ffff88027f1f3e40 <EOI> ffff88027f1f3ef8 0000000000000000 > [ 11.287120] Call Trace: > [ 11.287768] <IRQ> [<ffffffff80250963>] ? generic_smp_call_function_interrupt+0x61/0x12c > [ 11.288354] [<ffffffff8021adb5>] smp_call_function_interrupt+0x17/0x27 > [ 11.288744] [<ffffffff8020bdd6>] call_function_interrupt+0x66/0x70 > [ 11.289030] <EOI> [<ffffffff8024ab3b>] ? clockevents_notify+0x19/0x73 > [ 11.289380] [<ffffffff803b9b75>] ? acpi_idle_enter_simple+0x18b/0x1fa > [ 11.289760] [<ffffffff803b9b6b>] ? acpi_idle_enter_simple+0x181/0x1fa > [ 11.290051] [<ffffffff8053aeca>] ? cpuidle_idle_call+0x70/0xa2 > [ 11.290338] [<ffffffff80209f61>] ? cpu_idle+0x5f/0x7d > [ 11.290723] [<ffffffff8060224a>] ? start_secondary+0x14d/0x152 > [ 11.291010] > [ 11.291287] > [ 11.291654] Code: Bad RIP value. > [ 11.292041] RIP [<ffff8802ffffffff>] 0xffff8802ffffffff > [ 11.292380] RSP <ffff88027f1f7f70> > [ 11.292741] CR2: ffff8802ffffffff > [ 11.310951] ---[ end trace 137c54d525305f1c ]--- > > The problem is with the following sequence of events: > > - CPU A calls smp_call_function_mask() for CPU B with wait parameter > - CPU A sets up the call_function_data on the stack and does an rcu add to > call_function_queue > - CPU A waits until the WAIT flag is cleared > - CPU B gets the call function interrupt and starts going through the > call_function_queue > - CPU C also gets some other call function interrupt and starts going through > the call_function_queue > - CPU C, which is also going through the call_function_queue, starts referencing > CPU A's stack, as that element is still in call_function_queue > - CPU B finishes the function call that CPU A set up and as there are no other > references to it, rcu deletes the call_function_data (which was from CPU A > stack) > - CPU B sees the wait flag and just clears the flag (no call_rcu to free) > - CPU A which was waiting on the flag continues executing and the stack > contents change > > - CPU C is still in rcu_read section accessing the CPU A's stack sees > inconsistent call_funation_data and can try to execute > function with some random pointer, causing stack corruption for A > (by clearing the bits in mask field) and oops. Nice debugging work. I'd suggest something like the attached (boot tested) patch as the simple fix for now. I expect the benefits from the less synchronized, multiple-in-flight-data global queue will still outweigh the costs of dynamic allocations. But if worst comes to worst then we just go back to a globally synchronous one-at-a-time implementation, but that would be pretty sad! Signed-off-by: Ingo Molnar <mingo@elte.hu>
-