- 12 Oct, 2009 1 commit
-
-
Darren Hart authored
PI futexes do not use the same plist_node_empty() test for wakeup. It was possible for the waiter (in futex_wait_requeue_pi()) to set TASK_INTERRUPTIBLE after the waker assigned the rtmutex to the waiter. The waiter would then note the plist was not empty and call schedule(). The task would not be found by any subsequent futex wakeups, resulting in a userspace hang. By moving the setting of TASK_INTERRUPTIBLE to before the call to queue_me(), the race with the waker is eliminated. Since we no longer call get_user() from within queue_me(), there is no need to delay the setting of TASK_INTERRUPTIBLE until after the call to queue_me(). The FUTEX_LOCK_PI operation is not affected as futex_lock_pi() relies entirely on the rtmutex code to handle schedule() and wakeup. The requeue PI code is affected because the waiter starts as a non-PI waiter and is woken on a PI futex. Remove the crusty old comment about holding spinlocks() across get_user() as we no longer do that. Correct the locking statement with a description of why the test is performed. Signed-off-by: Darren Hart <dvhltc@us.ibm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Dinakar Guniguntala <dino@in.ibm.com> Cc: John Stultz <johnstul@us.ibm.com> LKML-Reference: <20090922053038.8717.97838.stgit@Aeon> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 09 Oct, 2009 1 commit
-
-
Thomas Gleixner authored
-RT replaces kmap_atomic* functions with macros, but we kept the function prototypes around. Remove them. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 08 Oct, 2009 3 commits
-
-
Thomas Gleixner authored
exit_pi_state() is called from do_exit() but not from do_execve(). Move it to release_mm() so it gets called from do_execve() as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <new-submission> Cc: stable@kernel.org Cc: Anirban Sinha <ani@anirban.org> Cc: Peter Zijlstra <peterz@infradead.org>
-
Peter Zijlstra authored
The robust list pointers of user space held futexes are kept intact over an exec() call. When the exec'ed task exits exit_robust_list() is called with the stale pointer. The risk of corruption is minimal, but still it is incorrect to keep the pointers valid. Actually glibc should uninstall the robust list before calling exec() but we have to deal with it anyway. Nullify the pointers after [compat_]exit_robust_list() has been called. Reported-by: Anirban Sinha <ani@anirban.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <new-submission> Cc: stable@kernel.org
-
Eero Nurkkala authored
Commit f2e21c96 had unfortunate side effects with cpufreq governors on some systems. If the system did not switch into NOHZ mode, ts->inidle is not set when tick_nohz_stop_sched_tick() is called from the idle routine. Therefore all subsequent calls from irq_exit() to tick_nohz_stop_sched_tick() fail to call tick_nohz_start_idle(). This results in bogus idle accounting information which is passed to cpufreq governors. Set the inidle flag regardless of the NOHZ active state to keep the idle time accounting correct in any case. [ tglx: Added comment and tweaked the changelog ] Reported-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Eero Nurkkala <ext-eero.nurkkala@nokia.com> Cc: Rik van Riel <riel@redhat.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Greg KH <greg@kroah.com> Cc: Steven Noonan <steven@uplinklabs.net> Cc: stable@kernel.org LKML-Reference: <1254907901.30157.93.camel@eenurkka-desktop> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 07 Oct, 2009 1 commit
-
-
Darren Hart authored
If futex_wait_requeue_pi() wakes prior to requeue, we drop the reference to the source futex_key twice, once in handle_early_requeue_pi_wakeup() and once on our way out. Remove the drop from the handle_early_requeue_pi_wakeup() and keep the get/drops together in futex_wait_requeue_pi(). Reported-by: Helge Bahmann <hcb@chaoticmind.net> Signed-off-by: Darren Hart <dvhltc@us.ibm.com> Cc: Helge Bahmann <hcb@chaoticmind.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Dinakar Guniguntala <dino@in.ibm.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: stable-2.6.31 <stable@kernel.org> LKML-Reference: <4ACCE21E.5030805@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
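The double drop described above can be illustrated with a minimal userspace sketch. All names here (`key_demo`, `get_key`, `drop_key`, `wait_requeue_demo` and the two early-wakeup helpers) are hypothetical stand-ins, not the kernel's functions; the point is only that the get and the drop live together in the caller, so a helper that also drops unbalances the count.

```c
/* Hypothetical stand-in for a reference-counted futex key. */
struct key_demo {
	int refs;
};

static void get_key(struct key_demo *k)  { k->refs++; }
static void drop_key(struct key_demo *k) { k->refs--; }

/* Buggy pattern: the early-wakeup helper drops the caller's reference. */
static int early_wakeup_buggy(struct key_demo *k)
{
	drop_key(k);
	return 0;
}

/* Fixed pattern: the helper leaves reference handling to the caller. */
static int early_wakeup_fixed(struct key_demo *k)
{
	(void)k;
	return 0;
}

/* The caller takes the reference and drops it on the way out. */
static int wait_requeue_demo(struct key_demo *k,
			     int (*early)(struct key_demo *))
{
	int ret;

	get_key(k);
	ret = early(k);	/* woken before the requeue happened */
	drop_key(k);
	return ret;
}
```

With the buggy helper the count ends one lower than it started; with the fixed helper it is unchanged.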
-
- 06 Oct, 2009 1 commit
-
-
Thomas Gleixner authored
Remy Bohmer pointed out that we create the hrtimer softirq thread even when CONFIG_HIGH_RES_TIMERS is off. That results in a softirq-NULL name for the thread. Skip the thread creation/wakeup/teardown when CONFIG_HIGH_RES_TIMERS=n. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 04 Oct, 2009 1 commit
-
-
Thomas Gleixner authored
commit 21ece08c (net: fix the xtables smp_processor_id assumptions for -rt) fixed only half of the problem. The filter functions might run in thread context and can be preempted and migrated on -RT. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 18 Sep, 2009 1 commit
-
-
Thomas Gleixner authored
Memory allocations in irq/preempt disabled regions are the main cause of grief with these features. Needs some real work to get that solved. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 17 Sep, 2009 1 commit
-
-
Thomas Gleixner authored
latency_lock is taken in the guts of the scheduler code and needs to be a real spinlock on RT. Convert it to atomic_spinlock. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 15 Sep, 2009 7 commits
-
-
Carsten Emde authored
Resuscitated and enhanced the kernel latency histograms provided originally by Yi Yang and adapted and converted by Steven Rostedt. Latency histograms in the current version
  - can be enabled online and independently
  - have virtually no performance penalty when configured but not enabled
  - have very little performance penalty when enabled
  - use already available wakeup and switch tracepoints
  - give corresponding results with the related tracer
  - allow to record wakeup latency histograms of a single process
  - record the process where the highest wakeup latency occurred
  - are documented in Documentation/trace/histograms.txt
Signed-off-by: Carsten Emde <C.Emde@osadl.org> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <4AAEDDD5.4040505@osadl.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Wu Zhangjin authored
Commit f8382688 converted swap to percpu locked. The non-RT version of swap_get_cpu_var() should behave the same as the old implementation, but in reality it does not:

    ...
    +#define swap_get_cpu_var(var, cpu) \
    +	({ \
    +		(void)cpu; \
    +		&get_cpu_var(var); \
    +	})
    ...
     void __lru_cache_add(struct page *page, enum lru_list lru)
     {
    -	struct pagevec *pvec = &get_cpu_var(lru_add_pvecs)[lru];
    +	struct pagevec *pvec;
    +	int cpu;

    +	pvec = swap_get_cpu_var(lru_add_pvecs, cpu)[lru];
     	page_cache_get(page);
     	if (!pagevec_add(pvec, page))
     		____pagevec_lru_add(pvec, lru);
    -	put_cpu_var(lru_add_pvecs);
    +	swap_put_cpu_var(lru_add_pvecs, cpu);
     }

Here is the point. The old version:

    pvec = &get_cpu_var(lru_add_pvecs)[lru];
         = & (get_cpu_var(lru_add_pvecs)[lru]);

The new version from commit f8382688:

    pvec = ({ (void)cpu; &get_cpu_var(lru_add_pvecs); })[lru];
         = (&get_cpu_var(lru_add_pvecs)) [lru];

So these two are really different, and the difference made the non-RT boot fail:

    ...
    ide-gd driver 1.18
    hda: max request size: 512KiB
    hda: 312581808 sectors (160041 MB) w/8192KiB Cache, CHS=19457/255/63
    hda: cache flushes supported
     hda:Unhandled kernel unaligned access[#1]:
    Cpu 0
    $ 0   : 0000000000000000 000000001400c4e1 98000000013699d0 0000000000000000
    $ 4   : 0000000000000000 98000000be04f980 0000000000000010 000000007fd78b57
    $ 8   : 0000000000000001 0000000000200200 0000000000100100 98000000be00f210
    $12   : 000000001400c4e1 000000001000001e ffffffffffffffff 98000000bd0180a8
    $16   : 0000000000000000 98000000be04f998 fffffffb81a72600 ffffffff802d3270
    $20   : 0000000000000000 0000000000000000 ffffffffffffffef ffffffff80667a90
    $24   : 0000000000000228 ffffffff803ded88
    $28   : 98000000be04c000 98000000be04f950 00000000003fffff ffffffff80200404
    Hi    : 0000000000000000
    Lo    : 0000000000000320
    epc   : ffffffff80217194 do_ade+0x298/0x3bc
        Not tainted
    ra    : ffffffff80200404 ret_from_exception+0x0/0x10
    Status: 1400c4e3    KX SX UX KERNEL EXL IE
    Cause : 00000014
    BadVA : fffffffb81a72607
    PrId  : 00006303 (ICT Loongson-2)
    Modules linked in:
    Process swapper (pid: 1, threadinfo=98000000be04c000, task=98000000be04b7d8, tls=0000000000000000)
    Stack : ffffffff80666f60 ffffffff8025165c 98000000013699d0 0000000000000001
            98000000bd00ae30 ffffffff80200404 0000000000000000 000000001400c4e1
            000000007fd78b57 98000000013699d0 ffffffff80638070 0000000000000002
            fffffffb81a72600 000000007fd78b57 0000000000000001 0000000000200200
            0000000000100100 98000000be00f210 98000000be00f220 000000000000001d
            ffffffffffffffff 98000000bd0180a8 98000000013699d0 0000000000000001
            98000000bd00ae30 ffffffff802d3270 0000000000000000 0000000000000000
            ffffffffffffffef ffffffff80667a90 0000000000000228 ffffffff803ded88
            98000000bd00ae30 98000000bd00ae30 98000000be04c000 98000000be04fab0
            00000000003fffff ffffffff80275350 000000001400c4e3 0000000000000000
            ...
    Call Trace:
    [<ffffffff80217194>] do_ade+0x298/0x3bc
    [<ffffffff80200404>] ret_from_exception+0x0/0x10
    [<ffffffff8027fafc>] __lru_cache_add+0x94/0xd8
    [<ffffffff80275350>] add_to_page_cache_lru+0x84/0xa8
    [<ffffffff80275520>] read_cache_page_async+0xa8/0x1dc
    [<ffffffff80275664>] read_cache_page+0x10/0x74
    [<ffffffff802fed34>] read_dev_sector+0x34/0xe0
    [<ffffffff802ff96c>] adfspart_check_ICS+0x44/0x1b0
    [<ffffffff802ff6e4>] rescan_partitions+0x178/0x3a8
    [<ffffffff802d3840>] __blkdev_get+0x238/0x318
    [<ffffffff802feeb0>] register_disk+0xd0/0x15c
    [<ffffffff8037c1e8>] add_disk+0xcc/0x128
    [<ffffffff803fcbc4>] ide_gd_probe+0x170/0x1d0
    [<ffffffff803e6e08>] driver_probe_device+0xbc/0x180
    [<ffffffff803e6f38>] __driver_attach+0x6c/0xa4
    [<ffffffff803e6508>] bus_for_each_dev+0x58/0xa4
    [<ffffffff803e5bbc>] bus_add_driver+0xc8/0x284
    [<ffffffff803e72d8>] driver_register+0xc4/0x17c
    [<ffffffff8020fa5c>] do_one_initcall+0x64/0x18c
    [<ffffffff806701d8>] kernel_init+0xe0/0x14c
    [<ffffffff80212e5c>] kernel_thread_helper+0x10/0x18

    Code: 001188f8 00b1882d de220000 <b2420007> b6420000 24120000 1640000c 00a0202d 8ca20120
    Disabling lock debugging due to kernel taint
    note: swapper[1] exited with preempt_count 1
    Kernel panic - not syncing: Attempted to kill init!

This patch makes the non-RT swap_get_cpu_var() behave as the code did before commit f8382688, and moves "(void)cpu;" into swap_put_cpu_var() to avoid a warning about an unused variable. Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com> LKML-Reference: <1252034522-32653-1-git-send-email-wuzhangjin@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
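The precedence difference can be reproduced outside the kernel. The following is a minimal sketch, assuming a hypothetical `pagevec_demo` type and a macro `demo_get_cpu_var()` that merely yields an array lvalue (it does not model per-CPU storage or preemption). A two-dimensional backing store is used so that both expressions stay inside one object.

```c
#include <stddef.h>

#define NR_LISTS 5

/* Hypothetical stand-in for struct pagevec; only its size matters here. */
struct pagevec_demo {
	unsigned long nr;
	void *pages[14];
};

/* Row 0 plays the role of this CPU's lru_add_pvecs[] array. */
static struct pagevec_demo backing[NR_LISTS][NR_LISTS];

/* Stand-in for get_cpu_var(): expands to a parenthesized array lvalue. */
#define demo_get_cpu_var() (backing[0])

/* Old form: [] binds tighter than unary &, so this is &(array[lru]). */
static struct pagevec_demo *pvec_old(int lru)
{
	return &demo_get_cpu_var()[lru];
}

/*
 * Form produced by the macro: the address of the whole array is taken
 * first, so [lru] steps in units of sizeof(array), not sizeof(element).
 */
static struct pagevec_demo *pvec_new(int lru)
{
	return (&demo_get_cpu_var())[lru];
}
```

For `lru == 0` both forms agree, which is why such a bug can hide; for any other index the second form lands `lru` whole arrays past the start instead of `lru` elements.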
-
Thomas Gleixner authored
We already know which code paths trigger this, so we can safely disable it again and just keep the early return when mask == 0. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Thomas Gleixner authored
kvm->requests_lock is a sleeping lock in RT, but it's locked inside the preempt disabled region of get_cpu(). Move the get_cpu() region inside the spinlocked region to avoid the might sleep warning.

    BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
    in_atomic(): 1, irqs_disabled(): 0, pid: 10670, name: qemu-kvm
    Pid: 10670, comm: qemu-kvm Not tainted 2.6.31-rc9-rt9.1-32bit #47
    Call Trace:
     [<c022a88a>] __might_sleep+0xcb/0xd0
     [<c0498bd9>] rt_spin_lock+0x29/0x5e
     [<f9161b54>] make_all_cpus_request+0x36/0xb2 [kvm]
     [<f9161bf6>] kvm_flush_remote_tlbs+0x12/0x1f [kvm]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Carsten Emde <carsten.emde@osadl.org>
-
Thomas Gleixner authored
rt_read_trylock() and rt_down_read_trylock() take the lock / semaphore unconditionally when it is write locked. Check read_depth if current owns the lock. If it's 0 we know it is write locked and return 0. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
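The check described above can be sketched in plain C. This is a simplified model, not the -rt implementation: `rw_demo`, its fields, and `read_trylock_demo()` are hypothetical names, the lock is modelled as having a single owner, and no atomicity is provided.

```c
#include <stddef.h>

/* Hypothetical model of an owner-tracked read/write lock. */
struct rw_demo {
	const void *owner;	/* task holding the lock, NULL if free */
	int read_depth;		/* > 0: owner holds it for read, 0: for write */
};

/* Returns 1 if the read lock was taken, 0 otherwise. Never blocks. */
static int read_trylock_demo(struct rw_demo *lock, const void *current_task)
{
	if (lock->owner == current_task) {
		/* We own it: read_depth == 0 means we hold it for write. */
		if (lock->read_depth == 0)
			return 0;
		lock->read_depth++;	/* recursive read by the owner */
		return 1;
	}
	if (lock->owner == NULL) {
		lock->owner = current_task;
		lock->read_depth = 1;
		return 1;
	}
	return 0;			/* held by someone else */
}
```

The important branch is the first one: without the `read_depth` test, an owner that holds the lock for write would be treated as a recursive reader.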
-
Xiao Guangrong authored
If a large amount of data is passed through the perf_counter_open() syscall, the kernel copies it into a smaller buffer, which crashes the kernel. This makes the kernel unsafe, and a non-root local user can trigger the bug. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Paul Mackerras <paulus@samba.org> Cc: <stable@kernel.org> LKML-Reference: <4AAF37D4.5010706@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
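The general defence against this class of bug is to bound a caller-supplied size by the destination before copying. The sketch below is a userspace illustration only: `attr_demo` and `copy_attr_demo()` are hypothetical, `memcpy()` stands in for a user-to-kernel copy, and the policy for oversized input (accept it only if the extra bytes are zero) is an assumption made for the example, not a description of the actual patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size destination structure. */
struct attr_demo {
	unsigned int type;
	unsigned int size;
	unsigned long long config;
};

/*
 * Copy at most sizeof(*dst) bytes from src. If the caller supplied more,
 * accept it only when every extra byte is zero. Returns 0 on success and
 * -1 if the tail carries data that would not fit.
 */
static int copy_attr_demo(struct attr_demo *dst, const unsigned char *src,
			  size_t size)
{
	size_t i;

	memset(dst, 0, sizeof(*dst));

	if (size > sizeof(*dst)) {
		for (i = sizeof(*dst); i < size; i++) {
			if (src[i] != 0)
				return -1;
		}
		size = sizeof(*dst);	/* clamp before copying */
	}

	memcpy(dst, src, size);
	return 0;
}
```

The essential line is the clamp: the copy length is derived from the destination, never from the caller's value alone.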
-
Thomas Gleixner authored
The nsec argument of set_normalized_timespec() is of type long. The recent timekeeping changes of ktime_get_ts() feed ts->tv_nsec + tomono.tv_nsec + nsecs to set_normalized_timespec(). On 32 bit machines that sum can be larger than (1 << 31) and therefore result in a negative value which screws up the result completely. Make the nsec argument of set_normalized_timespec() s64 to fix the problem at hand. This also prevents similar problems for future users of set_normalized_timespec(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Carsten Emde <carsten.emde@osadl.org> LKML-Reference: <new-submission> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: John Stultz <johnstul@us.ibm.com>
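A worked example makes the overflow concrete. The sketch below is not the kernel function: `ts_demo`, `set_normalized_ts_demo()` and `wrap_to_32bit_long()` are hypothetical names, and the wrap helper only simulates what happens when the sum is squeezed through a 32-bit signed parameter on a two's complement machine.

```c
#include <stdint.h>

#define NSEC_PER_SEC_DEMO INT64_C(1000000000)

/* Hypothetical stand-in for struct timespec. */
struct ts_demo {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/* Normalize with a 64-bit nsec argument, as the fix proposes. */
static void set_normalized_ts_demo(struct ts_demo *ts, int64_t sec,
				   int64_t nsec)
{
	while (nsec >= NSEC_PER_SEC_DEMO) {
		nsec -= NSEC_PER_SEC_DEMO;
		++sec;
	}
	while (nsec < 0) {
		nsec += NSEC_PER_SEC_DEMO;
		--sec;
	}
	ts->tv_sec = sec;
	ts->tv_nsec = nsec;
}

/* Simulate passing a value through a 32-bit signed long parameter. */
static int64_t wrap_to_32bit_long(int64_t v)
{
	uint32_t u = (uint32_t)v;	/* well-defined: reduce modulo 2^32 */

	return (u & UINT32_C(0x80000000)) ? (int64_t)u - INT64_C(4294967296)
					  : (int64_t)u;
}
```

Take 999999999 + 999999999 + 500000000 = 2499999998 ns, which exceeds 1 << 31. With the 64-bit argument, 10 s plus that sum normalizes to 12 s and 499999998 ns. Wrapped to 32 bits the sum becomes -1794967298, and the same normalization yields 8 s and 205032702 ns, four seconds off.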
-
- 14 Sep, 2009 13 commits
-
-
Thomas Gleixner authored
-
Thomas Gleixner authored
Merge branch 'tip/tracing/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into rt/trace
-
Steven Rostedt authored
Parag noticed that the number of event tests has increased tremendously:

    grep "Testing event" dmesg.31rc9 |wc -l
    100

    grep "Testing event" dmesg.31git |wc -l
    1172

This is due to the testing of every syscall event when ftrace self test is enabled. This adds a bit more time to kernel boot up and can affect development by slowing down the time it takes between reboots. This option makes the testing of the syscall events into a separate config, to still be able to test most of ftrace internals at boot up but not have to wait for all the syscall events to be tested. The syscall event testing only tests the enabling and disabling of the trace point, since the syscalls are not executed. What really needs to be done is to somehow have a userspace tool test the syscall tracepoints as well. Reported-by: Parag Warudkar <parag.lkml@gmail.com> LKML-Reference: <f7848160909130815l3e768a30n3b28808bbe5c254b@mail.gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Thomas Gleixner authored
Conflicts: kernel/trace/ring_buffer.c kernel/trace/trace.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Thomas Gleixner authored
-
Li Zefan authored
  - remove FTRACE_ENTRY_STRUCT_ONLY()
  - remove TRACE_XXX() macros
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4AADF6E6.3080606@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Li Zefan authored
Make sure F_printk() has the correct format and args, and make sure changes in F_STRUCT() won't break F_printk(). Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4AADF6CC.1060809@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Li Zefan authored
I found some typos in F_printk(), so I wrote a compile-time check for it, which triggered some compile errors and warnings. I've fixed them on x86_32, but I have no x86_64 machine at hand, so there may still be some compile warnings on 64-bit. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4AADF60B.5070407@cn.fujitsu.com> [ tested on x86_64, and works fine ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Steven Rostedt authored
The generated functions of TRACE_EVENT use "flags" in one of the sub macros, which shadows a parameter in the outside macro. The simple fix is to make the submacro use __flags instead. Discovered by sparse. Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
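Why such shadowing is worth fixing can be shown with a small sketch. The macros and functions below are hypothetical and unrelated to the real TRACE_EVENT machinery; the commit does not say whether the original shadowing caused misbehaviour, so this only illustrates the general hazard. The `__flags` name mirrors the commit; note that double-underscore identifiers are reserved in ordinary userspace C and are used here purely for the parallel.

```c
/*
 * Inner helper declares a local called 'flags'. If the caller passes an
 * expression that mentions its own 'flags', the inner one wins.
 */
#define STORE_MARKED_BAD(dst, val)			\
	do {						\
		unsigned long flags = 1UL;		\
		(dst) = (val) | flags;			\
	} while (0)

/* Same helper with a name that cannot collide with the caller's. */
#define STORE_MARKED_GOOD(dst, val)			\
	do {						\
		unsigned long __flags = 1UL;		\
		(dst) = (val) | __flags;		\
	} while (0)

static unsigned long record_bad(unsigned long flags)
{
	unsigned long out;

	STORE_MARKED_BAD(out, flags);	/* 'flags' resolves to the inner 1UL */
	return out;
}

static unsigned long record_good(unsigned long flags)
{
	unsigned long out;

	STORE_MARKED_GOOD(out, flags);	/* caller's 'flags' is used */
	return out;
}
```

With an argument of 0x10, the shadowed variant loses the caller's value and returns 1, while the renamed variant returns 0x11.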
-
Steven Rostedt authored
Some of the generated functions used in the TRACE_EVENT macros are not declared static, but they are not global. Discovered by sparse. Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Thomas Gleixner authored
-
Thomas Gleixner authored
-
Steven Rostedt authored
The cmpxchg used by PowerPC does the following:

    ({ \
        __typeof__(*(ptr)) _o_ = (o); \
        __typeof__(*(ptr)) _n_ = (n); \
        (__typeof__(*(ptr))) __cmpxchg((ptr), (unsigned long)_o_, \
                                       (unsigned long)_n_, sizeof(*(ptr))); \
    })

This does a type check of *ptr to both o and n. Unfortunately, the code in ring-buffer.c assigns longs to pointers and pointers to longs and causes a warning on PowerPC:

    ring_buffer.c: In function 'rb_head_page_set':
    ring_buffer.c:704: warning: initialization makes pointer from integer without a cast
    ring_buffer.c:704: warning: initialization makes pointer from integer without a cast
    ring_buffer.c: In function 'rb_head_page_replace':
    ring_buffer.c:797: warning: initialization makes integer from pointer without a cast

This patch adds the typecasts inside cmpxchg to annotate that a long is being cast to a pointer and a pointer is being cast to a long, and this removes the PowerPC warnings. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
- 13 Sep, 2009 10 commits
-
-
Steven Rostedt authored
Now that the plugin tracers use macros to create the structures and automate the exporting of their formats to the format files, they also automatically get a filter file. This patch adds the code to implement the filter logic in the trace recordings. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Steven Rostedt authored
The macros in trace_entries.h have made the code in trace_event_types.h obsolete. The file is no longer used, so this patch removes it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Steven Rostedt authored
This patch changes the way the format files in debugfs/tracing/events/ftrace/*/format are created. It uses the new trace_entries.h file to automate the creation of the format files to ensure that they are always in sync with the actual structures. This is the same methodology used to create the format files for the TRACE_EVENT macro. This also updates the filter creation that was built on the creation of the format files. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Steven Rostedt authored
Some of the internal ftrace structures use structures within. The output of a field saying it is just a structure is useless for a format file. A binary reader of the ring buffer needs to know more about how the fields are broken up. This patch adds to the ftrace structure macros new fields to describe the structures inside a structure. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Steven Rostedt authored
The entries used by ftrace internal code (plugins) currently have their formats manually exported to userspace. That is, the format files in debugfs/tracing/events/ftrace/*/format are currently created by hand. This is a maintenance nightmare, and can easily become out of sync with what is actually shown. This patch uses the methodology of the TRACE_EVENT macros to build the structures so that their formats can be automated and this will keep the structures in sync with what users can see. This patch only changes the way the structures are created. Further patches will build off of this to automate the format files. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Steven Rostedt authored
This patch increases the max string used by predicates to handle KSYM_SYMBOL_LEN. Also moves an include to look nicer. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Tom Zanussi authored
Documentation for event filters and formats. v2 changes: fix a few problems noticed by Randy Dunlap. Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <1252642431.8016.9.camel@tropicana> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Li Zefan authored
If the corresponding module is unloaded before ftrace_profile_disable() is called, event->profile_disable() won't be called, which can cause an oops:

    # insmod trace-events-sample.ko
    # perf record -f -a -e sample:foo_bar sleep 3 &
    # sleep 1
    # rmmod trace_events_sample
    # insmod trace-events-sample.ko
    OOPS!

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4A9214E3.2070807@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Jiri Olsa authored
Only 24 bytes need to be reserved on the stack for the function graph tracer on x86_64. Signed-off-by: Jiri Olsa <jolsa@redhat.com> LKML-Reference: <20090729085837.GB4998@jolsa.lab.eng.brq.redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
John Reiser authored
__start_mcount_loc[] is unused after init, yet occupies RAM forever as part of .rodata. 152kiB is typical on a 64-bit architecture. Instead, __start_mcount_loc should be in the interval [__init_begin, __init_end) so that the space is reclaimed after init. __start_mcount_loc[] is generated during the load portion of kernel build, and is used only by ftrace_init(). ftrace_init is declared '__init' and is in .init.text, which is freed after init. __start_mcount_loc is placed into .rodata by a call to MCOUNT_REC inside the RO_DATA macro of include/asm-generic/vmlinux.lds.h. The array *is* read-only, but more importantly it is not used after init. So the call to MCOUNT_REC should be moved from RO_DATA to INIT_DATA. This patch has been tested on x86_64 with CONFIG_DEBUG_PAGEALLOC=y which verifies that the address range never is accessed after init. Signed-off-by: John Reiser <jreiser@BitWagon.com> LKML-Reference: <4A6DF0B6.7080402@bitwagon.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-