1. 23 Mar, 2009 3 commits
  2. 22 Mar, 2009 7 commits
    • tracing: add run-time field descriptions for event filtering, kfree fix · fe9f57f2
      Ingo Molnar authored
      Impact: fix potential kfree of random data in (rare) failure path
      
      Zero-initialize the field structure.
      Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <1237710639.7703.46.camel@charm-linux>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
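The failure mode addressed by this fix can be illustrated in user space. The sketch below is a hypothetical analogue, not the kernel code: `demo_field`, `demo_field_new()` and `demo_field_free()` are invented names, and `calloc()` stands in for a zeroing kernel allocation. It only shows why a zero-initialized structure makes the error path safe: members that were never assigned are NULL, and freeing NULL is a no-op.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a run-time field description. */
struct demo_field {
    char *name;
    char *type;
};

/* Portable replacement for strdup(); returns NULL on allocation failure. */
static char *demo_strdup(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Cleanup path: safe only because unassigned members are NULL. */
static void demo_field_free(struct demo_field *f)
{
    if (!f)
        return;
    free(f->type); /* free(NULL) does nothing */
    free(f->name);
    free(f);
}

/* fail_after_name simulates the rare failure between the two allocations. */
static struct demo_field *demo_field_new(const char *name, const char *type,
                                         int fail_after_name)
{
    struct demo_field *f = calloc(1, sizeof(*f)); /* zero-initialized */

    if (!f)
        return NULL;
    f->name = demo_strdup(name);
    if (!f->name || fail_after_name)
        goto err; /* f->type is still NULL here, not random data */
    f->type = demo_strdup(type);
    if (!f->type)
        goto err;
    return f;
err:
    demo_field_free(f);
    return NULL;
}
```

Had the structure come from a non-zeroing allocator, `f->type` would hold whatever was in memory when the error path ran, and the cleanup would free a random pointer.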
    • tracing: add per-subsystem filtering · cfb180f3
      Tom Zanussi authored
      This patch adds per-subsystem filtering to the event tracing subsystem.
      
      It adds a 'filter' debugfs file to each subsystem directory.  This file
      can be written to to set filters; reading from it will display the
      current set of filters set for that subsystem.
      
      Basically what it does is propagate the filter down to each event
      contained in the subsystem.  If a particular event doesn't have a field
      with the name specified in the filter, it simply doesn't get set for
      that event.  You can verify whether or not the filter was set for a
      particular event by looking at the filter file for that event.
      
      As with per-event filters, compound expressions are supported, echoing
      '0' to the subsystem's filter file clears all filters in the subsystem,
      etc.
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1237710677.7703.49.camel@charm-linux>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
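The propagation rule described in this commit message (set the filter on every event in the subsystem that has the named field, skip the rest) can be sketched as follows. This is a simplified user-space model under stated assumptions: the `demo_event` structure, its fixed-size field list and the function names are all hypothetical and do not reflect the kernel's data structures.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_FIELDS 4

/* Hypothetical model of one event in a subsystem. */
struct demo_event {
    const char *name;
    const char *fields[DEMO_MAX_FIELDS]; /* NULL-terminated unless full */
    const char *filter;                  /* NULL means "none" */
};

static int demo_event_has_field(const struct demo_event *ev,
                                const char *field)
{
    for (size_t i = 0; i < DEMO_MAX_FIELDS && ev->fields[i]; i++)
        if (strcmp(ev->fields[i], field) == 0)
            return 1;
    return 0;
}

/* Propagate a filter to each event; returns how many events accepted it. */
static size_t demo_subsystem_set_filter(struct demo_event *events, size_t n,
                                        const char *field,
                                        const char *filter)
{
    size_t applied = 0;

    for (size_t i = 0; i < n; i++) {
        if (!demo_event_has_field(&events[i], field))
            continue; /* no such field: the filter is simply not set */
        events[i].filter = filter;
        applied++;
    }
    return applied;
}
```

In this model, checking an individual event's `filter` member afterwards corresponds to reading that event's own filter file to verify whether the subsystem filter was applied.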
    • tracing: add per-event filtering · 7ce7e424
      Tom Zanussi authored
      This patch adds per-event filtering to the event tracing subsystem.
      
      It adds a 'filter' debugfs file to each event directory.  This file can
      be written to to set filters; reading from it will display the current
      set of filters set for that event.
      
      Basically, any field listed in the 'format' file for an event can be
      filtered on (including strings, but not yet other array types) using
      either matching ('==') or non-matching ('!=') 'predicates'.  A
      'predicate' can be either a single expression:
      
       # echo pid != 0 > filter
      
       # cat filter
       pid != 0
      
      or a compound expression of up to 8 sub-expressions combined using '&&'
      or '||':
      
       # echo comm == Xorg > filter
       # echo "&& sig != 29" > filter
      
       # cat filter
       comm == Xorg
       && sig != 29
      
      Only events having field values matching an expression will be available
      in the trace output; non-matching events are discarded.
      
      Note that a compound expression is built up by echoing each
      sub-expression separately - it's not the most efficient way to do
      things, but it keeps the parser simple and assumes that compound
      expressions will be relatively uncommon.  In any case, a subsequent
      patch introducing a way to set filters for entire subsystems should
      mitigate any need to do this for lots of events.
      
      Setting a filter without an '&&' or '||' clears the previous filter
      completely and sets the filter to the new expression:
      
       # cat filter
       comm == Xorg
       && sig != 29
      
       # echo comm != Xorg > filter
      
       # cat filter
       comm != Xorg
      
      To clear a filter, echo 0 to the filter file:
      
       # echo 0 > filter
       # cat filter
       none
      
      The limit of 8 predicates for a compound expression is arbitrary - for
      efficiency, it's implemented as an array of pointers to predicates, and
      8 seemed more than enough for any filter...
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1237710665.7703.48.camel@charm-linux>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
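The filter semantics described in this commit message can be modelled with a small fixed-size predicate array. The sketch below is a user-space illustration only: every `demo_` name is hypothetical, records are modelled as string key/value pairs, the array holds predicates by value rather than by pointer, and left-to-right evaluation of '&&' and '||' is an assumption, since the message does not specify precedence.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_PREDS 8 /* the limit quoted in the commit message */

enum demo_op   { DEMO_EQ, DEMO_NE };
enum demo_join { DEMO_FIRST, DEMO_AND, DEMO_OR };

/* One sub-expression, e.g. "comm == Xorg" or "&& sig != 29". */
struct demo_pred {
    const char    *field;
    const char    *value;
    enum demo_op   op;
    enum demo_join join; /* how it combines with the result so far */
};

struct demo_filter {
    struct demo_pred preds[DEMO_MAX_PREDS];
    size_t           n_preds;
};

/* A record is modelled as field/value string pairs (hypothetical). */
struct demo_kv {
    const char *field;
    const char *value;
};

static const char *demo_lookup_field(const struct demo_kv *rec, size_t n,
                                     const char *field)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(rec[i].field, field) == 0)
            return rec[i].value;
    return NULL;
}

/* Without a join the filter is replaced; with one it is extended. */
static int demo_filter_add(struct demo_filter *f, struct demo_pred p)
{
    if (p.join == DEMO_FIRST)
        f->n_preds = 0;
    else if (f->n_preds == 0)
        return -1; /* assumption: nothing to combine with */
    if (f->n_preds >= DEMO_MAX_PREDS)
        return -1; /* compound expression is full */
    f->preds[f->n_preds++] = p;
    return 0;
}

/* Returns 1 if the record passes the filter, 0 if it is discarded. */
static int demo_filter_match(const struct demo_filter *f,
                             const struct demo_kv *rec, size_t n)
{
    int result = 1; /* an empty filter lets everything through */

    for (size_t i = 0; i < f->n_preds; i++) {
        const struct demo_pred *p = &f->preds[i];
        const char *v = demo_lookup_field(rec, n, p->field);
        int eq = v != NULL && strcmp(v, p->value) == 0;
        int hit = (p->op == DEMO_EQ) ? eq : !eq;

        if (p->join == DEMO_OR)
            result = result || hit;
        else if (p->join == DEMO_AND)
            result = result && hit;
        else
            result = hit;
    }
    return result;
}
```

Adding a predicate with `DEMO_FIRST` mirrors echoing an expression without '&&' or '||': the previous filter is dropped and replaced by the new expression.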
    • tracing: add ring_buffer_event_discard() to ring buffer · 2d622719
      Tom Zanussi authored
      This patch overloads RINGBUF_TYPE_PADDING to provide a way to discard
      events from the ring buffer, for the event-filtering mechanism
      introduced in a subsequent patch.
      
      I did the initial version but thanks to Steven Rostedt for adding
      the parts that actually made it work. ;-)
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
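The idea of discarding an event by retyping it as padding can be sketched in a few lines. This is a conceptual model only: the `demo_` types and functions are hypothetical, a flat array stands in for the ring buffer, and none of the real ring buffer's locking or layout is represented.

```c
#include <stddef.h>

enum demo_event_type { DEMO_TYPE_DATA, DEMO_TYPE_PADDING };

struct demo_rb_event {
    enum demo_event_type type;
    int                  payload;
};

/* Discard in place: the slot stays in the buffer but is retyped. */
static void demo_event_discard(struct demo_rb_event *ev)
{
    ev->type = DEMO_TYPE_PADDING;
}

/* Copy payloads of live events to out; returns how many were copied. */
static size_t demo_read_events(const struct demo_rb_event *evs, size_t n,
                               int *out, size_t out_cap)
{
    size_t copied = 0;

    for (size_t i = 0; i < n && copied < out_cap; i++) {
        if (evs[i].type == DEMO_TYPE_PADDING)
            continue; /* skipped exactly like ordinary padding */
        out[copied++] = evs[i].payload;
    }
    return copied;
}
```

Because the reader already has to skip padding, reusing that type means a discarded event needs no extra handling on the read side in this model.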
    • tracing: add run-time field descriptions for event filtering · cf027f64
      Tom Zanussi authored
      This patch makes the field descriptions defined for event tracing
      available at run-time, for the event-filtering mechanism introduced
      in a subsequent patch.
      
      The common event fields are prefixed with 'common_' in the format
      display, which distinguishes them from other fields that might
      internally have the same name, so they can be used unambiguously
      in filters.
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1237710639.7703.46.camel@charm-linux>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: keep the tracing buffer after self-test failure · 0cf53ff6
      Frederic Weisbecker authored
      Instead of using ftrace_dump_on_oops, it's far more convenient
      to have the trace leading up to a self-test failure available
      in /debug/tracing/trace.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1237694675-23509-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/function-graph-tracer: prevent hangs during self-tests · cf586b61
      Frederic Weisbecker authored
      Impact: detect tracing related hangs
      
      Sometimes, with some configs, the function graph tracer can make
      the timer interrupt too slow, hanging the kernel in an endless
      loop of timer interrupt servicing.
      
      As suggested by Ingo, this patch adds a watchdog which stops the
      selftest after a defined number of traced functions, permanently
      disabling this tracer.
      
      For those who want to debug the cause of the function graph trace
      hang, you can pass the ftrace_dump_on_oops kernel parameter to dump
      the traces after this hang detection.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1237694675-23509-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
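A counting watchdog of the kind described in this commit message can be sketched as below. The names and the limit of 1000 are assumptions chosen for illustration, not values from the patch; the sketch only shows the shape of the check: count each traced function and latch a stop flag once the limit is exceeded.

```c
/* Hypothetical limit; the real value is defined by the patch itself. */
#define DEMO_MAX_TRACED_FUNCS 1000UL

struct demo_selftest {
    unsigned long traced;  /* functions traced so far */
    int           stopped; /* latched once the limit is exceeded */
};

/*
 * Called once per traced function entry. Returns 1 while tracing may
 * continue and 0 once the watchdog has tripped. The stop is permanent:
 * later calls keep returning 0.
 */
static int demo_watchdog_entry(struct demo_selftest *st)
{
    if (st->stopped)
        return 0;
    if (++st->traced > DEMO_MAX_TRACED_FUNCS) {
        st->stopped = 1;
        return 0;
    }
    return 1;
}
```

Bounding the test by a function count rather than by elapsed time fits the failure described above, where time itself stops being reliable because the timer interrupt never finishes being serviced.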
  3. 20 Mar, 2009 5 commits
  4. 19 Mar, 2009 15 commits
    • aio: lookup_ioctx can return the wrong value when looking up a bogus context · 65c24491
      Jeff Moyer authored
      The libaio test harness turned up a problem whereby lookup_ioctx on a
      bogus io context was returning the 1 valid io context from the list
      (harness/cases/3.p).
      
      Because of that, an extra put_iocontext was done, and when the process
      exited, it hit a BUG_ON in the put_iocontext macro called from exit_aio
      (since we expect a users count of 1 and instead get 0).
      
      The problem was introduced by "aio: make the lookup_ioctx() lockless"
      (commit abf137dd).
      
      Thanks to Zach for pointing out that hlist_for_each_entry_rcu will not
      return with a NULL tpos at the end of the loop, even if the entry was
      not found.
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Acked-by: Zach Brown <zach.brown@oracle.com>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
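The class of bug described in this commit message, where the loop cursor is still non-NULL after an unsuccessful search, can be reproduced in user space. The sketch below is an analogue and not the kernel change: the `demo_` list, the `demo_entry()` helper and both lookup functions are hypothetical. It uses an embedded list node so that, as with hlist iteration, the entry pointer is derived from the node pointer on each step and is not reset when the list ends.

```c
#include <stddef.h>

struct demo_node {
    struct demo_node *next;
};

/* Hypothetical context with an embedded list node. */
struct demo_ctx {
    unsigned long    id;
    struct demo_node link;
};

/* Recover the containing context from its embedded node. */
#define demo_entry(ptr) \
    ((struct demo_ctx *)((char *)(ptr) - offsetof(struct demo_ctx, link)))

/* Buggy pattern: returns the cursor, which still points at the last
 * entry visited when nothing matched. */
static struct demo_ctx *demo_lookup_buggy(struct demo_node *head,
                                          unsigned long id)
{
    struct demo_ctx *ctx = NULL;
    struct demo_node *pos;

    for (pos = head; pos; pos = pos->next) {
        ctx = demo_entry(pos);
        if (ctx->id == id)
            break;
    }
    return ctx;
}

/* Fixed pattern: a separate result is set only on a real match. */
static struct demo_ctx *demo_lookup_fixed(struct demo_node *head,
                                          unsigned long id)
{
    struct demo_ctx *ctx, *ret = NULL;
    struct demo_node *pos;

    for (pos = head; pos; pos = pos->next) {
        ctx = demo_entry(pos);
        if (ctx->id == id) {
            ret = ctx;
            break;
        }
    }
    return ret;
}
```

With a single context on the list, looking up a bogus id through the buggy variant hands back that one valid context, which matches the symptom reported by the test harness.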
    • eventfd: remove fput() call from possible IRQ context · 87c3a86e
      Davide Libenzi authored
      Remove a source of fput() calls from inside IRQ context.  Like Eric, I
      wasn't able to reproduce an fput() call from IRQ context, but Jeff said he was
      able to, with the attached test program.  Independently of this, the bug is
      conceptually there, so we might be better off fixing it.  This patch adds an
      optimization similar to the one we already do on ->ki_filp, on ->ki_eventfd.
      Playing with ->f_count directly is not pretty in general, but the alternative
      here would be to add a brand new delayed fput() infrastructure, that I'm not
      sure is worth it.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: Zach Brown <zach.brown@oracle.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Move cc-option to below arch-specific setup · d0115552
      Linus Torvalds authored
      Sam Ravnborg says:
       "We have several architectures that plays strange games with $(CC) and
        $(CROSS_COMPILE).
      
        So we need to postpone any use of $(call cc-option..) until we have
        included the arch specific Makefile so we try with the correct $(CC)
        version."
      Requested-by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6 · caa81d67
      Linus Torvalds authored
      * 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
        [S390] make page table upgrade work again
        [S390] make page table walking more robust
        [S390] Dont check for pfn_valid() in uaccess_pt.c
        [S390] ftrace/mcount: fix kernel stack backchain
        [S390] topology: define SD_MC_INIT to fix performance regression
        [S390] __div64_31 broken for CONFIG_MARCH_G5
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 2d8620cb
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        HID: fix waitqueue usage in hiddev
        HID: fix incorrect free in hiddev
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable · fe2fd6cc
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
        Btrfs: Clear space_info full when adding new devices
        Btrfs: Fix locking around adding new space_info
    • function-graph: show binary events as comments · 5087f8d2
      Steven Rostedt authored
      With the added TRACE_EVENT macro, the events no longer appear in
      the function graph tracer. This was because the function graph
      did not know how to display the entries. The graph tracer was
      only aware of its own entries and the printk entries.
      
      By using the event call back feature, the graph tracer can now display
      the events.
      
       # echo irq > /debug/tracing/set_event
      
      Which can show:
      
       0)               |          handle_IRQ_event() {
       0)               |            /* irq_handler_entry: irq=48 handler=eth0 */
       0)               |            e1000_intr() {
       0)   0.926 us    |              __napi_schedule();
       0)   3.888 us    |            }
       0)               |            /* irq_handler_exit: irq=48 return=handled */
       0)   0.655 us    |            runqueue_is_locked();
       0)               |            __wake_up() {
       0)   0.831 us    |              _spin_lock_irqsave();
      
      The irq entry and exit events show up as comments.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: remove recording function depth from trace_printk · 40ce74f1
      Steven Rostedt authored
      The function depth in trace_printk was to facilitate the function
      graph output. Now that the function graph calculates the depth within
      the trace output, we no longer need to record the depth when the
      trace_printk is called.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • function-graph: calculate function depth within function graph tracer · 2fbcdb35
      Steven Rostedt authored
      Currently, the function graph tracer depends on the trace_printk
      to record the depth. All the information is already there in the trace
      to calculate function depth, with the exception of having the printk
      be the first item. But as soon as an entry or exit is reached,
      we know the depth.
      
      This patch changes the iter->private data from recording a per cpu
      last_pid, to a structure that holds both the last_pid and the current
      depth. This data is used to determine the function depth for the
      printks.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
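The depth tracking described in this commit message can be modelled as a small state machine over the trace records. This is a sketch under assumptions: the record layout, the `demo_` names and the exact update rule (an entry at depth d puts following printks at d + 1, an exit at depth d puts them back at d) are illustrative guesses, not the tracer's code. A printk seen before any entry or exit is shown at depth 0 here, which corresponds to the one case the message says cannot be resolved.

```c
enum demo_rec_type { DEMO_ENTRY, DEMO_EXIT, DEMO_PRINT };

struct demo_rec {
    enum demo_rec_type type;
    int                depth; /* meaningful for ENTRY and EXIT only */
};

/* Per-cpu iterator state: last_pid as before, plus the current depth. */
struct demo_cpu_state {
    int last_pid;
    int depth;
};

/*
 * Returns the depth at which the record should be displayed and
 * updates the per-cpu state from entry/exit records.
 */
static int demo_display_depth(struct demo_cpu_state *st,
                              const struct demo_rec *rec)
{
    switch (rec->type) {
    case DEMO_ENTRY:
        st->depth = rec->depth + 1; /* what follows is inside the call */
        return rec->depth;
    case DEMO_EXIT:
        st->depth = rec->depth;     /* back in the caller's body */
        return rec->depth;
    case DEMO_PRINT:
    default:
        return st->depth;           /* printks carry no depth of their own */
    }
}
```

Since the depth is derived while reading the trace, the printk record itself no longer needs to store it.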
    • tracing: make print_(b)printk_msg_only global · 5ef841f6
      Steven Rostedt authored
      This patch makes print_printk_msg_only and print_bprintk_msg_only
      global for other functions to use. It also renames them by adding
      a "trace_" to the beginning to avoid namespace collisions.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • Fix race in create_empty_buffers() vs __set_page_dirty_buffers() · a8e7d49a
      Linus Torvalds authored
      Nick Piggin noticed this (very unlikely) race between setting a page
      dirty and creating the buffers for it - we need to hold the mapping
      private_lock until we've set the page dirty bit in order to make sure
      that create_empty_buffers() cannot build up a set of buffers without
      the dirty bits set when the page is dirty.
      
      I doubt anybody has ever hit this race (and it didn't solve the issue
      Nick was looking at), but as Nick says: "Still, it does appear to solve
      a real race, which we should close."
      Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Add '-fwrapv' to gcc CFLAGS · 68df3755
      Linus Torvalds authored
      This makes sure that gcc doesn't try to optimize away wrapping
      arithmetic, which the kernel occasionally uses for overflow testing, i.e.
      things like
      
      	if (ptr + offset < ptr)
      
      which technically is undefined for non-unsigned types. See
      
      	http://bugzilla.kernel.org/show_bug.cgi?id=12597
      
      for details.
      
      Not all versions of gcc support it, so we need to make it conditional
      (it looks like it was introduced in gcc-3.4).
      Reminded-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
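For contrast with the wrap-and-compare test quoted in this commit message, the sketch below shows overflow checks that do not depend on how the compiler treats signed overflow. It is an illustration, not part of the patch, and the function names are hypothetical. Unsigned arithmetic is defined to wrap, so the `a + b < a` form is valid there; for signed integers the check is done before the addition so that no overflow ever occurs.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Unsigned arithmetic wraps modulo 2^N by definition, so comparing the
 * sum against an operand is a well-defined overflow test.
 */
static int demo_uadd_overflows(size_t a, size_t b)
{
    return a + b < a;
}

/*
 * Signed overflow is undefined without -fwrapv, so test the operands
 * against the limits instead of inspecting a wrapped result.
 */
static int demo_sadd_overflows(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    if (b < 0)
        return a < INT_MIN - b;
    return 0;
}
```

With '-fwrapv', GCC treats signed overflow as wrapping, so existing code that relies on the wrap-and-compare idiom keeps behaving as written without being rewritten in this style.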
    • tracing/ring-buffer: fix non cpu hotplug case · 3bf832ce
      Frederic Weisbecker authored
      Impact: fix warning with irqsoff tracer
      
      The ring buffer allocates its buffers at pre-SMP time (early_initcall).
      It means that, at first, only the boot cpu buffer is allocated and
      the ring-buffer cpumask only has the boot cpu set (cpu_online_mask).
      
      Later, the secondary cpu will show up and the ring-buffer will be notified
      about this event: the appropriate buffer will be allocated and the cpumask
      will be updated.
      
      Unfortunately, if !CONFIG_CPU_HOTPLUG, the ring-buffer will not be
      notified about the secondary cpus, meaning that the cpumask will have
      only the boot cpu set, and only one cpu buffer will be allocated.
      
      We fix that by using cpu_possible_mask if !CONFIG_CPU_HOTPLUG.
      
      This patch fixes the following warning with irqsoff tracer running:
      
      [  169.317794] WARNING: at kernel/trace/trace.c:466 update_max_tr_single+0xcc/0xf3()
      [  169.318002] Hardware name: AMILO Li 2727
      [  169.318002] Modules linked in:
      [  169.318002] Pid: 5624, comm: bash Not tainted 2.6.29-rc8-tip-02636-g6aafa6c #11
      [  169.318002] Call Trace:
      [  169.318002]  [<ffffffff81036182>] warn_slowpath+0xea/0x13d
      [  169.318002]  [<ffffffff8100b9d6>] ? ftrace_call+0x5/0x2b
      [  169.318002]  [<ffffffff8100b9d6>] ? ftrace_call+0x5/0x2b
      [  169.318002]  [<ffffffff8100b9d1>] ? ftrace_call+0x0/0x2b
      [  169.318002]  [<ffffffff8101ef10>] ? ftrace_modify_code+0xa9/0x108
      [  169.318002]  [<ffffffff8106e27f>] ? trace_hardirqs_off+0x25/0x27
      [  169.318002]  [<ffffffff8149afe7>] ? _spin_unlock_irqrestore+0x1f/0x2d
      [  169.318002]  [<ffffffff81064f52>] ? ring_buffer_reset_cpu+0xf6/0xfb
      [  169.318002]  [<ffffffff8106637c>] ? ring_buffer_reset+0x36/0x48
      [  169.318002]  [<ffffffff8106aeda>] update_max_tr_single+0xcc/0xf3
      [  169.318002]  [<ffffffff8100bc17>] ? sysret_check+0x22/0x5d
      [  169.318002]  [<ffffffff8106e3ea>] stop_critical_timing+0x142/0x204
      [  169.318002]  [<ffffffff8106e4cf>] trace_hardirqs_on_caller+0x23/0x25
      [  169.318002]  [<ffffffff8149ac28>] trace_hardirqs_on_thunk+0x3a/0x3c
      [  169.318002]  [<ffffffff8100bc17>] ? sysret_check+0x22/0x5d
      [  169.318002] ---[ end trace db76cbf775a750cf ]---
      
      The warning occurs because this tracer may try to swap two cpu ring
      buffers for a cpu that is not registered on the ring buffer.
      
      This patch might also fix a fair loss of traces due to unallocated buffers
      for secondary cpus.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1237470453-5427-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • function-graph: consolidate prologues for output · ac5f6c96
      Steven Rostedt authored
      Impact: clean up
      
      The prologue of the function graph entry, return and comments all
      start out pretty much the same. Each of these duplicates code and
      does so slightly differently.
      
      This patch consolidates the printing of the pid, absolute time,
      cpu and proc (and for entry, the interrupt).
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • ftrace: protect running nmi (V3) · e9d9df44
      Lai Jiangshan authored
      While reviewing the sensitive code in ftrace_nmi_enter(), I found that
      the atomic variable nmi_running does protect NMI vs. do_ftrace_mod_code(),
      but it cannot protect NMI (already entered) vs. NMI (ftrace_nmi_enter()).
      
      cpu#1                   | cpu#2                 | cpu#3
      ftrace_nmi_enter()      | do_ftrace_mod_code()  |
        not modify            |                       |
      ------------------------|-----------------------|--
      executing               | set mod_code_write = 1|
      executing             --|-----------------------|--------------------
      executing               |                       | ftrace_nmi_enter()
      executing               |                       |    do modify
      ------------------------|-----------------------|-----------------
      ftrace_nmi_exit()       |                       |
      
      cpu#3 may modify code that is still being executed on cpu#1, which
      has undefined results and can possibly take a GPF. This patch
      prevents that from happening.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <49C0B411.30003@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
  5. 18 Mar, 2009 10 commits