1. 09 Aug, 2009 23 commits
    • perf_counter: Fix a race on perf_counter_ctx · 3a80b4a3
      Peter Zijlstra authored
      While extending perfcounters with BTS hw-tracing, Markus
      Metzger managed to trigger this warning:
      
         [  995.557128] WARNING: at kernel/perf_counter.c:1191 __perf_counter_task_sched_out+0x48/0x6b()
      
      It triggers because commit 9f498cc5 (perf_counter: Full task
      tracing) moved the clearing of tsk->perf_counter_ctxp out from
      under ctx->lock, which introduced a race (against
      perf_lock_task_context).
      
      Move it back and deal with the exit notification by explicitly
      passing along the former task context.
      Reported-by: Markus T Metzger <markus.t.metzger@intel.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249667341.17467.5.camel@twins>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
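To illustrate the idea behind the fix in 3a80b4a3 (clear the task's context pointer while holding the context lock, and hand the former context to whoever still needs it), here is a minimal user-space sketch. The struct names, the spinlock built on `atomic_flag`, and `task_detach_ctx()` are all hypothetical stand-ins, not the kernel's actual code, which also relies on RCU and retry logic that is omitted here.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures named in the commit. */
struct ctx {
    atomic_flag lock;   /* stands in for ctx->lock */
    int refcount;
};

struct task {
    struct ctx *ctxp;   /* stands in for tsk->perf_counter_ctxp */
};

static void ctx_lock(struct ctx *c)
{
    while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
        ; /* spin until the lock is free */
}

static void ctx_unlock(struct ctx *c)
{
    atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/*
 * Detach the context from the task while holding the context lock, and
 * return the former context so the caller can still use it afterwards
 * (for example, to send an exit notification).
 */
static struct ctx *task_detach_ctx(struct task *t)
{
    struct ctx *c = t->ctxp;

    if (!c)
        return NULL;

    ctx_lock(c);
    t->ctxp = NULL;     /* cleared under the lock, not outside it */
    ctx_unlock(c);

    return c;
}
```

The point of returning the old pointer is that later code no longer has to read it back from the task, where it has already been cleared.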
    • perf_counter: Fix tracepoint sampling to be part of generic sampling · 3a43ce68
      Frederic Weisbecker authored
      Based on Peter's comments, make tracepoint sampling generic
      just like all the other sampling bits are. This is a rename
      with no code changes:
      
      - PERF_SAMPLE_TP_RECORD to PERF_SAMPLE_RAW
      - struct perf_tracepoint_record to perf_raw_record
      
      We want the mechanism that transports raw tracepoint sample
      events into the perf ring buffer to be generalized and usable
      by any type of counter.
      
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249698400-5441-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter: Work around gcc warning by initializing tracepoint record unconditionally · 10b8e306
      Frederic Weisbecker authored
      Although the tracepoint record is always present when the
      PERF_SAMPLE_TP_RECORD flag is set, gcc raises a warning,
      thinking it might not be initialized:
      
        kernel/perf_counter.c: In function ‘perf_counter_output’:
        kernel/perf_counter.c:2650: warning: ‘tp’ may be used uninitialized in this function
      
      So initialize it to NULL and always check that it is not NULL
      before dereferencing it.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249698400-5441-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
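The pattern described in 10b8e306 (initialize the pointer to NULL unconditionally, then check it before every dereference) can be sketched as follows. The flag value, `struct tp_record`, and `sample_payload_size()` are hypothetical names chosen for illustration; this is not the body of perf_counter_output().

```c
#include <stddef.h>

#define SAMPLE_TP_RECORD 0x1u   /* hypothetical flag value, for illustration */

struct tp_record {              /* hypothetical stand-in for the record type */
    unsigned int size;
};

/*
 * Return the number of payload bytes to emit. The pointer is only assigned
 * when the flag is set, so it starts out as NULL and is checked before use.
 */
static unsigned int sample_payload_size(unsigned int sample_type,
                                        struct tp_record *private)
{
    struct tp_record *tp = NULL;    /* unconditional initialization */
    unsigned int size = 0;

    if (sample_type & SAMPLE_TP_RECORD)
        tp = private;

    if (tp)                         /* never dereference without the check */
        size += tp->size;

    return size;
}
```

With the initializer in place the compiler can see that every path assigns `tp` before it is read, which is what silences a "may be used uninitialized" diagnostic.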
    • perf tools: callchain: Fix sum of percentages to be 100% by displaying amount of ignored chains in fractal mode · 25446036
      Frederic Weisbecker authored
      When we filter the callchains below a given percentage, we
      ignore them and the end result only shows entries whose
      percentage is above the filter threshold.
      
      It seems to users then that we have an imbalance in the
      percentage, as if the sum inside a profiled branch doesn't
      reach 100%.
      
      Since in the past there have been real perf report bugs that
      showed the same symptom, it would be nice to assure the user
      that the data is complete and trustworthy and that it all sums
      up to 100.00%.
      
      So fix this by displaying the remaining hits that have been
      filtered out, with no more detail than their amount in each
      branch. Example while filtering below 50%:
      
      7.73%  [k] delay_tsc
                      |
                      |--98.22%-- __const_udelay
                      |          |
                      |          |--86.37%-- ath5k_hw_register_timeout
                      |          |          ath5k_hw_noise_floor_calibration
                      |          |          ath5k_hw_reset
                      |          |          ath5k_reset
                      |          |          ath5k_config
                      |          |          ieee80211_hw_config
                      |          |          |
                      |          |          |--88.53%-- ieee80211_scan_work
                      |          |          |          worker_thread
                      |          |          |          kthread
                      |          |          |          child_rip
                      |          |           --11.47%-- [...]
                      |           --13.63%-- [...]
                       --1.78%-- [...]
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1249690585-9145-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
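The accounting behind the "[...]" lines in 25446036 can be sketched like this: children below the threshold are not printed individually, but their hits are summed so the branch still adds up to 100%. The function name and signature are assumptions made for this sketch, not the perf tools' actual code.

```c
#include <stddef.h>

/*
 * Given the hit counts of the children of one branch, count how many
 * children reach the filter threshold (in percent of the branch total)
 * and store the hits of all filtered-out children in *remaining.
 */
static size_t filter_children(const unsigned long *hits, size_t n,
                              double min_percent, unsigned long *remaining)
{
    unsigned long total = 0;
    size_t shown = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += hits[i];

    *remaining = 0;
    for (i = 0; i < n; i++) {
        double percent = total ? 100.0 * hits[i] / total : 0.0;

        if (percent >= min_percent)
            shown++;
        else
            *remaining += hits[i];  /* reported as a single "[...]" line */
    }
    return shown;
}
```

For a branch with children at 86.37% and 13.63% and a 50% filter, as in the example output, one child is shown and the other ends up in the remainder.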
    • perf tools: callchain: Fix 'perf report' display to be callchain by default · b1a88349
      Frederic Weisbecker authored
      If we recorded with -g option to record the callchain, right now
      we require a -g option to perf report as well - and people reported
      this as unnecessary complication: the user already specified -g
      once, no need to require it a second time.
      
      So if the recording includes call-chains, display the callchain by
      default from perf report.
      
      ( The user can override this default using "-g none" option from
        perf report. )
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1249690585-9145-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: callchain: Fix spurious 'perf report' warnings: ignore empty callchains · b0efe213
      Frederic Weisbecker authored
      When the callchain tree comes to insert an empty backtrace, it
      raises a warning that we are inserting an empty chain. The
      warning is spurious: the radix tree assumes it did something
      wrong to reach this state, but it didn't; we just met an empty
      callchain that has to be ignored.
      
      This happens occasionally with certain types of call-chain
      recordings. If it happens it's a big nuisance as perf report
      output starts with thousands of warning lines.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1249690585-9145-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf record: Fix the -A UI for empty or non-existent perf.data · 266e0e21
      Pierre Habouzit authored
      1. Ignore the -A argument if there is no perf.data file.
      2. Treat an empty file like a non-existent file.
      
      Else, perf will try to read the perf.data header, and fail with
      an error.
      
      Treating an empty file like a non-existent file makes sense,
      since an interrupted (as in SIGKILLed) perf could leave such
      files around, and you don't want to annoy the user with errors
      for files with no data in them.
      Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
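The two rules from 266e0e21 (ignore -A when the file is missing, and treat an empty file like a missing one) reduce to a single check before deciding to append. The helper name `can_append_to()` is hypothetical; this is a sketch of the decision only, not perf record's implementation.

```c
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Decide whether append mode is usable: only when the output file exists
 * and already contains data. A missing or empty file means there is no
 * header to read, so the caller should fall back to creating a fresh file.
 */
static bool can_append_to(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return false;       /* non-existent (or inaccessible) file */

    return st.st_size > 0;  /* an empty file is treated like a missing one */
}
```

A caller would then keep its append flag only if `can_append_to("perf.data")` returns true, and otherwise proceed as if -A had not been given.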
    • perf util: Fix do_read() to fail on EOF instead of busy-looping · 7eac7e9e
      Pierre Habouzit authored
      While toying with perf, I've noticed that perf record can
      easily enter a busy loop when doing something as silly as:
      
          $ perf record -A ls
      
      Yeah, do_read here really wants to read a known size, not being
      able to should die(), not busy-loop ;)
      
      That was the cause for the bug.
      Signed-off-by: Pierre Habouzit <pierre.habouzit@intersec.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
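The busy loop described in 7eac7e9e is the classic symptom of a fixed-size read loop that does not treat a return value of 0 from read() as end-of-file. A minimal sketch of a loop that fails instead is shown below. The name `read_exact()` is hypothetical, and it returns -1 rather than calling die() as the commit message suggests, so that it stays self-contained.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Read exactly `size` bytes from fd into buf. Returns 0 on success and -1
 * on error or when end-of-file arrives before the buffer is full.
 */
static int read_exact(int fd, void *buf, size_t size)
{
    char *p = buf;

    while (size > 0) {
        ssize_t ret = read(fd, p, size);

        if (ret < 0) {
            if (errno == EINTR)
                continue;   /* interrupted, retry */
            return -1;
        }
        if (ret == 0)
            return -1;      /* EOF: retrying would loop forever */

        p += ret;
        size -= (size_t)ret;
    }
    return 0;
}
```

Without the `ret == 0` branch, an empty or truncated file makes the loop call read() forever, because neither `p` nor `size` ever changes.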
    • perf list: Fix the output to not include tracepoints without an id · ae07b63f
      Peter Zijlstra authored
      Stop perf list from displaying tracepoints without an id file.
      Those are special tracepoints that are not interfaced to
      perfcounters, so listing them is erroneous and passing them as
      events produces no output.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Chris Mason <chris.mason@oracle.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter/powerpc: Fix oops on cpus without perf_counter hardware support · f36a1a13
      Paul Mackerras authored
      If we have the powerpc perf_counter backend compiled in, but
      the cpu we are running on is one where we don't support the
      PMU, we currently oops in hw_perf_group_sched_in if we try to
      use any counters, because ppmu is NULL in that case, and we
      unconditionally dereference ppmu.
      
      This fixes the problem by adding a check if ppmu is NULL at the
      beginning of hw_perf_group_sched_in, and also at the beginning
      of the other functions that get called from the perf_counter
      core, i.e. hw_perf_disable, hw_perf_enable, and
      hw_perf_counter_setup.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: benh@kernel.crashing.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
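The guard described in f36a1a13 amounts to an early return at the top of every entry point the generic core may call. The sketch below shows the shape of such a guard. `struct pmu_desc`, `group_sched_in()`, and the return values are simplified assumptions for illustration, not the powerpc backend's real definitions.

```c
#include <stddef.h>

/* Hypothetical, much-reduced PMU description; NULL means "no supported PMU". */
struct pmu_desc {
    int n_counter;
};

static struct pmu_desc *ppmu;   /* stays NULL on CPUs without PMU support */

/*
 * Sketch of an entry point called by the generic core: bail out early
 * instead of dereferencing a NULL backend pointer.
 */
static int group_sched_in(int n_requested)
{
    if (!ppmu)
        return 0;               /* nothing to schedule on this CPU */

    if (n_requested > ppmu->n_counter)
        return -1;              /* does not fit on the hardware */

    return 1;
}
```

The same two-line check would be repeated in each of the other entry points the message lists, since any of them can be reached before the backend has been set up.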
    • perf stat: Fix tool option consistency: rename -S/--scale to -c/--scale · b26bc5a7
      Brice Goglin authored
      We want to use a coherent flag for -S/--stat across all tools,
      so free up -S in perf stat.
      Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf report: Add debug help for the finding of symbol bugs - show the symtab origin (DSO, build-id, kernel, etc) · 94cb9e38
      Arnaldo Carvalho de Melo authored
      Used with perf report --verbose:
      
      [acme@doppio linux-2.6-tip]$ perf report -v | head -16
           5.17%  firefox  /usr/lib64/xulrunner-1.9.1/libxul.so   0x00000000005d8eee f [.] imgContainer::DrawFrameTo(gfxIImageFrame*, gfxIImageFrame*, nsRect&)
           2.56%  firefox  /lib64/libpthread-2.10.1.so            0x0000000000008e02 d [.] __pthread_mutex_lock_internal
           1.94%  firefox  /usr/lib64/xulrunner-1.9.1/libxul.so   0x0000000000d0af8f f [.] SearchTable
           1.75%  firefox  [kernel]                               0xffffffffff60013b k [.] vread_hpet
           1.63%  firefox  /lib64/libpthread-2.10.1.so            0x000000000000a404 d [.] __pthread_mutex_unlock
           1.47%  firefox  /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000482ea f [.] js_Interpret
           1.42%  firefox  /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x000000000003eda3 f [.] JS_CallTracer
           1.24%  firefox  [kernel]                               0xffffffff8102ca4a k [k] read_hpet
           1.16%  firefox  [kernel]                               0xffffffff810f3dd4 k [k] fget_light
           1.11%  firefox  /usr/lib64/xulrunner-1.9.1/libmozjs.so 0x00000000000567ff f [.] js_TraceObject
           0.98%  firefox  /usr/lib64/firefox-3.5.2/firefox       0x000000000000dd23 b [.] arena_ralloc
      [acme@doppio linux-2.6-tip]$
      
      The new field comes just after the symbol address, to help in
      figuring out symbol resolution bugs.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf report: Fix per task multi-counter stat reporting · 8f18aec5
      Peter Zijlstra authored
      Brice Goglin reported:
      
      > I can easily sort them by thread id, but I don't know how to match
      > my 4 events with each group of 4 lines.
      
      Also report the counter id and the time running/enabled
      stats (in case the counter got time-shared).
      Reported-by: Brice Goglin <Brice.Goglin@inria.fr>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Tested-by: Brice Goglin <Brice.Goglin@inria.fr>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Fix multi-counter stat bug caused by incorrect reading of perf.data file header · 1c222bce
      Peter Zijlstra authored
      Brice Goglin reported that only the first result from a
      multi-counter perf record --stat run is accurate, the
      rest looks bogus.
      
      A silly mistake made us re-read the first attribute for
      every recorded attribute.
      Reported-by: Brice Goglin <Brice.Goglin@inria.fr>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Tested-by: Brice Goglin <Brice.Goglin@inria.fr>
      Cc: paulus@samba.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf tools: Fix call-chain cumul hit based sub-total (fractal mode) · 1953287b
      Frederic Weisbecker authored
      The callchain fractal mode builds each new total hits in a new
      branch of profiling by using the parent's hits of the current
      branch plus the hits of the children.
      
      This is wrong: the total hits of a branch should be the sum of
      its children's hits, and the parent's hits must be ignored in
      this scope.
      
      This patch also fixes another mistake with the hit counting.
      
      Now the rates are correct.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
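The rule stated in 1953287b (the 100% base for a branch's children is the sum of the children's hits, without the parent's own hits) can be sketched on a reduced tree. `struct node` and both helpers are hypothetical simplifications, not the perf tools' callchain structures.

```c
#include <stddef.h>

/* Hypothetical reduced callchain node: own hits plus a list of children. */
struct node {
    unsigned long hit;          /* hits that ended exactly at this node */
    struct node *children;      /* first child */
    struct node *next;          /* next sibling */
};

/* Hits of a node including everything below it. */
static unsigned long cumul_hits(const struct node *n)
{
    unsigned long sum = n->hit;
    const struct node *c;

    for (c = n->children; c; c = c->next)
        sum += cumul_hits(c);
    return sum;
}

/*
 * Total used as the 100% base for the children of `parent`: the sum of the
 * children's cumulative hits, without the parent's own hits.
 */
static unsigned long children_total(const struct node *parent)
{
    unsigned long sum = 0;
    const struct node *c;

    for (c = parent->children; c; c = c->next)
        sum += cumul_hits(c);
    return sum;
}
```

If the parent's own hits were included in the base, the children's percentages could never add up to 100% whenever the parent itself had been sampled.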
    • perf top: Update man page · 83617983
      Mike Galbraith authored
      perf_counter tools: update perf top manual page to reflect
      current implementation.
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf top: Improve interactive key handling · 091bd2e9
      Mike Galbraith authored
      Pressing any key which is not currently mapped to
      functionality, based on startup command line options, displays
      currently mapped keys, and prompts for input.
      
      Pressing any unmapped key at the prompt returns the user to
      display mode with variables unchanged.  eg, pressing ? <SPACE>
      <ESC> etc displays currently available keys, the value of the
      variable associated with that key, and prompts.
      
      Pressing the same key again aborts input.
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter: Fix software counters for fast moving event sources · 7b4b6658
      Peter Zijlstra authored
      Reimplement the software counters to deal with fast moving
      event sources (such as tracepoints). This means being able
      to generate multiple overflows from a single 'event' as well
      as support throttling.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
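"Multiple overflows from a single event" in 7b4b6658 means that one call which adds many counts may cross the sampling period more than once. The arithmetic can be sketched as below. `struct sw_counter` and `sw_counter_add()` are hypothetical, and the sketch leaves out throttling and the atomic operations a real implementation would need.

```c
#include <stdint.h>

struct sw_counter {
    uint64_t period;    /* events per overflow; must be non-zero */
    uint64_t pending;   /* events counted since the last overflow */
};

/*
 * Account `nr` events at once and return how many overflows (samples)
 * this generates. A single call can cross the period several times.
 */
static uint64_t sw_counter_add(struct sw_counter *c, uint64_t nr)
{
    uint64_t overflows;

    c->pending += nr;
    overflows = c->pending / c->period;
    c->pending %= c->period;

    return overflows;
}
```

With a period of 10, adding 35 events in one call yields three overflows and leaves 5 events pending toward the next one.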
    • perf_counter tools: Allow perf top users to switch between weighted and individual counter display · 46ab9764
      Mike Galbraith authored
      Add [w]eighted hotkey.  Pressing [w] toggles between displaying
      weighted total of all counters, and the counter selected via
      [E]vent select key.
      
      ------------------------------------------------------------------------------
         PerfTop:   90395 irqs/sec  kernel:16.1% [cache-misses/cache-references/instructions],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
        weight     samples    pcnt         RIP          kernel function
        ______     _______   _____   ________________   _______________
      
      1275408.6      10881 -  5.3% - ffffffff81146f70 : copy_page_c
       553683.4      43569 - 21.3% - ffffffff81146f20 : clear_page_c
        74075.0       6768 -  3.3% - ffffffff81147190 : copy_user_generic_string
        40602.9       7538 -  3.7% - ffffffff81284ba2 : _spin_lock
        26882.1        965 -  0.5% - ffffffff8109d280 : file_ra_state_init
      
      [w]
      
      ------------------------------------------------------------------------------
         PerfTop:   91221 irqs/sec  kernel:14.5% [10000Hz cache-misses],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
        weight     samples    pcnt         RIP          kernel function
        ______     _______   _____   ________________   _______________
      
                  47320.00 - 22.3% - ffffffff81146f20 : clear_page_c
                  14261.00 -  6.7% - ffffffff810992f5 : __rmqueue
                  11046.00 -  5.2% - ffffffff81146f70 : copy_page_c
                   7842.00 -  3.7% - ffffffff81284ba2 : _spin_lock
                   7234.00 -  3.4% - ffffffff810aa1d6 : unmap_vmas
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Fix/resurrect perf top annotation in a simple interactive form · 923c42c1
      Mike Galbraith authored
      perf top used to have annotation support, but it bitrotted and
      was removed.
      
      This patch restores that: it allows the user to select any symbol
      in kernel space for source level annotation on the fly, switch
      between event counters and alter display variables. When symbol
      details are being displayed, stopping annotation reverts to normal.
      
      known keys:
              [d]     select display delay.
              [e]     select display entries (lines).
              [E]     select annotation event counter.
              [f]     select normal display count filter.
              [F]     select annotation display count filter (percentage).
              [qQ]    quit.
              [s]     select annotation symbol and start annotation.
              [S]     stop annotation, revert to normal display.
              [z]     toggle event count zeroing.
      
      Sample:
      ------------------------------------------------------------------------------
         PerfTop:   16719 irqs/sec  kernel:78.7% [cache-misses/cache-references/instructions/cycles],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
      Showing cache-misses for e1000_clean_rx_irq
        Events  Pcnt (>=3%)
             0  0.0%                  /* adjust length to remove Ethernet CRC */
             0  0.0%                  if (!(adapter->flags2 & FLAG2_CRC_STRIPPING))
             0  0.0%                          length -= 4;
           436  5.0%      f039:       41 f6 84 24 5c 29 00    testb  $0x1,0x295c(%r12)
             0  0.0%      f089:       8b 4d 84                mov    -0x7c(%rbp),%ecx
             0  0.0%      f08c:       48 83 ef 02             sub    $0x2,%rdi
             0  0.0%      f090:       48 83 ee 02             sub    $0x2,%rsi
           811  9.3%      f094:       f3 a4                   rep movsb %ds:(%rsi),%es:(%rdi)
             0  0.0%
             0  0.0%          while (rx_desc->status & E1000_RXD_STAT_DD) {
             0  0.0%      f114:       41 f6 47 0c 01          testb  $0x1,0xc(%r15)
          7226 82.6%      f119:       0f 85 24 fe ff ff       jne    ef43 <e1000_clean_rx_irq+0x84>
      
      Available events:
              0 cache-misses
              1 cache-references
              2 instructions
              3 cycles
      Enter details event counter: 2
      ------------------------------------------------------------------------------
         PerfTop:   15035 irqs/sec  kernel:79.0% [cache-misses/cache-references/instructions/cycles],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
      Showing instructions for e1000_clean_rx_irq
        Events  Pcnt (>=3%)
             0  0.0%                                 int *work_done, int work_to_do)
             0  0.0%  {
           175  0.9%      eebf:       55                      push   %rbp
          1898  9.8%      eec0:       48 89 e5                mov    %rsp,%rbp
             0  0.0%
             0  0.0%          i = rx_ring->next_to_clean;
           140  0.7%      ef0a:       0f b7 41 1a             movzwl 0x1a(%rcx),%eax
           670  3.4%      ef0e:       89 45 ac                mov    %eax,-0x54(%rbp)
             0  0.0%  {
             0  0.0%          memcpy(skb->data + offset, from, len);
            91  0.5%      f07b:       49 8b b6 e8 00 00 00    mov    0xe8(%r14),%rsi
          1153  5.9%      f082:       48 8b b8 e8 00 00 00    mov    0xe8(%rax),%rdi
            42  0.2%      f089:       8b 4d 84                mov    -0x7c(%rbp),%ecx
            14  0.1%      f08c:       48 83 ef 02             sub    $0x2,%rdi
             0  0.0%      f090:       48 83 ee 02             sub    $0x2,%rsi
          1618  8.3%      f094:       f3 a4                   rep movsb %ds:(%rsi),%es:(%rdi)
             0  0.0%
             0  0.0%                  /* return some buffers to hardware, one at a time is too slow */
             0  0.0%                  if (cleaned_count >= E1000_RX_BUFFER_WRITE) {
           867  4.5%      f0e7:       83 7d b0 0f             cmpl   $0xf,-0x50(%rbp)
             0  0.0%
             0  0.0%          while (rx_desc->status & E1000_RXD_STAT_DD) {
            37  0.2%      f114:       41 f6 47 0c 01          testb  $0x1,0xc(%r15)
          4047 20.8%      f119:       0f 85 24 fe ff ff       jne    ef43 <e1000_clean_rx_irq+0x84>
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter: Fix/complete ftrace event records sampling · f413cdb8
      Frederic Weisbecker authored
      This patch implements the kernel side support for ftrace event
      record sampling.
      
      A new counter sampling attribute is added:
      
         PERF_SAMPLE_TP_RECORD
      
      which requests ftrace events record sampling. In this case
      if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
      fires, we emit the tracepoint binary record to the
      perfcounter event buffer, as a sample.
      
      Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
      record:
      
       perf record -f -F 1 -a -e workqueue:workqueue_execution
       perf report -D
      
       0x21e18 [0x48]: event: 9
       .
       . ... raw event: size 72 bytes
       .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
       .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
       .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
       .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
       .  0040:  e0 b1 31 81 ff ff ff ff                          .......
      .
      0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
      
      The raw ftrace binary record starts at offset 0020.
      
      Translation:
      
       struct trace_entry {
      	type		= 0x2b = 43;
      	flags		= 1;
      	preempt_count	= 2;
      	pid		= 0xa = 10;
      	tgid		= 0xa = 10;
       }
      
       thread_comm = "events/1"
       thread_pid  = 0xa = 10;
       func	    = 0xffffffff8131b1e0 = flush_to_ldisc()
      
      What will come next?
      
       - Userspace support ('perf trace'), 'flight data recorder' mode
         for perf trace, etc.
      
       - The unconditional copy from the profiling callback has a
         cost even when no such sampling is wanted; this needs to be
         fixed in the future. For that we need instant access to the
         perf counter attribute, which is a matter of adding a flag
         to struct ftrace_event.

       - Take care of event recursion! Don't ever try to record a
         lock event, for example: it seems some locking is used in
         the profiling fast path, which leads to tracing recursion.
         That will be fixed using a raw spinlock or recursion
         protection.
      
       - [...]
      
       - Profit! :-)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
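The "Translation" in f413cdb8 can be reproduced by decoding the dumped bytes from offset 0020 onward. The sketch below does so with explicit little-endian assembly. The byte array is copied from the hex dump in the message, while the field layout (a 16-byte thread_comm followed by a 4-byte pid and an 8-byte function pointer) is an assumption inferred from that dump, and `struct decoded`, `le()`, and `decode_record()` are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Bytes 0x20..0x47 of the raw event dump, copied from the example. */
static const uint8_t raw[] = {
    0x2b, 0x00, 0x01, 0x02, 0x0a, 0x00, 0x00, 0x00,
    0x0a, 0x00, 0x00, 0x00, 0x65, 0x76, 0x65, 0x6e,
    0x74, 0x73, 0x2f, 0x31, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x00, 0x00,
    0xe0, 0xb1, 0x31, 0x81, 0xff, 0xff, 0xff, 0xff,
};

struct decoded {
    uint16_t type;
    uint8_t flags;
    uint8_t preempt_count;
    int32_t pid;
    int32_t tgid;
    char thread_comm[17];
    int32_t thread_pid;
    uint64_t func;
};

/* Assemble an n-byte little-endian integer (the dump is from x86-64). */
static uint64_t le(const uint8_t *p, int n)
{
    uint64_t v = 0;
    int i;

    for (i = n - 1; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

static struct decoded decode_record(const uint8_t *p)
{
    struct decoded d;

    d.type = (uint16_t)le(p, 2);
    d.flags = p[2];
    d.preempt_count = p[3];
    d.pid = (int32_t)le(p + 4, 4);
    d.tgid = (int32_t)le(p + 8, 4);
    memcpy(d.thread_comm, p + 12, 16);  /* 16-byte comm field (assumed size) */
    d.thread_comm[16] = '\0';
    d.thread_pid = (int32_t)le(p + 28, 4);
    d.func = le(p + 32, 8);
    return d;
}
```

Calling `decode_record(raw)` gives the values listed in the translation: type 43, flags 1, preempt_count 2, pid and tgid 10, thread_comm "events/1", and func 0xffffffff8131b1e0.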
    • perf_counter, ftrace: Fix perf_counter integration · 3a659305
      Peter Zijlstra authored
      Add an optional second part to the assign argument of TP_EVENT():
      
        TP_perf_assign(
      	__perf_count(foo);
      	__perf_addr(bar);
        )
      
      When specified, this makes the swcounter increment by @foo instead
      of the usual 1, and report @bar for PERF_SAMPLE_ADDR (data address
      associated with the event) when this triggers a counter overflow.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Merge branch 'linus' into tracing/urgent · e3560336
      Ingo Molnar authored
      Merge reason: Merge up to almost-rc6 to pick up latest perfcounters
                    (on which we'll queue up a dependent fix)
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 08 Aug, 2009 7 commits
  3. 07 Aug, 2009 10 commits