Commits · ee06094f8279e1312fc0a31591320cc7b6f0ab1e · linux / linux-davinci

14 Dec, 2008 2 commits

perfcounters: restructure x86 counter math · ee06094f

Ingo Molnar authored Dec 13, 2008

Impact: restructure code

Change counter math from absolute values to clear delta logic.

We try to extract elapsed deltas from the raw hw counter - and put
that into the generic counter.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ee06094f

x86: implement atomic64_t on 32-bit · 9b194e83

Ingo Molnar authored Dec 14, 2008

Impact: new API

Implement the atomic64_t APIs on 32-bit as well. Will be used by
the performance counters code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

9b194e83

12 Dec, 2008 3 commits
- Merge branch 'x86/irq' into perfcounters/core · 92bf73e9
  Ingo Molnar authored Dec 12, 2008
```
( with manual semantic merge of arch/x86/kernel/cpu/perf_counter.c )
```
  92bf73e9
- x86: hardirq: introduce inc_irq_stat() · 915b0d01
  Hiroshi Shimamoto authored Dec 08, 2008
```
Impact: cleanup

Introduce inc_irq_stat() macro and unify irq_stat accounting code.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
```
  915b0d01
- Merge commit 'v2.6.28-rc8' into x86/irq · fd109027
  Ingo Molnar authored Dec 12, 2008
  
  fd109027
11 Dec, 2008 13 commits

perf counters: update docs · 447557ac
Ingo Molnar authored Dec 11, 2008
```
Impact: update docs
Signed-off-by: Ingo Molnar <mingo@elte.hu>
```
447557ac

perf counters: clean up state transitions · 6a930700

Ingo Molnar authored Dec 11, 2008

Impact: cleanup

Introduce a proper enum for the 3 states of a counter:

	PERF_COUNTER_STATE_OFF		= -1
	PERF_COUNTER_STATE_INACTIVE	=  0
	PERF_COUNTER_STATE_ACTIVE	=  1

and rename counter->active to counter->state and propagate the
changes everywhere.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

6a930700

perf counters: add prctl interface to disable/enable counters · 1d1c7ddb

Ingo Molnar authored Dec 11, 2008

Add a way for self-monitoring tasks to disable/enable counters summarily,
via a prctl:

	PR_TASK_PERF_COUNTERS_DISABLE		31
	PR_TASK_PERF_COUNTERS_ENABLE		32
Signed-off-by: Ingo Molnar <mingo@elte.hu>

1d1c7ddb

perf counters: implement PERF_COUNT_TASK_CLOCK · bae43c99

Ingo Molnar authored Dec 11, 2008

Impact: add new perf-counter type

The 'task clock' counter counts the amount of time a task is executing,
in nanoseconds. It stops ticking when a task is scheduled out either due
to it blocking, sleeping or it being preempted.

This counter type is a Linux kernel based abstraction, it is available
even if the hardware does not support native hardware performance counters.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

bae43c99

perf counters: consolidate hw_perf save/restore APIs · 01b2838c

Ingo Molnar authored Dec 11, 2008

Impact: cleanup

Rename them to better match up the usual IRQ disable/enable APIs:

 hw_perf_disable_all()  => hw_perf_save_disable()
 hw_perf_restore_ctrl() => hw_perf_restore()
Signed-off-by: Ingo Molnar <mingo@elte.hu>

01b2838c

perf counters: implement PERF_COUNT_CPU_CLOCK · 5c92d124

Ingo Molnar authored Dec 11, 2008

Impact: add new perf-counter type

The 'CPU clock' counter counts the amount of CPU clock time that is
elapsing, in nanoseconds. (regardless of how much of it the task is
spending on a CPU executing)

This counter type is a Linux kernel based abstraction, it is available
even if the hardware does not support native hardware performance counters.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

5c92d124

perf counters: hw driver API · 621a01ea

Ingo Molnar authored Dec 11, 2008

Impact: restructure code, introduce hw_ops driver abstraction

Introduce this abstraction to handle counter details:

 struct hw_perf_counter_ops {
	void (*hw_perf_counter_enable)	(struct perf_counter *counter);
	void (*hw_perf_counter_disable)	(struct perf_counter *counter);
	void (*hw_perf_counter_read)	(struct perf_counter *counter);
 };

This will be useful to support assymetric hw details, and it will also
be useful to implement "software counters". (Counters that count kernel
managed sw events such as pagefaults, context-switches, wall-clock time
or task-local time.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>

621a01ea

perf counters: group counter, fixes · ccff286d

Ingo Molnar authored Dec 11, 2008

Impact: bugfix

Check that a group does not span outside the context of a CPU or a task.

Also, do not allow deep recursive hierarchies.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ccff286d

perf counters: add support for group counters · 04289bb9

Ingo Molnar authored Dec 11, 2008

Impact: add group counters

This patch adds the "counter groups" abstraction.

Groups of counters behave much like normal 'single' counters, with a
few semantic and behavioral extensions on top of that.

A counter group is created by creating a new counter with the open()
syscall's group-leader group_fd file descriptor parameter pointing
to another, already existing counter.

Groups of counters are scheduled in and out in one atomic group, and
they are also roundrobin-scheduled atomically.

Counters that are member of a group can also record events with an
(atomic) extended timestamp that extends to all members of the group,
if the record type is set to PERF_RECORD_GROUP.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

04289bb9

perf counters: restructure the API · 9f66a381

Ingo Molnar authored Dec 10, 2008

Impact: clean up new API

Thorough cleanup of the new perf counters API, we now get clean separation
of the various concepts:

 - introduce perf_counter_hw_event to separate out the event source details

 - move special type flags into separate attributes: PERF_COUNT_NMI,
   PERF_COUNT_RAW

 - extend the type to u64 and reserve it fully to the architecture in the
   raw type case.

And make use of all these changes in the core and x86 perfcounters code.

Also change the syscall signature to:

  asmlinkage int sys_perf_counter_open(

	struct perf_counter_hw_event	*hw_event_uptr		__user,
	pid_t				pid,
	int				cpu,
	int				group_fd);

( Note that group_fd is unused for now - it's reserved for the counter
  groups abstraction. )
Signed-off-by: Ingo Molnar <mingo@elte.hu>

9f66a381

perf counters: expand use of counter->event · dfa7c899

Thomas Gleixner authored Dec 08, 2008

Impact: change syscall, cleanup

Make use of the new perf_counters event type.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

dfa7c899

perf counters: clean up 'raw' type API · eab656ae

Thomas Gleixner authored Dec 08, 2008

Impact: cleanup

Introduce a separate hw_event type.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

eab656ae

perf counters: protect them against CSTATE transitions · 4ac13294

Thomas Gleixner authored Dec 09, 2008

Impact: fix rare lost events problem

There are CPUs whose performance counters misbehave on CSTATE transitions,
so provide a way to just disable/enable them around deep idle methods.

(hw_perf_enable_all() is cheap on x86.)
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

4ac13294

10 Dec, 2008 22 commits

Linux 2.6.28-rc8 · 8b1fae4e
Linus Torvalds authored Dec 10, 2008

8b1fae4e

Merge branch 'sched-fixes-for-linus' of... · f9fc05e7

Linus Torvalds authored Dec 10, 2008

Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: CPU remove deadlock fix

f9fc05e7

fix mapping_writably_mapped() · b88ed205

Hugh Dickins authored Dec 10, 2008

Lee Schermerhorn noticed yesterday that I broke the mapping_writably_mapped
test in 2.6.7! Bad bad bug, good good find.

The i_mmap_writable count must be incremented for VM_SHARED (just as
i_writecount is for VM_DENYWRITE, but while holding the i_mmap_lock)
when dup_mmap() copies the vma for fork: it has its own more optimal
version of __vma_link_file(), and I missed this out. So the count
was later going down to 0 (dangerous) when one end unmapped, then
wrapping negative (inefficient) when the other end unmapped.

The only impact on x86 would have been that setting a mandatory lock on
a file which has at some time been opened O_RDWR and mapped MAP_SHARED
(but not necessarily PROT_WRITE) across a fork, might fail with -EAGAIN
when it should succeed, or succeed when it should fail.

But those architectures which rely on flush_dcache_page() to flush
userspace modifications back into the page before the kernel reads it,
may in some cases have skipped the flush after such a fork - though any
repetitive test will soon wrap the count negative, in which case it will
flush_dcache_page() unnecessarily.

Fix would be a two-liner, but mapping variable added, and comment moved.
Reported-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b88ed205

Merge branch 'to-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland · f4fd2c5b
Linus Torvalds authored Dec 10, 2008
```
* 'to-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland:
  tracehook: exec double-reporting fix
```
f4fd2c5b

lib/idr.c: Fix bug introduced by RCU fix · 711a49a0

Manfred Spraul authored Dec 10, 2008

The last patch to lib/idr.c caused a bug if idr_get_new_above() was
called on an empty idr.

Usually, nodes stay on the same layer.  New layers are added to the top
of the tree.

The exception is idr_get_new_above() on an empty tree: In this case, the
new root node is first added on layer 0, then moved upwards.  p->layer
was not updated.

As usual: You shall never rely on the source code comments, they will
only mislead you.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

711a49a0

MN10300: Give correct size when reserving interrupt vector table · c7f8d6f6

Akira Takeuchi authored Dec 10, 2008

Give the correct size when reserving the interrupt vector table.  It should be
a page not a single byte.
Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c7f8d6f6

MN10300: Fix __put_user_asm8() · 54b71fba

Akira Takeuchi authored Dec 10, 2008

Fix __put_user_asm8() by jumping to the end label (3:) from the exception
handler, rather than jumping back to retry the second store instruction (label
2:).
Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

54b71fba

MN10300: Fix the preemption resume_kernel() routine · 24646bd2

Akira Takeuchi authored Dec 10, 2008

Fix the preemption resume_kernel() routine by inverting the test to see
whether interrupts are off (IM7 is all enabled, not all disabled).

Furthermore, interrupts should be disabled on entry to resume_kernel() so that
they're correctly set for jumping to restore_all() and doing the need
reschedule test.
Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

24646bd2

MN10300: Discard low-priority Tx interrupts when closing an on-chip serial port · a8893fb3

Akira Takeuchi authored Dec 10, 2008

Discard low-prioriy Tx interrupts when closing an MN10300 on-chip serial port.

The MN10300 on-chip serial port uses three interrupts to manage its serial
ports:

 (1) A very high priority interrupt that drives virtual DMA for Rx.

 (2) A very high priority interrupt that drives virtual DMA for Tx.

 (3) A normal priority virtual interrupt that does the normal UART interrupt
     stuff and is shared between Rx and Tx.

mn10300_serial_stop_tx() only disables the high priority Tx interrupt.  It
doesn't also disable the normal priority one because it is shared with Rx.

However, the high priority interrupt may interrupt local_irq_disabled()
sections, and so may have queued up a low priority virtual interrupt whilst the
UART driver is asking for the Tx interrupt to be disabled.

The result of this can be an oops when we try to process the interrupt in
mn10300_serial_transmit_interrupt() as port->uart.info and port->uart.info->tty
may have gone away.

To deal with this, if either of those pointers is NULL, we make sure the
high-priority Tx interrupt is disabled and discard the interrupt.  The low
priority interrupt is disabled by the mn10300_serial_pic irq_chip table.
Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a8893fb3

MN10300: vmlinux.lds.S cleanup - use PAGE_SIZE, PERCPU macros · cb32898c

Cyrill Gorcunov authored Dec 10, 2008

Include the linux/page.h header into the MN10300 kernel linker script thus
allowing us to use PAGE_SIZE macro instead of a numeric constant.

Also use the PERCPU macro instead of an explicit section definition.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cb32898c

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 4e6f2ba9

Linus Torvalds authored Dec 10, 2008

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: api - Disallow cryptomgr as a module if algorithms are built-in

4e6f2ba9

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 · 44f6cc31

Linus Torvalds authored Dec 10, 2008

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCIe: ASPM: Break out of endless loop waiting for PCI config bits to switch
  PCI: stop leaking 'slot_name' in pci_create_slot

44f6cc31

Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 · 061afe9f

Linus Torvalds authored Dec 10, 2008

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] SN: prevent IRQ retargetting in request_irq()
  [IA64] Fix section mismatch ioc3uart_init()/ioc3uart_submodule
  [IA64] Clear up section mismatch for ioc4_ide_attach_one.
  [IA64] Clear up section mismatch with arch_unregister_cpu()
  [IA64] Clear up section mismatch for sn_check_wars.
  [IA64] Updated the generic_defconfig to work with the 2.6.28-rc7 kernel.
  [IA64] Fix GRU compile error w/o CONFIG_HUGETLB_PAGE
  [IA64] eliminate NULL test and memset after alloc_bootmem
  [IA64] remove BUILD_BUG_ON from paravirt_getreg()

061afe9f

Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus · 942c88cc

Linus Torvalds authored Dec 10, 2008

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  MIPS: Better than nothing implementation of PCI mmap to fix X.

942c88cc

pktcdvd: remove broken dev_t export of class devices · cba76717

Kay Sievers authored Dec 06, 2008

The pktcdvd created class devices only export some sysfs files,
but have no char dev_t registered in the driver.

At class device creation time they copy the dev_t value of the
block device to the char device, wich will register a new char
device in the driver core and userspace, with a conflicting dev_t
value.

In many cases the class devices dev_t just points to a random
USB device. This fixes the sysfs "duplicate entry" errors.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Peter Osterlund <petero2@telia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cba76717

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 · cdcb30b5

Linus Torvalds authored Dec 10, 2008

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  firewire: fw-ohci: fix IOMMU resource exhaustion
  ieee1394: node manager causes up to ~3.25s delay in freezing tasks

cdcb30b5

drivers/video/mb862xx/mb862xxfb.c: fix printk · c1ab6cc6

Andrew Morton authored Dec 09, 2008

sparc64:

drivers/video/mb862xx/mb862xxfb.c:929: warning: long long unsigned int format, resource_size_t arg (arg 4)
drivers/video/mb862xx/mb862xxfb.c:931: warning: long long unsigned int format, resource_size_t arg (arg 4)

We don't know what type the architecture uses to implement u64, hence they
cannot be printed.

Cc: Anatolij Gustschin <agust@denx.de>
Cc: Dmitry Baryshkov <dbaryshkov@gmail.com>
Cc: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: Matteo Fortini <m.fortini@selcomgroup.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c1ab6cc6

KSYM_SYMBOL_LEN fixes · 9c246247

Hugh Dickins authored Dec 09, 2008

Miles Lane tailing /sys files hit a BUG which Pekka Enberg has tracked
to my 966c8c12 sprint_symbol(): use
less stack exposing a bug in slub's list_locations() -
kallsyms_lookup() writes a 0 to namebuf[KSYM_NAME_LEN-1], but that was
beyond the end of page provided.

The 100 slop which list_locations() allows at end of page looks roughly
enough for all the other stuff it might print after the symbol before
it checks again: break out KSYM_SYMBOL_LEN earlier than before.

Latencytop and ftrace and are using KSYM_NAME_LEN buffers where they
need KSYM_SYMBOL_LEN buffers, and vmallocinfo a 2*KSYM_NAME_LEN buffer
where it wants a KSYM_SYMBOL_LEN buffer: fix those before anyone copies
them.

[akpm@linux-foundation.org: ftrace.h needs module.h]
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc Miles Lane <miles.lane@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9c246247

inotify: fix IN_ONESHOT unmount event watcher · 6ee5a399

Dmitri Monakhov authored Dec 09, 2008

On umount two event will be dispatched to watcher:

1: inotify_dev_queue_event(.., IN_UNMOUNT,..)
2: remove_watch(watch, dev)
    ->inotify_dev_queue_event(.., IN_IGNORED, ..)

But if watcher has IN_ONESHOT bit set then the watcher will be released
inside first event.  Which result in accessing invalid object later.  IMHO
it is not pure regression.  This bug wasn't triggered while initial
inotify interface testing phase because of another bug in IN_ONESHOT
handling logic :)

  commit ac74c00e
  Author: Ulisses Furquim <ulissesf@gmail.com>
  Date:   Fri Feb 8 04:18:16 2008 -0800
    inotify: fix check for one-shot watches before destroying them
    As the IN_ONESHOT bit is never set when an event is sent we must check it
    in the watch's mask and not in the event's mask.

TESTCASE:
mkdir mnt
mount -ttmpfs none mnt
mkdir mnt/d
./inotify mnt/d&
umount mnt ## << lockup or crash here

TESTSOURCE:
/* gcc -oinotify inotify.c */
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>

int main(int argc, char **argv)
{
        char buf[1024];
        struct inotify_event *ie;
        char *p;
        int i;
        ssize_t l;

        p = argv[1];
        i = inotify_init();
        inotify_add_watch(i, p, ~0);

        l = read(i, buf, sizeof(buf));
        printf("read %d bytes\n", l);
        ie = (struct inotify_event *) buf;
        printf("event mask: %d\n", ie->mask);
	return 0;
}
Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Robert Love <rlove@google.com>
Cc: Ulisses Furquim <ulissesf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6ee5a399

atomic: fix a typo in atomic_long_xchg() · aa6f1479

Eric Dumazet authored Dec 09, 2008

atomic_long_xchg() is not correctly defined for 32bit arches.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

aa6f1479

mm: no get_user/put_user while holding mmap_sem in do_pages_stat? · 80bba129

Brice Goglin authored Dec 09, 2008

Since commit 2f007e74, do_pages_stat()
gets the page address from user-space and puts the corresponding status
back while holding the mmap_sem for read.  There is no need to hold
mmap_sem there while some page-faults may occur.

This patch adds a temporary address and status buffer so as to only
hold mmap_sem while working on these kernel buffers.  This is
implemented by extracting do_pages_stat_array() out of do_pages_stat().
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

80bba129

drivers/serial/s3c2440.c: fix typo in MODULE_LICENSE · 52b9582d

Balaji Rao authored Dec 09, 2008

Signed-off-by: Balaji Rao <balajirrao@gmail.com>
Acked-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

52b9582d