Commits · af37501c792107c2bde1524bdae38d9a247b841a · linux / linux-davinci

18 Jan, 2009 14 commits

Merge branch 'core/percpu' into perfcounters/core · af37501c

Ingo Molnar authored Jan 18, 2009

Conflicts:
	arch/x86/include/asm/pda.h

We merge tip/core/percpu into tip/perfcounters/core because of a
semantic and contextual conflict: the former eliminates the PDA,
while the latter extends it with apic_perf_irqs field.

Resolve the conflict by moving the new field to the irq_cpustat
structure on 64-bit too.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

af37501c

Merge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu · 99937d64
Ingo Molnar authored Jan 18, 2009

99937d64

x86-64: Use absolute displacements for per-cpu accesses. · 87b26406

Brian Gerst authored Jan 19, 2009

Accessing memory through %gs should not use rip-relative addressing.
Adding a P prefix for the argument tells gcc to not add (%rip) to
the memory references.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

87b26406

x86-64: Move isidle from PDA to per-cpu. · c2558e0e

Brian Gerst authored Jan 19, 2009

tj: s/isidle/is_idle/
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

c2558e0e

x86-64: Move nodenumber from PDA to per-cpu. · e7a22c1e

Brian Gerst authored Jan 19, 2009

tj: * s/nodenumber/node_number/
    * removed now unused pda variable from pda_init()
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

e7a22c1e

x86-64: Move irqcount from PDA to per-cpu. · 56895530

Brian Gerst authored Jan 19, 2009

tj: s/irqcount/irq_count/
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

56895530

x86-64: Move oldrsp from PDA to per-cpu. · 3d1e42a7

Brian Gerst authored Jan 19, 2009

tj: * in asm-offsets_64.c, pda.h inclusion shouldn't be removed as pda
      is still referenced in the file
    * s/oldrsp/old_rsp/
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

3d1e42a7

x86-64: Move kernelstack from PDA to per-cpu. · 9af45651

Brian Gerst authored Jan 19, 2009

Also clean up PER_CPU_VAR usage in xen-asm_64.S

tj: * remove now unused stack_thread_info()
    * s/kernelstack/kernel_stack/
    * added FIXME comment in xen-asm_64.S
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

9af45651

x86-64: Move current task from PDA to per-cpu and consolidate with 32-bit. · c6f5e0ac
Brian Gerst authored Jan 19, 2009
```
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
```
c6f5e0ac

x86-64: Move cpu number from PDA to per-cpu and consolidate with 32-bit. · ea927906

Brian Gerst authored Jan 19, 2009

tj: moved cpu_number definition out of CONFIG_HAVE_SETUP_PER_CPU_AREA
    for voyager.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

ea927906

x86-64: Convert exception stacks to per-cpu · 92d65b23

Brian Gerst authored Jan 19, 2009

Move the exception stacks to per-cpu, removing specific allocation code.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

92d65b23

x86-64: Convert irqstacks to per-cpu · 26f80bd6

Brian Gerst authored Jan 19, 2009

Move the irqstackptr variable from the PDA to per-cpu.  Make the
stacks themselves per-cpu, removing some specific allocation code.
Add a seperate flag (is_boot_cpu) to simplify the per-cpu boot
adjustments.

tj: * sprinkle some underbars around.

    * irq_stack_ptr is not used till traps_init(), no reason to
      initialize it early.  On SMP, just leaving it NULL till proper
      initialization in setup_per_cpu_areas() works.  Dropped
      is_boot_cpu and early irq_stack_ptr initialization.

    * do DECLARE/DEFINE_PER_CPU(char[IRQ_STACK_SIZE], irq_stack)
      instead of (char, irq_stack[IRQ_STACK_SIZE]).
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

26f80bd6

x86-64: Move TLB state from PDA to per-cpu and consolidate with 32-bit. · 9eb912d1
Brian Gerst authored Jan 19, 2009
```
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
```
9eb912d1
x86-64: Move irq stats from PDA to per-cpu and consolidate with 32-bit. · 1b437c8c
Brian Gerst authored Jan 19, 2009
```
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
```
1b437c8c

17 Jan, 2009 3 commits

perf_counter: Add counter enable/disable ioctls · d859e29f

Paul Mackerras authored Jan 17, 2009

Impact: New perf_counter features

This primarily adds a way for perf_counter users to enable and disable
counters and groups. Enabling or disabling a counter or group also
enables or disables all of the child counters that have been cloned
from it to monitor children of the task monitored by the top-level
counter. The userspace interface to enable/disable counters is via
ioctl on the counter file descriptor.

Along the way this extends the code that handles child counters to
handle child counter groups properly. A group with multiple counters
will be cloned to child tasks if and only if the group leader has the
hw_event.inherit bit set - if it is set the whole group is cloned as a
group in the child task.

In order to be able to enable or disable all child counters of a given
top-level counter, we need a way to find them all. Hence I have added
a child_list field to struct perf_counter, which is the head of the
list of children for a top-level counter, or the link in that list for
a child counter. That list is protected by the perf_counter.mutex
field.

This also adds a mutex to the perf_counter_context struct. Previously
the list of counters was protected just by the lock field in the
context, which meant that perf_counter_init_task had to take that lock
and then take whatever lock/mutex protects the top-level counter's
child_list. But the counter enable/disable functions need to take
that lock in order to traverse the list, then for each counter take
the lock in that counter's context in order to change the counter's
state safely, which will lead to a deadlock.

To solve this, we now have both a mutex and a spinlock in the context,
and taking either is sufficient to ensure the list of counters can't
change - you have to take both before changing the list. Now
perf_counter_init_task takes the mutex instead of the lock (which
incidentally means that inherit_counter can use GFP_KERNEL instead of
GFP_ATOMIC) and thus avoids the possible deadlock. Similarly the new
enable/disable functions can take the mutex while traversing the list
of child counters without incurring a possible deadlock when the
counter manipulation code locks the context for a child counter.

We also had an misfeature that the first counter added to a context
would possibly not go on until the next sched-in, because we were
using ctx->nr_active to detect if the context was running on a CPU.
But nr_active is the number of active counters, and if that was zero
(because the context didn't have any counters yet) it would look like
the context wasn't running on a cpu and so the retry code in
__perf_install_in_context wouldn't retry. So this adds an 'is_active'
field that is set when the context is on a CPU, even if it has no
counters. The is_active field is only used for task contexts, not for
per-cpu contexts.

If we enable a subsidiary counter in a group that is active on a CPU,
and the arch code can't enable the counter, then we have to pull the
whole group off the CPU. We do this with group_sched_out, which gets
moved up in the file so it comes before all its callers. This also
adds similar logic to __perf_install_in_context so that the "all on,
or none" invariant of groups is preserved when adding a new counter to
a group.
Signed-off-by: Paul Mackerras <paulus@samba.org>

d859e29f

linker script: add missing .data.percpu.page_aligned · 74e79045

Tejun Heo authored Jan 17, 2009

arm, arm/mach-integrator and powerpc were missing
.data.percpu.page_aligned in their percpu output section definitions.
Add it.
Signed-off-by: Tejun Heo <tj@kernel.org>

74e79045

linker script: add missing VMLINUX_SYMBOL · 145cd30b

Tejun Heo authored Jan 17, 2009

The newly added PERCPU_*() macros define and use __per_cpu_load but
VMLINUX_SYMBOL() was missing from usages causing build failures on
archs where linker visible symbol is different from C symbols
(e.g. blackfin).
Signed-off-by: Tejun Heo <tj@kernel.org>

145cd30b

16 Jan, 2009 15 commits

x86_64: initialize this_cpu_off to __per_cpu_load · cd3adf52

Tejun Heo authored Jan 16, 2009

On x86_64, if get_per_cpu_var() is used before per cpu area is setup
(if lockdep is turned on, it happens), it needs this_cpu_off to point
to __per_cpu_load.  Initialize accordingly.
Signed-off-by: Tejun Heo <tj@kernel.org>

cd3adf52

x86: fix build bug introduced during merge · a338af2c

Tejun Heo authored Jan 16, 2009

EXPORT_PER_CPU_SYMBOL() got misplaced during merge leading to build
failure.  Fix it.
Signed-off-by: Tejun Heo <tj@kernel.org>

a338af2c

percpu: add optimized generic percpu accessors · 6dbde353

Ingo Molnar authored Jan 15, 2009

It is an optimization and a cleanup, and adds the following new
generic percpu methods:

  percpu_read()
  percpu_write()
  percpu_add()
  percpu_sub()
  percpu_and()
  percpu_or()
  percpu_xor()

and implements support for them on x86. (other architectures will fall
back to a default implementation)

The advantage is that for example to read a local percpu variable,
instead of this sequence:

 return __get_cpu_var(var);

 ffffffff8102ca2b:	48 8b 14 fd 80 09 74 	mov    -0x7e8bf680(,%rdi,8),%rdx
 ffffffff8102ca32:	81
 ffffffff8102ca33:	48 c7 c0 d8 59 00 00 	mov    $0x59d8,%rax
 ffffffff8102ca3a:	48 8b 04 10          	mov    (%rax,%rdx,1),%rax

We can get a single instruction by using the optimized variants:

 return percpu_read(var);

 ffffffff8102ca3f:	65 48 8b 05 91 8f fd 	mov    %gs:0x7efd8f91(%rip),%rax

I also cleaned up the x86-specific APIs and made the x86 code use
these new generic percpu primitives.

tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out
    * added percpu_and() for completeness's sake
    * made generic percpu ops atomic against preemption
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Tejun Heo <tj@kernel.org>

6dbde353

x86: misc clean up after the percpu update · 004aa322

Tejun Heo authored Jan 13, 2009

Do the following cleanups:

* kill x86_64_init_pda() which now is equivalent to pda_init()

* use per_cpu_offset() instead of cpu_pda() when initializing
  initial_gs
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

004aa322

x86: convert pda ops to wrappers around x86 percpu accessors · 49357d19

Tejun Heo authored Jan 13, 2009

pda is now a percpu variable and there's no reason it can't use plain
x86 percpu accessors.  Add x86_test_and_clear_bit_percpu() and replace
pda op implementations with wrappers around x86 percpu accessors.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

49357d19

x86: make pda a percpu variable · b12d8db8

Tejun Heo authored Jan 13, 2009

[ Based on original patch from Christoph Lameter and Mike Travis. ]

As pda is now allocated in percpu area, it can easily be made a proper
percpu variable.  Make it so by defining per cpu symbol from linker
script and declaring it in C code for SMP and simply defining it for
UP.  This change cleans up code and brings SMP and UP closer a bit.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

b12d8db8

x86: merge 64 and 32 SMP percpu handling · 9939ddaf

Tejun Heo authored Jan 13, 2009

Now that pda is allocated as part of percpu, percpu doesn't need to be
accessed through pda.  Unify x86_64 SMP percpu access with x86_32 SMP
one.  Other than the segment register, operand size and the base of
percpu symbols, they behave identical now.

This patch replaces now unnecessary pda->data_offset with a dummy
field which is necessary to keep stack_canary at its place.  This
patch also moves per_cpu_offset initialization out of init_gdt() into
setup_per_cpu_areas().  Note that this change also necessitates
explicit per_cpu_offset initializations in voyager_smp.c.

With this change, x86_OP_percpu()'s are as efficient on x86_64 as on
x86_32 and also x86_64 can use assembly PER_CPU macros.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

9939ddaf

x86: fold pda into percpu area on SMP · 1a51e3a0

Tejun Heo authored Jan 13, 2009

[ Based on original patch from Christoph Lameter and Mike Travis. ]

Currently pdas and percpu areas are allocated separately.  %gs points
to local pda and percpu area can be reached using pda->data_offset.
This patch folds pda into percpu area.

Due to strange gcc requirement, pda needs to be at the beginning of
the percpu area so that pda->stack_canary is at %gs:40.  To achieve
this, a new percpu output section macro - PERCPU_VADDR_PREALLOC() - is
added and used to reserve pda sized chunk at the start of the percpu
area.

After this change, for boot cpu, %gs first points to pda in the
data.init area and later during setup_per_cpu_areas() gets updated to
point to the actual pda.  This means that setup_per_cpu_areas() need
to reload %gs for CPU0 while clearing pda area for other cpus as cpu0
already has modified it when control reaches setup_per_cpu_areas().

This patch also removes now unnecessary get_local_pda() and its call
sites.

A lot of this patch is taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

1a51e3a0

x86: use static _cpu_pda array · c8f3329a

Tejun Heo authored Jan 13, 2009

_cpu_pda array first uses statically allocated storage in data.init
and then switches to allocated bootmem to conserve space.  However,
after folding pda area into percpu area, _cpu_pda array will be
removed completely.  Drop the reallocation part to simplify the code
for soon-to-follow changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c8f3329a

x86: load pointer to pda into %gs while brining up a CPU · f32ff538

Tejun Heo authored Jan 13, 2009

[ Based on original patch from Christoph Lameter and Mike Travis. ]

CPU startup code in head_64.S loaded address of a zero page into %gs
for temporary use till pda is loaded but address to the actual pda is
available at the point.  Load the real address directly instead.

This will help unifying percpu and pda handling later on.

This patch is mostly taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.
Signed-off-by: Tejun Heo <tj@kernel.org>

f32ff538

x86: make percpu symbols zerobased on SMP · 3e5d8f97

Tejun Heo authored Jan 13, 2009

[ Based on original patch from Christoph Lameter and Mike Travis. ]

This patch makes percpu symbols zerobased on x86_64 SMP by adding
PERCPU_VADDR() to vmlinux.lds.h which helps setting explicit vaddr on
the percpu output section and using it in vmlinux_64.lds.S.  A new
PHDR is added as existing ones cannot contain sections near address
zero.  PERCPU_VADDR() also adds a new symbol __per_cpu_load which
always points to the vaddr of the loaded percpu data.init region.

The following adjustments have been made to accomodate the address
change.

* code to locate percpu gdt_page in head_64.S is updated to add the
  load address to the gdt_page offset.

* __per_cpu_load is used in places where access to the init data area
  is necessary.

* pda->data_offset is initialized soon after C code is entered as zero
  value doesn't work anymore.

This patch is mostly taken from Mike Travis' "x86_64: Base percpu
variables at zero" patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3e5d8f97

x86: make vmlinux_32.lds.S use PERCPU() macro · a698c823

Tejun Heo authored Jan 13, 2009

Make vmlinux_32.lds.S use the generic PERCPU() macro instead of open
coding it.  This will ease future changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

a698c823

x86: cleanup early setup_percpu references · c90aa894

Mike Travis authored Jan 13, 2009

[ Based on original patch from Christoph Lameter and Mike Travis. ]

  * Ruggedize some calls in setup_percpu.c to prevent mishaps
    in early calls, particularly for non-critical functions.

  * Cleanup DEBUG_PER_CPU_MAPS usages and some comments.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c90aa894

x86: make early_per_cpu() a lvalue and use it · f10fcd47

Tejun Heo authored Jan 13, 2009

Make early_per_cpu() a lvalue as per_cpu() is and use it where
applicable.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

f10fcd47

x86: fix pda_to_op() · 7de6883f

Tejun Heo authored Jan 13, 2009

There's no instruction to move a 64bit immediate into memory location.
Drop "i".
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

7de6883f

15 Jan, 2009 5 commits

Merge branches 'cpus4096', 'x86/cleanups' and 'x86/urgent' into x86/percpu · 7f268f43
Ingo Molnar authored Jan 15, 2009

7f268f43
x86: fix broken flush_tlb_others_ipi(), fix · 54da5b3d
Ingo Molnar authored Jan 15, 2009
```
Impact: cleanup

Use the proper type.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
```
54da5b3d

x86: avoid early crash in disable_local_APIC() · a08c4743

Jan Beulich authored Jan 14, 2009

E.g. when called due to an early panic.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

a08c4743

x86: fully honor "nolapic" · f1182638

Jan Beulich authored Jan 14, 2009

Impact: widen the effect of the 'nolapic' boot parameter

"nolapic" should not only suppress SMP and use of the LAPIC, but it
also ought to have the effect of disabling all IO-APIC related activity
as well as PCI MSI and HT-IRQs.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

f1182638

irq: update all arches for new irq_desc, fix · d2287f5e

Mike Travis authored Jan 14, 2009

Impact: fix build errors

Since the SPARSE IRQS changes redefined how the kstat irqs are
organized, arch's must use the new accessor function:

	kstat_incr_irqs_this_cpu(irq, DESC);

If CONFIG_SPARSE_IRQS is set, then DESC is a pointer to the
irq_desc which has a pointer to the kstat_irqs.  If not, then
the .irqs field of struct kernel_stat is used instead.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

d2287f5e

14 Jan, 2009 3 commits

x86, pat: fix reserve_memtype() for legacy 1MB range · 5cca0cf1

Suresh Siddha authored Jan 09, 2009

Thierry Vignaud reported:
> http://bugzilla.kernel.org/show_bug.cgi?id=12372
>
> On P4 with an SiS motherboard (video card is a SiS 651)
> X server fails to start with error:
> xf86MapVidMem: Could not mmap framebuffer (0x00000000,0x2000) (Invalid
> argument)

Here X is trying to map first 8KB of memory using /dev/mem. Existing
code treats first 0-4KB of memory as non-RAM and 4KB-8KB as RAM. Recent
code changes don't allow to map memory with different attributes
at the same time.

Fix this by treating the first 1MB legacy region as special and always
track the attribute requests with in this region using linear linked
list (and don't bother if the range is RAM or non-RAM or mixed)
Reported-and-tested-by: Thierry Vignaud <tvignaud@mandriva.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

5cca0cf1

x86: make 32bit MAX_HARDIRQS_PER_CPU to be NR_VECTORS · b6659679

Yinghai Lu authored Jan 12, 2009

Impact: clean up to be same as 64bit

32-bit is using per-cpu vector too, so don't use default NR_IRQS.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

b6659679

Merge branch 'master' of... · e46d5178

Ingo Molnar authored Jan 14, 2009

Merge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/travis/linux-2.6-cpus4096-for-ingo into cpus4096

e46d5178