Commits · 23da64b4714812b66ecf010e7dfb3ed1bf2eda69 · linux / linux-davinci

15 Apr, 2009 35 commits

Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block · 23da64b4

Linus Torvalds authored Apr 15, 2009

* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (28 commits)
  cfq-iosched: add close cooperator code
  cfq-iosched: log responsible 'cfqq' in idle timer arm
  cfq-iosched: tweak kick logic a bit more
  cfq-iosched: no need to save interrupts in cfq_kick_queue()
  brd: fix cacheflushing
  brd: support barriers
  swap: Remove code handling bio_alloc failure with __GFP_WAIT
  gfs2: Remove code handling bio_alloc failure with __GFP_WAIT
  ext4: Remove code handling bio_alloc failure with __GFP_WAIT
  dio: Remove code handling bio_alloc failure with __GFP_WAIT
  block: Remove code handling bio_alloc failure with __GFP_WAIT
  bio: add documentation to bio_alloc()
  splice: add helpers for locking pipe inode
  splice: remove generic_file_splice_write_nolock()
  ocfs2: fix i_mutex locking in ocfs2_splice_to_file()
  splice: fix i_mutex locking in generic_splice_write()
  splice: remove i_mutex locking in splice_from_pipe()
  splice: split up __splice_from_pipe()
  block: fix SG_IO to return a proper error value
  cfq-iosched: don't delay queue kick for a merged request
  ...

23da64b4

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc · a23c218b

Linus Torvalds authored Apr 15, 2009

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  powerpc: pseries/dtl.c should include asm/firmware.h
  powerpc: Fix data-corrupting bug in __futex_atomic_op
  powerpc/pseries: Set error_state to pci_channel_io_normal in eeh_report_reset()
  powerpc: Allow 256kB pages with SHMEM
  powerpc: Document new FSL I2C bindings and cleanup
  powerpc/mm: Fix compile warning
  powerpc/85xx: TQM8548: update defconfig
  powerpc/85xx: TQM8548: use proper phy-handles for enet2 and enet3
  powerpc/85xx: TQM85xx: correct address of LM75 I2C device nodes
  powerpc: Add support for early tlbilx opcode
  powerpc: Fix tlbilx opcode

a23c218b

acpi-cpufreq: fix 'smp_call_function_many()' confusion · ea34f43a

Linus Torvalds authored Apr 15, 2009

It turns out that 'smp_call_function_many()' doesn't work at all like
'smp_call_function_single()', and my change to Andrew's patch to use it
rather than a loop over all CPU's acpi-cpufreq doesn't work.

My bad.

'smp_call_function_many()' has two "features" (aka "documented bugs"):

 (a) it needs to be called with preemption disabled, because it uses
     smp_processor_id() without guarding the CPU lookup with 'get_cpu()'
     and 'put_cpu()' like the 'single' variant does.

 (b) even if the current CPU is part of the CPU mask, it won't do the
     call on that CPU.

Still, we're better off trying to use 'smp_call_function_many()' than
looping over CPU's, since it at least in theory allows us to use a
broadcast IPI and do it all in parallel.  So let's just work around the
silly semantic bugs in that function.
Reported-and-tested-by: Ali Gholami Rudi <ali@rudi.ir>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ea34f43a

cfq-iosched: add close cooperator code · a36e71f9

Jens Axboe authored Apr 15, 2009

If we have processes that are working in close proximity to each
other on disk, we don't want to idle wait. Instead allow the close
process to issue a request, getting better aggregate bandwidth.
The anticipatory scheduler has similar checks, noop and deadline do
not need it since they don't care about process <-> io mappings.

The code for CFQ is a little more involved though, since we split
request queues into per-process contexts.

This fixes a performance problem with eg dump(8), since it uses
several processes in some silly attempt to speed IO up. Even if
dump(8) isn't really a valid case (it should be fixed by using
CLONE_IO), there are other cases where we see close processes
and where idling ends up hurting performance.

Credit goes to Jeff Moyer <jmoyer@redhat.com> for writing the
initial implementation.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

a36e71f9

cfq-iosched: log responsible 'cfqq' in idle timer arm · 9481ffdc
Jens Axboe authored Apr 15, 2009
```
Makes it easier to read the traces.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
```
9481ffdc

cfq-iosched: tweak kick logic a bit more · 2d870722

Jens Axboe authored Apr 15, 2009

We only kick the dispatch for an idling queue, if we think it's a
(somewhat) fully merged request. Also allow a kick if we have other
busy queues in the system, since we don't want to risk waiting for
a potential merge in that case. It's better to get some work done and
proceed.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

2d870722

cfq-iosched: no need to save interrupts in cfq_kick_queue() · 40bb54d1

Jens Axboe authored Apr 15, 2009

It's called from the workqueue handlers from process context, so
we always have irqs enabled when entered.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

40bb54d1

brd: fix cacheflushing · c2572f2b

Nick Piggin authored Apr 15, 2009

brd is missing a flush_dcache_page. On 2nd thoughts, perhaps it is the
pagecache's responsibility to flush user virtual aliases (the driver of
course should flush kernel virtual mappings)... but anyway, there
already exists cache flushing for one direction of transfer, so we
should add the other.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

c2572f2b

brd: support barriers · dfbc4752

Nick Piggin authored Apr 15, 2009

brd is always ordered (not that it matters, as it is defined not to
survive when the system goes down). So tell the block layer it is
ordered, which might be of help with testing filesystems.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

dfbc4752

swap: Remove code handling bio_alloc failure with __GFP_WAIT · 297dbf50

Nikanth Karthikesan authored Apr 15, 2009

Remove code handling bio_alloc failure with __GFP_WAIT.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

297dbf50

gfs2: Remove code handling bio_alloc failure with __GFP_WAIT · b1fffc9c

Nikanth Karthikesan authored Apr 15, 2009

Remove code handling bio_alloc failure with __GFP_WAIT.
GFP_NOFS implies __GFP_WAIT.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

b1fffc9c

ext4: Remove code handling bio_alloc failure with __GFP_WAIT · 226e7dab

Nikanth Karthikesan authored Apr 15, 2009

Remove code handling bio_alloc failure with __GFP_WAIT.
GFP_NOIO implies __GFP_WAIT.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

226e7dab

dio: Remove code handling bio_alloc failure with __GFP_WAIT · 4d1f9fdb

Nikanth Karthikesan authored Apr 15, 2009

Remove code handling bio_alloc failure with __GFP_WAIT.
GFP_KERNEL implies __GFP_WAIT.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

4d1f9fdb

block: Remove code handling bio_alloc failure with __GFP_WAIT · 15afd1cc

Nikanth Karthikesan authored Apr 15, 2009

Remove code handling bio_alloc failure with __GFP_WAIT.
GFP_KERNEL implies __GFP_WAIT.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

15afd1cc

bio: add documentation to bio_alloc() · 86c824b9

Jens Axboe authored Apr 15, 2009

Explain that with __GFP_WAIT set it will not fail, and that the caller
must never allocate more than 1 bio at the time.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

86c824b9

splice: add helpers for locking pipe inode · 61e0d47c

Miklos Szeredi authored Apr 14, 2009

There are lots of sequences like this, especially in splice code:

	if (pipe->inode)
		mutex_lock(&pipe->inode->i_mutex);
	/* do something */
	if (pipe->inode)
		mutex_unlock(&pipe->inode->i_mutex);

so introduce helpers which do the conditional locking and unlocking.
Also replace the inode_double_lock() call with a pipe_double_lock()
helper to avoid spreading the use of this functionality beyond the
pipe code.

This patch is just a cleanup, and should cause no behavioral changes.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

61e0d47c

splice: remove generic_file_splice_write_nolock() · f8cc774c

Miklos Szeredi authored Apr 14, 2009

Remove the now unused generic_file_splice_write_nolock() function.
It's conceptually broken anyway, because splice may need to wait for
pipe events so holding locks across the whole operation is wrong.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

f8cc774c

ocfs2: fix i_mutex locking in ocfs2_splice_to_file() · 328eaaba

Miklos Szeredi authored Apr 14, 2009

Rearrange locking of i_mutex on destination and call to
ocfs2_rw_lock() so locks are only held while buffers are copied with
the pipe_to_file() actor, and not while waiting for more data on the
pipe.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

328eaaba

splice: fix i_mutex locking in generic_splice_write() · eb443e5a

Miklos Szeredi authored Apr 14, 2009

Rearrange locking of i_mutex on destination so it's only held while
buffers are copied with the pipe_to_file() actor, and not while
waiting for more data on the pipe.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

eb443e5a

splice: remove i_mutex locking in splice_from_pipe() · 2933970b

Miklos Szeredi authored Apr 14, 2009

splice_from_pipe() is only called from two places:

  - generic_splice_sendpage()
  - splice_write_null()

Neither of these require i_mutex to be taken on the destination inode.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

2933970b

splice: split up __splice_from_pipe() · b3c2d2dd

Miklos Szeredi authored Apr 14, 2009

Split up __splice_from_pipe() into four helper functions:

  splice_from_pipe_begin()
  splice_from_pipe_next()
  splice_from_pipe_feed()
  splice_from_pipe_end()

splice_from_pipe_next() will wait (if necessary) for more buffers to
be added to the pipe.  splice_from_pipe_feed() will feed the buffers
to the supplied actor and return when there's no more data available
(or if all of the requested data has been copied).

This is necessary so that implementations can do locking around the
non-waiting splice_from_pipe_feed().

This patch should not cause any change in behavior.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

b3c2d2dd

block: fix SG_IO to return a proper error value · 91e463c8

FUJITA Tomonori authored Apr 13, 2009

blk_rq_unmap_user() returns -EFAULT if a program passes an invalid
address to kernel. SG_IO path needs to pass the returned value to user
space instead of ignoring it.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

91e463c8

cfq-iosched: don't delay queue kick for a merged request · d6ceb25e

Jens Axboe authored Apr 14, 2009

"Zhang, Yanmin" <yanmin_zhang@linux.intel.com> reports that commit
b029195d introduced a regression
of about 50% with sequential threaded read workloads. The test
case is:

tiotest -k0 -k1 -k3 -f 80 -t 32

which starts 32 threads each reading a 80MB file. Twiddle the kick
queue logic so that we do start IO immediately, if it appears to be
a fully merged request. We can't really detect that, so just check
if the request is bigger than a page or not. The assumption is that
since single bio issues will first queue a single request with just
one page attached and then later do merges on that, if we already
have more than a page worth of data in the request, then the request
is most likely good to go.

Verified that this doesn't cause a regression with the test case that
commit b029195d was fixing. It does not,
we still see maximum sized requests for the queue-then-merge cases.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d6ceb25e

buffer: switch do_emergency_thaw() away from pdflush_operation() · 053c525f

Jens Axboe authored Apr 08, 2009

This is (again) a preparatory patch similar to commit
a2a9537a. It open codes a simple
async way of executing do_thaw_all() out of context, so we can get
rid of pdflush.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

053c525f

block: update biodoc.txt on plugging · 329007ce

Jens Axboe authored Apr 08, 2009

We do per-device plugging, get rid of any references to tq_disk as that
has been dead since 2.6.5 or so.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

329007ce

as-iosched: get rid of private REQ_SYNC/REQ_ASYNC defines · 1d6bfbdf

Jens Axboe authored Apr 08, 2009

We can just use the block layer BLK_RW_SYNC/ASYNC defines now.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

1d6bfbdf

cfq-iosched: get rid of private SYNC/ASYNC defines · ff6657c6

Jens Axboe authored Apr 08, 2009

We can just use the block layer BLK_RW_SYNC/ASYNC defines now.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

ff6657c6

cfq-iosched: use rw_is_sync() to see if rw flags are sync or not · b0b78f81
Jens Axboe authored Apr 08, 2009
```
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
```
b0b78f81

Document and move the various READ/WRITE types · 48e70bc1

Jens Axboe authored Apr 14, 2009

It's a somewhat twisty maze of hints and behavioural modifiers, try
and clear it up a bit with some documentation.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

48e70bc1

block: fix bad spelling of quiesce · f600abe2

Jens Axboe authored Apr 08, 2009

Credit goes to Andrew Morton for spotting this one.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

f600abe2

block: move bio list helpers into bio.h · 8f3d8ba2

Christoph Hellwig authored Apr 07, 2009

It's used by DM and MD and generally useful, so move the bio list
helpers into bio.h.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

8f3d8ba2

powerpc: pseries/dtl.c should include asm/firmware.h · b71a0c29

Sachin Sant authored Apr 14, 2009

A randconfig build on powerpc failed with:

dtl.c: In function 'dtl_init':
dtl.c:238: error: implicit declaration of function 'firmware_has_feature'
dtl.c:238: error: 'FW_FEATURE_SPLPAR' undeclared (first use in this function)

- We need firmware.h for these definitions.
Signed-off-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>

b71a0c29

powerpc: Fix data-corrupting bug in __futex_atomic_op · 306a8288

Paul Mackerras authored Apr 13, 2009

Richard Henderson pointed out that the powerpc __futex_atomic_op has a
bug: it will write the wrong value if the stwcx. fails and it has to
retry the lwarx/stwcx. loop, since 'oparg' will have been overwritten
by the result from the first time around the loop.  This happens
because it uses the same register for 'oparg' (an input) as it uses
for the result.

This fixes it by using separate registers for 'oparg' and 'ret'.

Cc: stable@kernel.org
Signed-off-by: Paul Mackerras <paulus@samba.org>

306a8288

powerpc/pseries: Set error_state to pci_channel_io_normal in eeh_report_reset() · c58dc575

Mike Mason authored Apr 10, 2009

While adding native EEH support to Emulex and Qlogic drivers, it was
discovered that dev->error_state was set to pci_io_channel_normal too
late in the recovery process. These drivers rely on error_state to
determine if they can access the device in their slot_reset callback,
thus error_state needs to be set to pci_io_channel_normal in
eeh_report_reset(). Below is a detailed explanation (courtesy of Richard
Lary) as to why this is necessary.

Background:
PCI MMIO or DMA accesses to a frozen slot generate additional EEH
errors. If the number of additional EEH errors exceeds EEH_MAX_FAILS the
adapter will be shutdown. To avoid triggering excessive EEH errors and
an undesirable adapter shutdown, some drivers use the
pci_channel_offline(dev) wrapper function to return a Boolean value
based on the value of pci_dev->error_state to determine if PCI MMIO or
DMA accesses are safe. If the wrapper returns TRUE, drivers must not
make PCI MMIO or DMA access to their hardware.

The pci_dev structure member error_state reflects one of three values,
1) pci_channel_io_normal, 2) pci_channel_io_frozen, 3)
pci_channel_io_perm_failure.  Function pci_channel_offline(dev) returns
TRUE if error_state is pci_channel_io_frozen or pci_channel_io_perm_failure.

The EEH driver sets pci_dev->error_state to pci_channel_io_frozen at the
point where the PCI slot is frozen. Currently, the EEH driver restores
dev->error_state to pci_channel_io_normal in eeh_report_resume() before
calling the driver's resume callback. However, when the EEH driver calls
the driver's slot_reset callback() from eeh_report_reset(), it
incorrectly indicates the error state is still pci_channel_io_frozen.

Waiting until eeh_report_resume() to restore dev->error_state to
pci_channel_io_normal is too late for Emulex and QLogic FC drivers and
any other drivers which are designed to use common code paths in these
two cases: i) those called after the driver's slot_reset callback() and
ii) those called after the PCI slot is frozen but before the driver's
slot_reset callback is called. Case i) all driver paths executed to
reinitialize the hardware after a reset and case ii) all code paths
executed by driver kernel threads that run asynchronous to the main
driver thread, such as interrupt handlers and worker threads to process
driver work queues.

Emulex and QLogic FC drivers are designed with common code paths which
require that pci_channel_offline(dev) reflect the true state of the
hardware. The state transitions that the hardware takes from Normal
Operations to Slot Frozen to Reset to Normal Operations are documented
in the Power Architecture™ Platform Requirements+ (PAPR+) in Table 75.
PE State Control.

PAPR defines the following 3 states:

0 -- Not reset, Not EEH stopped, MMIO load/store allowed, DMA allowed
     (Normal Operations)
1 -- Reset, Not EEH stopped, MMIO load/store disabled, DMA disabled
2 -- Not reset, EEH stopped, MMIO load/store disabled, DMA disabled
     (Slot Frozen)

An EEH error places the slot in state 2 (Frozen) and the adapter driver
is notified that an EEH error was detected. If the adapter driver
returns PCI_ERS_RESULT_NEED_RESET, the EEH driver calls
eeh_reset_device() to place the slot into state 1 (Reset) and
eeh_reset_device completes by placing the slot into State 0 (Normal
Operations). Upon return from eeh_reset_device(), the EEH driver calls
eeh_report_reset, which then calls the adapter's slot_reset callback. At
the time the adapter's slot_reset callback is called, the true state of
the hardware is Normal Operations and should be accurately reflected by
setting dev->error_state to pci_channel_io_normal.

The current implementation of EEH driver does not do so and requires
this change to correct this deficiency.
Signed-off-by: Mike Mason <mmlnx@us.ibm.com>
Acked-by: Linas Vepstas <linasvepstas@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>

c58dc575

powerpc: Allow 256kB pages with SHMEM · adf213c4

Hugh Dickins authored Apr 06, 2009

Now that shmem's divisions by zero and SHMEM_MAX_BYTES are fixed,
let powerpc 256kB pages coexist with CONFIG_SHMEM again.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>

adf213c4

14 Apr, 2009 5 commits

Linux 2.6.30-rc2 · 0882e8dd
Linus Torvalds authored Apr 14, 2009

0882e8dd

Merge branch 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel · b897e6fb

Linus Torvalds authored Apr 14, 2009

* 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
  drm/i915: fix scheduling while holding the new active list spinlock
  drm/i915: Allow tiling of objects with bit 17 swizzling by the CPU.
  drm/i915: Correctly set the write flag for get_user_pages in pread.
  drm/i915: Fix use of uninitialized var in 40a5f0de
  drm/i915: indicate framebuffer restore key in SysRq help message
  drm/i915: sync hdmi detection by hdmi identifier with 2D
  drm/i915: Fix a mismerge of the IGD patch (new .find_pll hooks missed)
  drm/i915: Implement batch and ring buffer dumping

b897e6fb

x86 microcode: revert some work_on_cpu · 6f66cbc6

Hugh Dickins authored Apr 14, 2009

Revert part of af5c820a ("x86: cpumask:
use work_on_cpu in arch/x86/kernel/microcode_core.c")

That change is causing only one Intel CPU's microcode to be updated e.g.
microcode: CPU3 updated from revision 0x9 to 0x17, date = 2005-04-22
where before it announced that also for CPU0 and CPU1 and CPU2.

We cannot use work_on_cpu() in the CONFIG_MICROCODE_OLD_INTERFACE code,
because Intel's request_microcode_user() involves a copy_from_user() from
/sbin/microcode_ctl, which therefore needs to be on that CPU at the time.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6f66cbc6

drm/i915: fix scheduling while holding the new active list spinlock · 68c84342

Shaohua Li authored Apr 08, 2009

regression caused by commit 5e118f41:
i915_gem_object_move_to_inactive() should be called in task context,
as it calls fput();

Signed-off-by: Shaohua Li<shaohua.li@intel.com>
[anholt: Add more detail to the comment about the lock break that's added]
Signed-off-by: Eric Anholt <eric@anholt.net>

68c84342

Merge branch 'core-fixes-for-linus' of... · 610f26e7

Linus Torvalds authored Apr 14, 2009

Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  lockdep: warn about lockdep disabling after kernel taint, fix

610f26e7