Commits · 98517d8f092cceea77ab93f15437c182b5615d9e · linux / linux-davinci

04 Aug, 2009 6 commits

kgdb: continue and warn on signal passing from gdb · 98517d8f

Jason Wessel authored Apr 27, 2009

On some architectures for the segv trap, gdb wants to pass the signal
back on continue.  For kgdb this is not the default behavior, because
it can cause the kernel to crash if you arbitrarily pass back an
exception outside of kgdb.

Instead of causing instability, pass a message back to gdb about the
supported kgdb signal passing and execute a standard kgdb continue
operation.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>

98517d8f

kgdb: allow for cpu switch when single stepping · 96195aaf

Jason Wessel authored Aug 04, 2009

The kgdb core should not assume that a single step operation of a
kernel thread will complete on the same CPU.  The single step flag is
set at the "thread" level and it is possible in a multi cpu system
that a kernel thread can get scheduled on another cpu the next time it
runs.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>

96195aaf

kgdb,i386: Fix corner case access to sp with NMI watch dog exception · 1cde8881

Jason Wessel authored May 15, 2009

It is possible for the user_mode_vm(regs) check to return true for a
non master kgdb cpu or when the master kgdb cpu handles the NMI watch
dog exception.

The solution is simply to select the correct stack pointer location
based on the check to user_mode_vm(regs).
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>

1cde8881

softlockup: add sched_clock_tick() to avoid kernel warning on kgdb resume · b460e199

Jason Wessel authored Aug 04, 2009

When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is set sched_clock() gets the
time from hardware, such as from TSC.  In this configuration kgdb will
report a softlockup warning message on resuming or detaching from a
debug session.

Sequence of events in the problem case:

1) "cpu sched clock" and "hardware time" are at 100 sec prior
   to a call to kgdb_handle_exception()

2) Debugger waits in kgdb_handle_exception() for 80 sec and on exit
   the following is called ...  touch_softlockup_watchdog() -->
   __raw_get_cpu_var(touch_timestamp) = 0;

3) "cpu sched clock" = 100s (it was not updated, because the interrupt
   was disabled in kgdb) but the "hardware time" = 180 sec

4) The first timer interrupt after resuming from kgdb_handle_exception
   updates the watchdog from the "cpu sched clock"

update_process_times() { ...  run_local_timers() --> softlockup_tick()
--> check (touch_timestamp == 0) (it is "YES" here, we have set
"touch_timestamp = 0" at kgdb) --> __touch_softlockup_watchdog()
***(A)--> reset "touch_timestamp" to "get_timestamp()" (Here, the
"touch_timestamp" will still be set to 100s.)  ...

    scheduler_tick() ***(B)--> sched_clock_tick() (update "cpu sched
    clock" to "hardware time" = 180s) ...  }

5) The second timer interrupt handler appears to have a large jump and
   trips the softlockup warning.

update_process_times() { ...  run_local_timers() --> softlockup_tick()
--> "cpu sched clock" - "touch_timestamp" = 180s-100s > 60s --> printk
"soft lockup error messages" ...  }

note: ***(A) reset "touch_timestamp" to "get_timestamp(this_cpu)"

Why is "touch_timestamp" 100 sec instead of 180 sec?

With CONFIG_HAVE_UNSTABLE_SCHED_CLOCK set, the call trace of
get_timestamp() is:

get_timestamp(this_cpu) -->cpu_clock(this_cpu)
-->sched_clock_cpu(this_cpu) -->__update_sched_clock(sched_clock_data,
now)

The __update_sched_clock() function uses the GTOD tick value to create
a window to normalize the "now" values.  So if the "now" value is too
big for sched_clock_data, it will be ignored.

The fix is to invoke sched_clock_tick() to update "cpu sched clock" in
order to recover from this state.  This is done by introducing the
function touch_softlockup_watchdog_sync(), which allows kgdb to
request that the sched clock is updated when the watchdog thread runs
the first time after a resume from kgdb.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Dongdong Deng <Dongdong.Deng@windriver.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: peterz@infradead.org

b460e199
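
The sequence of events in the commit message above can be replayed with a small userspace toy model. This is only a sketch under stated assumptions: the struct, the function names and the 60 second threshold constant are hypothetical stand-ins, and the clamping window in __update_sched_clock() is simplified to "the cpu sched clock does not advance until sched_clock_tick() runs".

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: all names here are hypothetical, not the kernel's. */
struct toy_cpu {
	uint64_t hw_time;         /* "hardware time", in seconds */
	uint64_t sched_clock;     /* "cpu sched clock", in seconds */
	uint64_t touch_timestamp; /* 0 means "reset on the next tick" */
	bool     sync_requested;  /* set by the _sync style resume */
};

#define TOY_SOFTLOCKUP_THRESH 60 /* assumed threshold, in seconds */

/* One timer tick; returns true if a soft lockup warning would fire. */
static bool toy_timer_tick(struct toy_cpu *c)
{
	bool warn = false;

	/* Models softlockup_tick(). */
	if (c->touch_timestamp == 0) {
		if (c->sync_requested) {
			/* The fix: refresh the sched clock first. */
			c->sync_requested = false;
			c->sched_clock = c->hw_time;
		}
		c->touch_timestamp = c->sched_clock;  /* step (A) */
	} else if (c->sched_clock - c->touch_timestamp >
		   TOY_SOFTLOCKUP_THRESH) {
		warn = true;
	}

	/* Models scheduler_tick() --> sched_clock_tick(), step (B). */
	c->sched_clock = c->hw_time;
	return warn;
}

/* The debugger held the CPU for `stopped` seconds, interrupts off. */
static void toy_kgdb_resume(struct toy_cpu *c, uint64_t stopped, bool sync)
{
	c->hw_time += stopped;
	c->touch_timestamp = 0;
	c->sync_requested = sync;
}
```

Starting from all three values at 100 and resuming after 80 seconds without the sync request, the first tick is quiet and the second one reports a lockup. With the sync request set, neither tick does.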

kgdb: Read buffer overflow · 0529da0e

Jason Wessel authored Aug 04, 2009

Roel Kluin reported an error found with Parfait: we want to ensure
that kgdb_info[-1] never gets accessed.

Also check to ensure any negative tid does not exceed the size of the
shadow CPU array, else report critical debug context because it is an
internal kgdb failure.
Reported-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>

0529da0e
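
A minimal sketch of the kind of range check described above, in userspace C. The tid encoding (shadow thread for CPU n is -(n) - 2), the CPU count and the function name are assumptions made for illustration and are not taken from the patch itself.

```c
#define TOY_NR_CPUS 4 /* assumed size of the per-CPU shadow array */

/*
 * Map a negative "shadow" thread id to a CPU index.
 * Hypothetical encoding: CPU n is reported as tid -(n) - 2.
 * Returns -1 when the tid does not name a valid shadow CPU, so the
 * caller never indexes the array at -1 or past its end.
 */
static int toy_shadow_tid_to_cpu(int tid)
{
	/* 0, -1 and positive ids are not shadow threads. */
	if (tid >= -1)
		return -1;
	/* Reject ids that would land beyond the last array slot. */
	if (tid <= -TOY_NR_CPUS - 2)
		return -1;
	return -tid - 2;
}
```

With four CPUs, tids -2 through -5 map to slots 0 through 3, and everything else is rejected before any array access happens.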

x86 kgdb: remove redundant test · 0b7f56b6

Roel Kluin authored Aug 04, 2009

The for loop starts with a breakno of 0, and ends when it's 4, so
this test is always true.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>

0b7f56b6

03 Aug, 2009 8 commits

Merge branch 'for-linus' of git://neil.brown.name/md · a33a052f

Linus Torvalds authored Aug 02, 2009

* 'for-linus' of git://neil.brown.name/md:
  md: Use revalidate_disk to effect changes in size of device.
  md: allow raid5_quiesce to work properly when reshape is happening.
  md/raid5: set reshape_position correctly when reshape starts.
  md: Handle growth of v1.x metadata correctly.
  md: avoid array overflow with bad v1.x metadata
  md: when a level change reduces the number of devices, remove the excess.
  md: Push down data integrity code to personalities.
  md/raid6: release spare page at ->stop()

a33a052f

md: Use revalidate_disk to effect changes in size of device. · 449aad3e

NeilBrown authored Aug 03, 2009

As revalidate_disk calls check_disk_size_change, it will cause
any capacity change of a gendisk to be propagated to the blockdev
inode.  So use that instead of mucking about with locks and
i_size_write.

Also add a call to revalidate_disk in do_md_run and a few other places
where the gendisk capacity is changed.
Signed-off-by: NeilBrown <neilb@suse.de>

449aad3e

md: allow raid5_quiesce to work properly when reshape is happening. · 64bd660b

NeilBrown authored Aug 03, 2009

The ->quiesce method is not supposed to stop resync/recovery/reshape,
just normal IO.
But in raid5 we don't have a way to know which stripes are being
used for normal IO and which for resync etc, so we need to wait for
all stripes to be idle to be sure that all writes have completed.

However reshape keeps at least some stripe busy for an extended period
of time, so a call to raid5_quiesce can block for several seconds
needlessly.
So arrange for reshape etc to pause briefly while raid5_quiesce is
trying to quiesce the array so that the active_stripes count can
drop to zero.
Signed-off-by: NeilBrown <neilb@suse.de>

64bd660b

md/raid5: set reshape_position correctly when reshape starts. · e516402c

NeilBrown authored Aug 03, 2009

As the internal reshape_progress counter is the main driver
for reshape, the fact that reshape_position sometimes starts with the
wrong value has minimal effect.  It is visible in sysfs and that
is all.
Signed-off-by: NeilBrown <neilb@suse.de>

e516402c

md: Handle growth of v1.x metadata correctly. · 70471daf

NeilBrown authored Aug 03, 2009

The v1.x metadata does not have a fixed size and can grow
when devices are added.
If it grows enough to require an extra sector of storage,
we need to update the 'sb_size' to match.

Without this, md can write out an incomplete superblock with a
bad checksum, which will be rejected when trying to re-assemble
the array.

Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>

70471daf
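
The size arithmetic behind this fix can be sketched as follows. The layout numbers used here (a 256 byte fixed header, 2 bytes per device slot, 512 byte sectors) and the function names are assumptions for illustration only; the commit message does not state them.

```c
#include <stddef.h>

#define TOY_SB_HEADER_BYTES 256u /* assumed fixed part of the superblock */
#define TOY_ROLE_BYTES        2u /* assumed bytes per device slot */
#define TOY_SECTOR_BYTES    512u /* assumed sector size */

/* Bytes needed for a superblock describing `max_dev` device slots. */
static size_t toy_sb_bytes(unsigned int max_dev)
{
	return TOY_SB_HEADER_BYTES + (size_t)max_dev * TOY_ROLE_BYTES;
}

/* Same size, rounded up to whole sectors for writing to disk. */
static size_t toy_sb_sectors(unsigned int max_dev)
{
	return (toy_sb_bytes(max_dev) + TOY_SECTOR_BYTES - 1) /
	       TOY_SECTOR_BYTES;
}
```

Under these assumptions 128 slots still fit in one sector, while the 129th slot pushes the superblock into a second sector. That is the point at which a stale size would leave part of the superblock unwritten.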

md: avoid array overflow with bad v1.x metadata · 3673f305

NeilBrown authored Aug 03, 2009

We trust the 'desc_nr' field in v1.x metadata enough to use it
as an index in an array.  This isn't really safe.
So range-check the value first.
Signed-off-by: NeilBrown <neilb@suse.de>

3673f305
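
A range check of this kind could look like the sketch below. The array, the "invalid" marker value and the function name are hypothetical; only the idea of validating an on-disk index before using it comes from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ROLE_INVALID 0xffffu /* hypothetical "no role" marker */

/*
 * Look up a device role. `desc_nr` comes from on-disk metadata and
 * must be treated as untrusted, so it is compared against the array
 * length before it is used as an index.
 */
static uint16_t toy_role_of(const uint16_t *roles, size_t nroles,
			    uint32_t desc_nr)
{
	if (desc_nr >= nroles)
		return TOY_ROLE_INVALID;
	return roles[desc_nr];
}
```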

md: when a level change reduces the number of devices, remove the excess. · 3a981b03

NeilBrown authored Aug 03, 2009

When an array is changed from RAID6 to RAID5, fewer drives are
needed.  So any device that is made superfluous by the level
conversion must be marked as not-active.
For the RAID6->RAID5 conversion, this will be a drive which only
has 'Q' blocks on it.

Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>

3a981b03

md: Push down data integrity code to personalities. · ac5e7113

Andre Noll authored Aug 03, 2009

This patch replaces md_integrity_check() by two new public functions:
md_integrity_register() and md_integrity_add_rdev() which are both
personality-independent.

md_integrity_register() is called from the ->run and ->hot_remove
methods of all personalities that support data integrity.  The
function iterates over the component devices of the array and
determines if all active devices are integrity capable and if their
profiles match. If this is the case, the common profile is registered
for the mddev via blk_integrity_register().

The second new function, md_integrity_add_rdev() is called from the
->hot_add_disk methods, i.e. whenever a new device is being added
to a raid array. If the new device does not support data integrity,
or has a profile different from the one already registered, data
integrity for the mddev is disabled.

For raid0 and linear, only the call to md_integrity_register() from
the ->run method is necessary.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>

ac5e7113
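
The check that md_integrity_register() is described as performing can be pictured with a toy version. This is a sketch, not the md code: the struct, the use of a string to stand for an integrity profile, and the function name are all assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a component device of an array. */
struct toy_rdev {
	int active;          /* nonzero if the device is in use */
	const char *profile; /* NULL if not integrity capable */
};

/*
 * Return the profile shared by all active devices, or NULL if any
 * active device lacks integrity support or the profiles differ.
 */
static const char *toy_common_profile(const struct toy_rdev *devs, size_t n)
{
	const char *ref = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!devs[i].active)
			continue; /* inactive devices are ignored */
		if (devs[i].profile == NULL)
			return NULL;
		if (ref == NULL)
			ref = devs[i].profile;
		else if (strcmp(ref, devs[i].profile) != 0)
			return NULL;
	}
	return ref;
}
```

A non-NULL result corresponds to the case where the common profile would be registered for the array. A NULL result corresponds to leaving data integrity disabled.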

02 Aug, 2009 21 commits

Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog · 4905f92e

Linus Torvalds authored Aug 02, 2009

* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  [WATCHDOG] Fix COH 901 327 watchdog enablement

4905f92e

Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 · 0ce166b7

Linus Torvalds authored Aug 02, 2009

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  eeepc-laptop: fix hot-unplug on resume
  ACPI: Ingore the memory block with zero block size in course of memory hotplug
  ACPI: Don't treat generic error as ACPI error code in acpi memory hotplug driver
  ACPI: bind workqueues to CPU 0 to avoid SMI corruption
  ACPI: root-only read protection on /sys/firmware/acpi/tables/*
  thinkpad-acpi: fix incorrect use of TPACPI_BRGHT_MODE_ECNVRAM
  thinkpad-acpi: restrict procfs count value to sane upper limit
  thinkpad-acpi: remove dock and bay subdrivers
  thinkpad-acpi: disable broken bay and dock subdrivers
  hp-wmi: check that an input device exists in resume handler
  Revert "ACPICA: Remove obsolete acpi_os_validate_address interface"

0ce166b7

TTY: Maintainer change · 57d7f282

Greg Kroah-Hartman authored Jul 31, 2009

Clearly, I am a glutton for punishment.  I'll see if I can see Alan's
changes through to the end, otherwise I'll be fending off a lot of bug
reports for usb-serial devices.

Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

57d7f282

Make pci_claim_resource() use request_resource() rather than insert_resource() · 79896cf4

Linus Torvalds authored Aug 02, 2009

This function has traditionally used "insert_resource()", because before
commit cebd78a8 ("Fix pci_claim_resource") it used to just insert the
resource into whatever root resource tree that was indicated by
"pcibios_select_root()".

So there Matthew fixed it to actually look up the proper parent
resource, which means that now it's actively wrong to then traverse the
resource tree any more: we already know exactly where the new resource
should go.

And when we then did commit a76117df ("x86: Use pci_claim_resource"),
which changed the x86 PCI code from the open-coded

	pr = pci_find_parent_resource(dev, r);
	if (!pr || request_resource(pr, r) < 0) {

to using

	if (pci_claim_resource(dev, idx) < 0) {

that "insert_resource()" now suddenly became a problem, and causes a
regression covered by

	http://bugzilla.kernel.org/show_bug.cgi?id=13891

which this fixes.
Reported-and-tested-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Andrew Patterson <andrew.patterson@hp.com>
Cc: Linux PCI <linux-pci@vger.kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

79896cf4

[WATCHDOG] Fix COH 901 327 watchdog enablement · 5973bee4

Linus Walleij authored Jul 21, 2009

Since the COH 901 327 found in U300 is clocked at 32 kHz we need
to wait for the interrupt clearing flag to propagate through
hardware in order not to accidentally fire off any interrupts
when we enable them.
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>

5973bee4

Merge branch 'misc-2.6.31' into release · 3be4ee51

Len Brown authored Aug 02, 2009

3be4ee51

Merge branch 'bugzilla-13825' into release · 95452a6c

Len Brown authored Aug 02, 2009

95452a6c

eeepc-laptop: fix hot-unplug on resume · 7334546a

Alan Jenkins authored Jun 29, 2009

OOPS on resume when the wireless adaptor is disabled during suspend was
introduced by "eeepc-laptop: read rfkill soft-blocked state on resume".

Unable to handle kernel NULL pointer dereference

Process s2disk
Tainted: G W
IP: klist_put

Call trace:
? klist_del
? device_del
? device_unregister
? pci_stop_dev
? pci_stop_bus
? pci_remove_device
? eeepc_rfkill_hotplug [eeepc_laptop]
? eeepc_hotk_resume [eeepc_laptop]
? acpi_device_resume
? device_resume
? hibernation_snapshot

It appears the PCI device is removed twice.  The eeepc_rfkill_hotplug()
call from the resume handler is racing against the call from the ACPI
notifier callback.  The ACPI notification is triggered by the resume
handler when it refreshes the value of CM_ASL_WLAN.

The fix is to serialize hotplug calls using a workqueue.

http://bugzilla.kernel.org/show_bug.cgi?id=13825
Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Acked-by: Corentin Chary <corentin.chary@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>

7334546a

Merge branch 'memhotplug-crash' into release · a571a79a

Len Brown authored Aug 02, 2009

a571a79a

ACPI: Ingore the memory block with zero block size in course of memory hotplug · 5d2619fc

Zhao Yakui authored Jul 07, 2009

If the memory block size is zero, ignore it and skip the memory hotplug
flow. Otherwise the kernel emits the following warning message:
  >System RAM resource 0 - ffffffffffffffff cannot be added
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

5d2619fc

ACPI: Don't treat generic error as ACPI error code in acpi memory hotplug driver · aa7b2b2e

Zhao Yakui authored Jul 03, 2009

Don't treat the generic error as an ACPI error code. Otherwise, when the
generic code is returned, the kernel emits the following warning message:
   >ACPI Exception (acpi_memhotplug-0171): UNKNOWN_STATUS_CODE,
		Cannot get acpi bus device [20080609]
   >ACPI: Cannot find driver data
   > ACPI Error (utglobal-0127): Unknown exception code: 0xFFFFFFED [20080609]
   > Pid: 85, comm: kacpi_notify Not tainted 2.6.27.19-5-default #1
     Call Trace:
     [<ffffffff8020da29>] show_trace_log_lvl+0x41/0x58
     [<ffffffff8049a3da>] dump_stack+0x69/0x6f
    .....

Also, when the generic error code is returned, ACPI_EXCEPTION is
replaced by printk.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

aa7b2b2e

Merge branch 'bugzilla-13751' into release · 6a614877

Len Brown authored Aug 02, 2009

6a614877

ACPI: bind workqueues to CPU 0 to avoid SMI corruption · 74b58208

Bjorn Helgaas authored Jul 29, 2009

On some machines, a software-initiated SMI causes corruption unless the
SMI runs on CPU 0.  An SMI can be initiated by any AML, but typically it's
done in GPE-related methods that are run via workqueues, so we can avoid
the known corruption cases by binding the workqueues to CPU 0.

References:
    http://bugzilla.kernel.org/show_bug.cgi?id=13751
    https://bugs.launchpad.net/bugs/157171
    https://bugs.launchpad.net/bugs/157691
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>

74b58208

Merge branch 'thinkpad' into release · f63440ef

Len Brown authored Aug 02, 2009

f63440ef

Merge branch 'bugzilla-13865' into release · 437f8c8a

Len Brown authored Aug 02, 2009

437f8c8a

Merge branch 'bugzilla-13620-revert' into release · b8a848ed

Len Brown authored Aug 02, 2009

b8a848ed

ACPI: root-only read protection on /sys/firmware/acpi/tables/* · d0006f32

Len Brown authored Jul 30, 2009

they were world readable.
Signed-off-by: Len Brown <len.brown@intel.com>

d0006f32

thinkpad-acpi: fix incorrect use of TPACPI_BRGHT_MODE_ECNVRAM · 59fe4fe3

Henrique de Moraes Holschuh authored Aug 01, 2009

HBRV-based default selection of backlight control strategy didn't work
well, at least the X41 defines it but doesn't use it and I don't think
it will stop there.

Switch to a white/blacklist.  All models that have HBRV defined have
been included in the list, and initially all ATI GPUs will get
ECNVRAM, and the Intel GPUs will get UCMS_STEP.

Symptoms of incorrect backlight mode selection are:

1. Non-working backlight control through sysfs;

2. Backlight gets reset to the lowest level at every shutdown, reboot
   and when thinkpad-acpi gets unloaded;

This fixes a regression in 2.6.30, bugzilla #13826
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Reported-by: Tobias Diedrich <ranma+kernel@tdiedrich.de>
Cc: stable@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>

59fe4fe3

thinkpad-acpi: restrict procfs count value to sane upper limit · 5b05d469

Michael Buesch authored Aug 01, 2009

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: Len Brown <len.brown@intel.com>

5b05d469

thinkpad-acpi: remove dock and bay subdrivers · 1f6fc2de

Henrique de Moraes Holschuh authored Aug 01, 2009

The standard ACPI dock driver can handle the hotplug bays and docks of
the ThinkPads just fine (including batteries) as of 2.6.27, and the
code in thinkpad-acpi for the dock and bay subdrivers is currently
broken anyway...

Userspace needs some love to support the two-stage ejection nicely,
but it is simple enough to do through udev rules (you don't even need
HAL) so this wouldn't justify fixing the dock and bay subdrivers,
either.

That leaves warm-swap bays (_EJ3) support for thinkpad-acpi, as well
as support for the weird dock of the model 570, but since such support
has never left the "experimental" stage, it is also not a strong
enough reason to find a way to fix this code.

Users of ThinkPads with warm-swap bays are urged to request that _EJ3
support be added to the regular ACPI dock driver, if such feature is
indeed useful for them.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: Len Brown <len.brown@intel.com>

1f6fc2de

thinkpad-acpi: disable broken bay and dock subdrivers · 550e7fd8

Henrique de Moraes Holschuh authored Aug 01, 2009

Currently, the ThinkPad-ACPI bay and dock drivers are completely
broken, and cause a NULL pointer dereference in kernel mode (and,
therefore, an OOPS) when they try to issue events (i.e. on dock,
undock, bay ejection, etc).

OTOH, the standard ACPI dock driver can handle the hotplug bays and
docks of the ThinkPads just fine (including batteries) as of 2.6.27.
In fact, it does a much better job of it than thinkpad-acpi ever did.

It is just not worth the hassle to find a way to fix this crap without
breaking the (deprecated) thinkpad-acpi dock/bay ABI.  This is old,
deprecated code that sees little testing or use.

As a quick fix suitable for -stable backports, mark the thinkpad-acpi
bay and dock subdrivers as BROKEN in Kconfig.  The dead code will be
removed by a later patch.

This fixes bugzilla #13669, and should be applied to 2.6.27 and later.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Reported-by: Joerg Platte <jplatte@naasa.net>
Cc: stable@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>

550e7fd8

01 Aug, 2009 3 commits

do_sigaltstack: small cleanups · 0dd8486b

Linus Torvalds authored Aug 01, 2009

The previous commit ("do_sigaltstack: avoid copying 'stack_t' as a
structure to user space") fixed a real bug. This one just cleans up the
copy from user space so that gcc can generate better code for it (and so
that it looks the same as the later copy back to user space).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0dd8486b

do_sigaltstack: avoid copying 'stack_t' as a structure to user space · 0083fc2c

Linus Torvalds authored Aug 01, 2009

Ulrich Drepper correctly points out that there is generally padding in
the structure on 64-bit hosts, and that copying the structure from
kernel to user space can leak information from the kernel stack in those
padding bytes.

Avoid the whole issue by just copying the three members one by one
instead, which means that the function can also avoid the need for
a stack frame.  This also happens to match how we copy the new structure
from user space, so it all even makes sense.

[ The obvious solution of adding a memset() generates horrid code, gcc
  does really stupid things. ]
Reported-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0083fc2c
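
The padding issue described above can be illustrated in plain userspace C. This is a sketch, not the kernel change: the struct is a stand-in for stack_t, a byte buffer plays the role of user memory, and memcpy() per member stands in for the kernel's per-member user-space stores. All names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for stack_t; on typical 64-bit ABIs the int is followed
 * by padding so that ss_size is suitably aligned. */
struct toy_stack {
	void  *ss_sp;
	int    ss_flags;
	size_t ss_size;
};

/*
 * Copy member by member into a raw destination buffer. Only bytes
 * that belong to a member are written, so whatever sits in the
 * source's padding bytes never reaches the destination.
 */
static void toy_copy_fields(unsigned char *dst, const struct toy_stack *src)
{
	memcpy(dst + offsetof(struct toy_stack, ss_sp),
	       &src->ss_sp, sizeof(src->ss_sp));
	memcpy(dst + offsetof(struct toy_stack, ss_flags),
	       &src->ss_flags, sizeof(src->ss_flags));
	memcpy(dst + offsetof(struct toy_stack, ss_size),
	       &src->ss_size, sizeof(src->ss_size));
}

/* Number of bytes in the struct that belong to no member. */
static size_t toy_padding_bytes(void)
{
	return sizeof(struct toy_stack) -
	       sizeof(void *) - sizeof(int) - sizeof(size_t);
}
```

A whole-structure copy of sizeof(struct toy_stack) bytes would also transfer the toy_padding_bytes() bytes that no member owns, which is the leak the commit message describes.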

Linux 2.6.31-rc5 · ed680c4a

Linus Torvalds authored Jul 31, 2009

ed680c4a

31 Jul, 2009 2 commits

Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs · f5266cbd

Linus Torvalds authored Jul 31, 2009

* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: bump up nr_to_write in xfs_vm_writepage
  xfs: reduce bmv_count in xfs_vn_fiemap

f5266cbd

Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block · a5bc92cd

Linus Torvalds authored Jul 31, 2009

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  io context: fix ref counting
  block: make the end_io functions be non-GPL exports
  block: fix improper kobject release in blk_integrity_unregister
  block: always assign default lock to queues
  mg_disk: Add missing ready status check on mg_write()
  mg_disk: fix issue with data integrity on error in mg_write()
  mg_disk: fix reading invalid status when use polling driver
  mg_disk: remove prohibited sleep operation

a5bc92cd