Commits · 73ec5177a1bb2f71a61b19623d69f8857c8450fe · linux / linux-davinci

12 Oct, 2009 2 commits

The kernel offers with TIOCL_GETKMSGREDIRECT ioctl() the possibility to · 73ec5177

Bernhard Walle authored Oct 13, 2009

redirect the kernel messages to a specific console.

However, since it's not possible to switch to the kernel message console
after a panic(), it would be nice if the kernel would print the panic
message on the current console.
Signed-off-by: Bernhard Walle <bernhard@bwalle.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

73ec5177

The kernel offers with TIOCL_GETKMSGREDIRECT ioctl() the possibility to · 08d86a94

Bernhard Walle authored Oct 13, 2009

redirect the kernel messages to a specific console.

However, since it's not possible to switch to the kernel message console
after a panic(), it would be nice if the kernel would print the panic
message on the current console.

This patch series adds a new interface to access the global kmsg_redirect
variable by a function to be able to use it in code where
CONFIG_VT_CONSOLE is not set (kernel/panic.c).



This patch:

Instead of using and exporting a global value kmsg_redirect, introduce a
function vt_kmsg_redirect() that both can set and return the console where
messages are printed.

Change all users of kmsg_redirect (the VT code itself and kernel/power.c)
to the new interface.

The main advantage is that vt_kmsg_redirect() can also be used when
CONFIG_VT_CONSOLE is not set.
Signed-off-by: Bernhard Walle <bernhard@bwalle.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

08d86a94
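
The combined set-and-return interface described in 08d86a94 can be sketched in userspace roughly as follows. This is only an illustration: the -1 "query only" sentinel, the parameter name and the default of 0 are assumptions, since the message above does not spell them out.

```c
#include <assert.h>	/* used only by the usage example below */

/*
 * Userspace sketch, not the kernel code. Assumption: -1 means
 * "query only"; any other value becomes the new redirect target.
 * Not thread-safe as written.
 */
int vt_kmsg_redirect(int new_console)
{
	static int kmsg_con;	/* assumed default 0: no redirection */
	int old = kmsg_con;

	if (new_console != -1)
		kmsg_con = new_console;

	return old;	/* previous value on set, current value on query */
}
```

With this shape a caller never touches the variable itself: vt_kmsg_redirect(5) selects console 5, and a later vt_kmsg_redirect(-1) reads the setting back without changing it.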

09 Oct, 2009 1 commit

/usr/src/devel/arch/x86/Makefile:82: stack protector enabled but no compiler support · d64cabd2

Andrew Morton authored Oct 09, 2009

/usr/src/devel/arch/x86/Makefile:82: stack protector enabled but no compiler support
In file included from crypto/api.c:18:
include/linux/err.h: In function 'IS_ERR_OR_NULL':
include/linux/err.h:39: error: 'NULL' undeclared (first use in this function)
include/linux/err.h:39: error: (Each undeclared identifier is reported only once
include/linux/err.h:39: error: for each function it appears in.)

Cc: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

d64cabd2

29 Sep, 2009 1 commit

There are quite a few instances in the kernel of checks of pointers both · 1fdba762

Phil Carmody authored Sep 30, 2009

against NULL and against the errno range, handling both cases identically.
This additional helper function would simplify such code.
Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

1fdba762
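
A userspace sketch of the helper described in 1fdba762, for illustration only. MAX_ERRNO and the pointer-encoding helpers are re-created here under the assumption that error codes occupy the top 4095 values of the address space; the kernel's own definitions live in include/linux/err.h.

```c
#include <assert.h>
#include <stddef.h>	/* NULL */
#include <stdint.h>	/* intptr_t, uintptr_t */

#define MAX_ERRNO 4095	/* assumption: top 4095 addresses encode -errno */

/* Encode a negative errno value as a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer falls in the errno range. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* True for both failure conventions: NULL and encoded errno. */
static inline int IS_ERR_OR_NULL(const void *ptr)
{
	return ptr == NULL || IS_ERR(ptr);
}
```

Code that previously wrote "if (!p || IS_ERR(p))" can then use a single IS_ERR_OR_NULL(p) test.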

15 Oct, 2009 1 commit

According to feature-removal-schedule.txt, it is the time to remove · dd7c81b6

Amerigo Wang authored Oct 16, 2009

print_fn_descriptor_symbol().

And a quick grep shows that it no longer has any callers.
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

dd7c81b6

16 Oct, 2009 2 commits

simplify · 5ed7c366

Andrew Morton authored Oct 16, 2009

Cc: Amerigo Wang <amwang@redhat.com>
Cc: Ben Woodard <bwoodard@llnl.gov>
Cc: Brian Behlendorf <behlendorf1@llnl.gov>
Cc: David Howells <dhowells@redhat.com>
Cc: WANG Cong <amwang@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

5ed7c366

rwsem_is_locked() tests ->activity without locks, so we should always keep · 75634cc8

Amerigo Wang authored Oct 16, 2009

->activity consistent.  However, the code in __rwsem_do_wake() breaks this
rule: it updates ->activity only after _all_ readers have been woken up.
This may give some reader a wrong ->activity value and thus cause
rwsem_is_locked() to behave wrongly.

Quote from Andrew:

"
- we have one or more processes sleeping in down_read(), waiting for access.

- we wake one or more processes up without altering ->activity

- they start to run and they do rwsem_is_locked().  This incorrectly
  returns "false", because the waker process is still crunching away in
  __rwsem_do_wake().

- the waker now alters ->activity, but it was too late.
"

So we need to take a spinlock to protect this.  And since rwsem_is_locked()
should not block, we use spin_trylock_irqsave().
Reported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Cc: Ben Woodard <bwoodard@llnl.gov>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

75634cc8
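
The non-blocking check described in 75634cc8 can be approximated in userspace as below. This is a sketch under assumptions: a C11 atomic_flag stands in for the kernel spinlock, the struct and function names are hypothetical, and reporting "locked" when the lock cannot be taken is a choice made for this sketch.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the semaphore fields mentioned above. */
struct sketch_rwsem {
	int activity;		/* >0: readers, <0: writer, 0: free */
	atomic_flag wait_lock;	/* stands in for the kernel spinlock */
};

static int sketch_rwsem_is_locked(struct sketch_rwsem *sem)
{
	int ret = 1;	/* lock busy: a waker is mid-update, say "locked" */

	/* Try-lock: test_and_set returns false if we acquired the flag. */
	if (!atomic_flag_test_and_set_explicit(&sem->wait_lock,
					       memory_order_acquire)) {
		ret = (sem->activity != 0);
		atomic_flag_clear_explicit(&sem->wait_lock,
					   memory_order_release);
	}
	return ret;
}
```

Because the function never spins, it cannot block; the cost is that it may answer "locked" conservatively while another thread holds the lock.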

14 Oct, 2009 2 commits

These functions need not be exported, since no drivers should use them. · 3719a33e

Amerigo Wang authored Oct 14, 2009

__init_rwsem() is an exception, because init_rwsem(), which is a macro,
is used.
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

3719a33e

Don't initialize __print_once. Invert the test to reduce initialized · 52645121

Joe Perches authored Oct 14, 2009

data.

defconfig before:	$size vmlinux
   text	   data	    bss	    dec	    hex	filename
6976022	 679572	1359668	9015262	 898fde	vmlinux

defconfig after:	$size vmlinux
   text	   data	    bss	    dec	    hex	filename
6976006	 679508	1359700	9015214	 898fae	vmlinux
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

52645121
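
The inverted test from 52645121 can be illustrated with a userspace stand-in. The names emit(), emit_once() and poll_device() are hypothetical, and the flag is called "done" here because identifiers with a leading double underscore are reserved outside the kernel. A zero-initialized static can be placed in .bss, which is consistent with the data/bss shift in the size output above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int emitted;	/* demo-only counter of messages actually printed */

static void emit(const char *msg)
{
	emitted++;
	puts(msg);
}

/*
 * Hypothetical userspace stand-in for printk_once(). The flag has no
 * initializer, so it starts out false and needs no space in .data.
 */
#define emit_once(msg)				\
	do {					\
		static bool done;		\
						\
		if (!done) {			\
			done = true;		\
			emit(msg);		\
		}				\
	} while (0)

static void poll_device(void)
{
	emit_once("device not ready");
}
```

Calling poll_device() repeatedly prints the message only on the first call; each expansion of the macro gets its own flag.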

13 Oct, 2009 2 commits

The symbol 'call' is a static symbol used for initcall_debug. This same · 03244f00

H Hartley Sweeten authored Oct 14, 2009

symbol name is used locally by a couple of functions and produces the
following sparse warning:

	warning: symbol 'call' shadows an earlier one

Fix this noise by renaming the local symbols.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

03244f00

On Mon, Oct 12, 2009 at 12:31:46PM -0400, H Hartley Sweeten wrote: · 2498f9f5

Daniel Mack authored Oct 13, 2009

> On Wednesday, October 07, 2009 1:01 PM, Daniel Mack wrote:
> > This is actually too trivial to publish, but to export the function of
> > that chip to the userspace, a module like this is needed.
> >
> > Signed-off-by: Daniel Mack <daniel@caiaq.de>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: David Brownell <dbrownell@users.sourceforge.net>
> > Cc: spi-devel-general@lists.sourceforge.net
> > ---
>
> [snip]
>
> > +static ssize_t dac7512_store_val(struct device *dev,
> > +				 struct device_attribute *attr,
> > +				 const char *buf, size_t count)
> > +{
> > +	struct spi_device *spi = to_spi_device(dev);
> > +	unsigned char tmp[2];
> > +	unsigned long val;
> > +
> > +	if (strict_strtoul(buf, 10, &val) < 0)
> > +		return -EINVAL;
> > +
> > +	tmp[0] = val >> 8;
> > +	tmp[1] = val & 0xff;
> > +	spi_write(spi, tmp, sizeof(tmp));
> > +	return count;
> > +}
> > +
> > +static DEVICE_ATTR(value, S_IWUSR | S_IRUGO,
> > +		   NULL, dac7512_store_val);
>
> You have declared the "value" device attribute with mode S_IWUSR | S_IRUGO
> but have not provided a show callback.

Sorry, forget my last mail, I got you wrong. You're of course right,
S_IRUGO shouldn't be set for write-only attributes. Updated patch below.

Thanks,
Daniel

From ab18a967e55d2bb1d39559333bca81a01c2838f0 Mon Sep 17 00:00:00 2001
Date: Thu, 8 Oct 2009 03:55:46 +0800
Subject: [PATCH] drivers/misc: add driver for Texas Instruments DAC7512
This is actually too trivial to publish, but to export the function of
that chip to the userspace, a module like this is needed.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: "H Hartley Sweeten" <hartleys@visionengravers.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

2498f9f5

26 Oct, 2009 1 commit

Signed-off-by: Daniel Mack <daniel@caiaq.de> · 1b325e13

Daniel Mack authored Oct 26, 2009

Cc: "H Hartley Sweeten" <hartleys@visionengravers.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

1b325e13

24 Sep, 2009 1 commit

Use smp_processor_id() instead of get_cpu() and put_cpu() in · 1c16efdd

Xiao Guangrong authored Sep 24, 2009

generic_smp_call_function_interrupt().  There is no need to disable
preemption, because generic_smp_call_function_interrupt() must be called
with interrupts disabled.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

1c16efdd

10 Oct, 2009 1 commit

commit 945ffe54 ("qnx4: remove write support")... · 047ba4ce

Anders Larsen authored Oct 10, 2009

commit 945ffe54 ("qnx4: remove write support") removed the (defunct)
write support but missed a chunk of related, dead code.
Signed-off-by: Anders Larsen <al@alarsen.net>
Cc:  Jiri Kosina <jkosina@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

047ba4ce

09 Oct, 2009 2 commits

This driver supports the non-volatile digital potentiometers via I2C: · 5788030a

Michael Hennerich authored Oct 09, 2009

AD5258, AD5259, AD5251, AD5252, AD5253, AD5254, and AD5255

It provides a sysfs interface to each device for reading/writing which
is documented in Documentation/misc-devices/ad525x_dpot.txt.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Chris Verges <chrisv@cyberswitching.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

5788030a

If CONFIG_DYNAMIC_DEBUG is enabled and a source file has: · a0c3a9ce

Joe Perches authored Oct 09, 2009

#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/kernel.h>

dynamic_debug.h will duplicate KBUILD_MODNAME
in the output string.

Remove the use of KBUILD_MODNAME from the
output format string generated by dynamic_debug.h

If CONFIG_DYNAMIC_DEBUG is not enabled, no compile-time
check is done to printk/dev_printk arguments.

Add it.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Jason Baron <jbaron@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

a0c3a9ce
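
Both halves of a0c3a9ce can be reproduced in userspace. The sketch below is not the dynamic_debug.h code: debug_dup(), debug_fixed() and debug_off() are hypothetical names, snprintf() into a buffer stands in for printk(), and the "if (0)" idiom is shown as one common way to keep format checking for a compiled-out call. It relies on the GNU ##__VA_ARGS__ extension, which gcc and clang accept.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define KBUILD_MODNAME "demo"	/* assumption: kbuild normally defines this */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

static char out[64];	/* captures the formatted message for inspection */

/* Before: the debug macro adds the module name on top of pr_fmt(). */
#define debug_dup(fmt, ...) \
	snprintf(out, sizeof(out), KBUILD_MODNAME ": " pr_fmt(fmt), ##__VA_ARGS__)

/* After: pr_fmt() alone decides the prefix. */
#define debug_fixed(fmt, ...) \
	snprintf(out, sizeof(out), pr_fmt(fmt), ##__VA_ARGS__)

/*
 * Disabled variant: the call is dead code, yet the compiler still
 * checks the arguments against the format string.
 */
#define debug_off(fmt, ...) \
	do { if (0) printf(pr_fmt(fmt), ##__VA_ARGS__); } while (0)
```

With these definitions debug_dup("x=%d", 1) produces "demo: demo: x=1", while debug_fixed("x=%d", 1) produces "demo: x=1".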

30 Sep, 2009 2 commits

Commit ("printk_once(): use bool · a5b87b5b

Cesar Eduardo Barros authored Oct 01, 2009

for boolean flag") changed printk_once() to use bool instead of int for
its guard variable.  Make the same change to WARN_ONCE() and
WARN_ON_ONCE(), for the same reasons.

This resulted in a reduction of 1462 bytes on an x86-64 defconfig:

   text    data     bss      dec    hex filename
8101271 1207116  992764 10301151 9d2edf vmlinux.before
8100553 1207148  991988 10299689 9d2929 vmlinux.after
Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

a5b87b5b

ktime will overflow from 03:14:07 UTC on Tuesday, 19 January 2038, · 1fcb4a3d

Barry Song authored Sep 30, 2009

and ktime_add() in timecompare_update() will overflow far earlier, at half
that value.  As a result, a wrong offset is computed, which then causes
strange problems.
Signed-off-by: Barry Song <21cnbao@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Patrick Ohly <patrick.ohly@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

1fcb4a3d
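
The message for 1fcb4a3d does not show the fix itself. As a general illustration of the problem, adding two timestamps to average them overflows once their sum exceeds the type's range, even though each value still fits. One standard way around this, shown here with a hypothetical 32-bit seconds counter, is to add half the difference instead of halving the sum.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Midpoint of two timestamps without forming their sum.
 * Assumes 0 <= start <= end, so (end - start) cannot overflow.
 */
static int32_t midpoint_sec(int32_t start, int32_t end)
{
	return start + (end - start) / 2;
}
```

For two values just below INT32_MAX the naive (start + end) / 2 would overflow, while midpoint_sec() stays within range.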

12 Oct, 2009 1 commit

add WARN_ON() · 2bfbb11b

Andrew Morton authored Oct 12, 2009

Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

2bfbb11b

13 Oct, 2009 1 commit

gcc is not convinced that the floppy.c ioctl has sufficient bound checks: · 8c262146

Arjan van de Ven authored Oct 13, 2009

In function `copy_from_user',
    inlined from `fd_copyin' at drivers/block/floppy.c:3080,
    inlined from `fd_ioctl' at drivers/block/floppy.c:3503:
/home/arjan/linux/arch/x86/include/asm/uaccess_32.h:211:
warning: call to `copy_from_user_overflow' declared with attribute
warning: copy_from_user buffer size is not provably correct

And frankly, as a human I have a hard time proving the same more or less
(the size comes from the ioctl argument.  humpf.  maybe.  the code isn't
very nice)

This patch adds an explicit check to make 100% sure it's safe, better than
finding out later that there indeed was a gap.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

8c262146

12 Oct, 2009 1 commit

size_t desc_len cannot be less than 0, test before the subtraction. · 20011a18

Roel Kluin authored Oct 13, 2009

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

20011a18
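
The pattern behind 20011a18, sketched with a hypothetical helper: because size_t is unsigned, "len -= need; if (len < 0)" can never detect the error, since the subtraction wraps around to a huge value. The comparison has to come first.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: take `need` bytes out of a budget of `*len`.
 * Returns 0 on success, -1 if the budget is too small.
 */
static int consume(size_t *len, size_t need)
{
	if (*len < need)	/* test first: *len - need would wrap */
		return -1;

	*len -= need;
	return 0;
}
```

On failure the budget is left untouched, so the caller can report the error without having to undo anything.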

23 Jul, 2009 1 commit

Convert cris to use GENERIC_TIME via the arch_getoffset() infrastructure, · bbf1fad8

john stultz authored Jul 23, 2009

reducing the amount of arch specific code we need to maintain.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

bbf1fad8

14 Feb, 2009 2 commits

ERROR: space required after that ',' (ctx:VxV) · b26f3588

Andrew Morton authored Feb 14, 2009

#23: FILE: arch/frv/kernel/gdb-stub.c:1883:
+				gdbstub_strcpy(output_buffer,"E02");
 				                            ^

ERROR: space required after that ',' (ctx:VxV)
#32: FILE: arch/frv/kernel/gdb-stub.c:1911:
+				gdbstub_strcpy(output_buffer,"E02");
 				                            ^

total: 2 errors, 0 warnings, 16 lines checked

./patches/frv-duplicate-output_buffer-of-e03.patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

b26f3588

In the case of an error, output_buffer gets an E0X value. E03 was set in · 1d8a5956

Roel Kluin authored Feb 14, 2009

two places.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

1d8a5956

28 Oct, 2009 1 commit

Some newer Lenovo models are shipped with a TPM that doesn't seem to set · c6a75f83

Rajiv Andrade authored Oct 29, 2009

the TPM_STS_DATA_EXPECT status bit when sending it a burst of data, so the
code understands it as a failure and doesn't proceed sending the chip the
intended data.  In this patch we bypass this bit check if the itpm
module parameter is set.

This patch is based on Andy Isaacson's one:

http://marc.info/?l=linux-kernel&m=124650185023495&w=2

It was heavily discussed how we should deal with identifying the chip in
kernel space, but the required patch to do so was NACK'd:

http://marc.info/?l=linux-kernel&m=124650186423711&w=2

This way we let the user choose whether to use this workaround, based on
their observations of how this code behaves when trying to use the TPM.

Fixed a checkpatch issue present on the previous patch, thanks to Daniel Walker.
Signed-off-by: Rajiv Andrade <srajiv@linux.vnet.ibm.com>
Tested-by: Seiji Munetoh <seiji.munetoh@gmail.com>

David Smith <david.daniel.smith@gmail.com>
Cc: Seiji Munetoh <seiji.munetoh@gmail.com>
Cc: Andy Isaacson <adi@vmware.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Shahbaz Khan <shaz.linux@gmail.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

c6a75f83

31 Oct, 2009 1 commit

It's reported that OOM-Killer kills Gnome/KDE first. And yes, we can · cdd5ac71

KAMEZAWA Hiroyuki authored Oct 31, 2009

reproduce it easily.

Now, oom-killer uses mm->total_vm as its base value.  But in recent
applications, there is a big gap between VM size and RSS size, because

  - Applications attach many dynamic libraries. (Gnome, KDE, etc...)
  - Applications may allocate a big VM area but use only a small part of it.
    (Java and multi-threaded applications have this tendency because
     of the default stack size.)

I think using mm->total_vm as the score for oom-kill is not good.  For the
same reason, overcommit memory can't work as expected.  (In other words, if
we depend on total_vm, using overcommit more aggressively is a good choice.)

This patch uses mm->anon_rss/file_rss as base value for calculating badness.

The following shows the changes to the OOM score (badness) on a system with
1.6G of memory, running two memory eaters (500M and 1G).

Top 10 badness scores. (The highest one is the first candidate to be killed.)
Before
badness program
91228	gnome-settings-
94210	clock-applet
103202	mixer_applet2
106563	tomboy
112947	gnome-terminal
128944	mmap              <----------- 500M malloc
129332	nautilus
215476	bash              <----------- parent of 2 mallocs.
256944	mmap              <----------- 1G malloc
423586	gnome-session

After
badness	program
1911	mixer_applet2
1955	clock-applet
1986	xinit
1989	gnome-session
2293	nautilus
2955	gnome-terminal
4113	tomboy
104163	mmap             <----------- 500M malloc.
168577	bash             <----------- parent of 2 mallocs
232375	mmap             <----------- 1G malloc

This seems good to me.  Maybe we can tweak this patch more, but this one
will be a good starting point.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

cdd5ac71
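
The change of base value in cdd5ac71 can be pictured with a simplified sketch. The struct below is hypothetical and only mirrors the counter names from the message; that the two RSS counters are summed is an assumption, and the later adjustments that produce the final badness score are left out.

```c
#include <assert.h>

/* Hypothetical, simplified view of the counters named in the message. */
struct mm_sketch {
	unsigned long total_vm;	/* pages of address space mapped */
	unsigned long anon_rss;	/* resident anonymous pages */
	unsigned long file_rss;	/* resident file-backed pages */
};

/* Old base value: everything mapped, whether resident or not. */
static unsigned long badness_base_old(const struct mm_sketch *mm)
{
	return mm->total_vm;
}

/* New base value; summing the two counters is an assumption. */
static unsigned long badness_base_new(const struct mm_sketch *mm)
{
	return mm->anon_rss + mm->file_rss;
}
```

A desktop process that maps many libraries but touches few pages scores high under the old base and low under the new one, while a process that actually fills its memory scores high under both.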

16 Oct, 2009 2 commits

Fix the comment for try_to_unmap_anon() with the new arguments. · 486df07b

Huang Shijie authored Oct 17, 2009

Signed-off-by: Huang Shijie <shijie8@gmail.com>
Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

486df07b

Commit ("Streamline generic_file_* interfaces and filemap · 3049dbbe

Vincent Li authored Oct 17, 2009

cleanups") removed generic_file_write() in filemap.  Change the comment in
vmscan pageout() to __generic_file_aio_write().
Signed-off-by: Vincent Li <macli@brc.ubc.ca>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

3049dbbe

15 Oct, 2009 2 commits

Reorder (and comment) the fields of swap_info_struct, to make better · f1a03c27

Hugh Dickins authored Oct 16, 2009

use of its cachelines: it's good for swap_duplicate() in particular
if unsigned int max and swap_map are near the start.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

f1a03c27

While we're fiddling with the swap_map values, let's assign a particular · 67951059

Hugh Dickins authored Oct 16, 2009

value to shmem/tmpfs swap pages: their swap counts are never incremented,
and it helps swapoff's try_to_unuse() a little if it can immediately
distinguish those pages from process pages.

Since we've no use for SWAP_MAP_BAD | COUNT_CONTINUED,
we might as well use that 0xbf value for SWAP_MAP_SHMEM.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

67951059
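
The bit arithmetic behind the 0xbf value in 67951059 can be checked with a few constants. Only SWAP_MAP_SHMEM = 0xbf and its composition are stated above; the other values are assumptions chosen to be consistent with it, and swap_entry_is_shmem() is a hypothetical helper.

```c
#include <assert.h>

/* Assumed layout of one swap_map byte; only 0xbf is given in the text. */
#define SWAP_MAP_MAX	0x3e	/* largest count held in the map itself */
#define SWAP_MAP_BAD	0x3f	/* marks an unusable slot */
#define SWAP_HAS_CACHE	0x40	/* swap cache flag */
#define COUNT_CONTINUED	0x80	/* count carries on elsewhere */
#define SWAP_MAP_SHMEM	0xbf	/* SWAP_MAP_BAD | COUNT_CONTINUED */

/* Hypothetical helper: recognise a shmem/tmpfs entry, cache flag aside. */
static int swap_entry_is_shmem(unsigned char ent)
{
	return (ent & ~SWAP_HAS_CACHE) == SWAP_MAP_SHMEM;
}
```

Since a bad slot never needs a continued count, that combination is free to be reused as a marker that can be tested with one comparison.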

16 Oct, 2009 1 commit

Swap is duplicated (reference count incremented by one) whenever the same · 1f975884

Hugh Dickins authored Oct 17, 2009

swap page is inserted into another mm (when forking finds a swap entry in
place of a pte, or when reclaim unmaps a pte to insert the swap entry).

swap_info_struct's vmalloc'ed swap_map is the array of these reference
counts: but what happens when the unsigned short (or unsigned char since
the preceding patch) is full? (and its high bit is kept for a cache flag)

We then lose track of it, never freeing, leaving it in use until swapoff:
at which point we _hope_ that a single pass will have found all instances,
assume there are no more, and will lose user data if we're wrong.

Swapping of KSM pages has not yet been enabled; but it is implemented,
and makes it very easy for a user to overflow the maximum swap count:
possible with ordinary process pages, but unlikely, even when pid_max
has been raised from PID_MAX_DEFAULT.

This patch implements swap count continuations: when the count overflows,
a continuation page is allocated and linked to the original vmalloc'ed
map page, and this is used to hold the continuation counts for that entry
and its neighbours. These continuation pages are seldom referenced:
the common paths all work on the original swap_map, only referring to
a continuation page when the low "digit" of a count is incremented or
decremented through SWAP_MAP_MAX.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

1f975884
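
The "low digit plus continuation" idea in 1f975884 resembles a multi-digit counter. The sketch below is a deliberately simplified model, not the kernel's data layout: the struct, the function names and the LOW_MAX value are assumptions, and a plain integer stands in for the continuation page.

```c
#include <assert.h>

#define LOW_MAX 0x3e	/* assumed ceiling of the in-map count (SWAP_MAP_MAX) */

/* Hypothetical two-digit counter: low digit in the map, rest elsewhere. */
struct swap_count_sketch {
	unsigned char low;	/* what the common paths read and write */
	unsigned long cont;	/* stands in for the continuation page */
};

static void count_inc(struct swap_count_sketch *c)
{
	if (c->low < LOW_MAX) {
		c->low++;		/* fast path: continuation untouched */
	} else {
		c->low = 0;		/* carry into the continuation */
		c->cont++;
	}
}

/* Returns 0 on success, -1 if the count is already zero. */
static int count_dec(struct swap_count_sketch *c)
{
	if (c->low > 0) {
		c->low--;
	} else if (c->cont > 0) {
		c->cont--;		/* borrow from the continuation */
		c->low = LOW_MAX;
	} else {
		return -1;
	}
	return 0;
}

static unsigned long count_total(const struct swap_count_sketch *c)
{
	return c->cont * (LOW_MAX + 1UL) + c->low;
}
```

As in the description above, the continuation is only touched when the low digit passes through its maximum; every other increment or decrement stays in the one-byte field.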

15 Oct, 2009 2 commits

Halve the vmalloc'ed swap_map array from unsigned shorts to unsigned · 48e7dbf5

Hugh Dickins authored Oct 16, 2009

chars: it's still very unusual to reach a swap count of 126, and the
next patch allows it to be extended indefinitely.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

48e7dbf5

Though swap_count() is useful, I'm finding that swap_has_cache() and · 024153d4

Hugh Dickins authored Oct 16, 2009

encode_swapmap() obscure what happens in the swap_map entry, just at
those points where I need to understand it.  Remove them, and pass
more usable "usage" values to scan_swap_map(), swap_entry_free() and
__swap_duplicate(), instead of the SWAP_MAP and SWAP_CACHE enum.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

024153d4

16 Oct, 2009 1 commit

Move CONFIG_HIBERNATION's swapdev_block() into the main CONFIG_HIBERNATION · afb506e5

Hugh Dickins authored Oct 16, 2009

block, remove extraneous whitespace and return, fix typo in a comment.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

afb506e5

15 Oct, 2009 3 commits

Make better use of the space by folding first swap_extent into its · d22a2661

Hugh Dickins authored Oct 16, 2009

swap_info_struct, instead of just the list_head: swap partitions need
only that one, and for others it's used as a circular list anyway.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

d22a2661

The swap_info_struct is only 76 or 104 bytes, but it does seem wrong · 86a8ffe2

Hugh Dickins authored Oct 16, 2009

to reserve an array of about 30 of them in bss, when most people will
want only one.  Change swap_info[] to an array of pointers.

That does need a "type" field in the structure: pack it as a char with
next type and short prio (aha, char is unsigned by default on PowerPC).
Use the (admittedly peculiar) name "type" throughout for this index.

/proc/swaps does not take swap_lock: I wouldn't want it to, but do take
care with barriers when adding a new item to the array (never removed).
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

86a8ffe2

The swap_info_struct is mostly private to mm/swapfile.c, with only · 83eac6f8

Hugh Dickins authored Oct 16, 2009

one other in-tree user: get_swap_bio().  Adjust its interface to
map_swap_page(), so that we can then remove get_swap_info_struct().

But there is a popular user out-of-tree, TuxOnIce: so leave the
declaration of swap_info_struct in linux/swap.h.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Nigel Cunningham <ncunningham@crca.org.au>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

83eac6f8

13 Oct, 2009 3 commits

- avoid wasting more precious resources (DMA or DMA32 pools), when · cde4016d

Jan Beulich authored Oct 13, 2009

  being called through vmalloc_32{,_user}()
- explicitly allow using high memory here even if the outer allocation
  request doesn't allow it
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

cde4016d

Objects passed to NODEMASK_ALLOC() are relatively small in size and are · b763f773

David Rientjes authored Oct 13, 2009

backed by slab caches that are not of large order, traditionally never
greater than PAGE_ALLOC_COSTLY_ORDER.

Thus, using GFP_KERNEL for these allocations on large machines when
CONFIG_NODES_SHIFT > 8 will cause the page allocator to loop endlessly in
the allocation attempt, each time invoking direct reclaim or the oom
killer.

This is of particular interest when using NODEMASK_ALLOC() from a
mempolicy context (either directly in mm/mempolicy.c or the mempolicy
constrained hugetlb allocations) since the oom killer always kills current
when allocations are constrained by mempolicies.  So for all present use
cases in the kernel, current would end up being oom killed when direct
reclaim fails.  That would allow the NODEMASK_ALLOC() to succeed but
current would have sacrificed itself upon returning.

This patch adds gfp flags to NODEMASK_ALLOC() to pass to kmalloc() on
CONFIG_NODES_SHIFT > 8; this parameter is a nop on other configurations. 
All current use cases either directly from hugetlb code or indirectly via
NODEMASK_SCRATCH() union __GFP_NORETRY to avoid direct reclaim and the oom
killer when the slab allocator needs to allocate additional pages.

The side-effect of this change is that all current use cases of either
NODEMASK_ALLOC() or NODEMASK_SCRATCH() need appropriate -ENOMEM handling
when the allocation fails (never for CONFIG_NODES_SHIFT <= 8).  All
current use cases were audited and do have appropriate error handling at
this time.
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

b763f773

Offload the registration and unregistration of per node hstate sysfs · 3f5cc391

Lee Schermerhorn authored Oct 13, 2009

attributes to a worker thread rather than attempt the
allocation/attachment or detachment/freeing of the attributes in the
context of the memory hotplug handler.

I don't know that this is absolutely required, but the registration can
sleep in allocations and other mem hot plug handlers do it this way.  If
it turns out this is NOT required, we can drop this patch.

N.B.,  Only tested build, boot, libhugetlbfs regression.
       i.e., no memory hotplug testing.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

3f5cc391