- 22 Mar, 2006 40 commits
-
-
Oleg Nesterov authored
If get_next_ra_size() does not grow fast enough, ->prev_page can overrun the ahead window. This means the caller will read the pages from ->ahead_start + ->ahead_size to ->prev_page synchronously. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Steven Pratt <slpratt@austin.ibm.com> Cc: Ram Pai <linuxram@us.ibm.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Hugh Dickins authored
shmem.c was named and shamed in Jesper's "Building 100 kernels" warnings: shmem_parse_mpol is only used when CONFIG_TMPFS parses mount options; and only called from that one site, so mark it inline like its non-NUMA stub. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
As suggested by Marcelo: 1. The optimization introduced recently for not calling page_referenced() during zone reclaim makes two additional checks in shrink_list unnecessary. 2. The if (unlikely(sc->may_swap)) in refill_inactive_zone is optimized for the zone_reclaim case. However, most peoples system only does swap. Undo that. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Marcelo Tosatti <marcelo.tosatti@cyclades.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Michael Buesch authored
Remove the inlining of the new vs old mmap system call common code. This reduces the size of the resulting vmlinux for defconfig as follows: mb@pc1:~/develop/git/linux-2.6$ size vmlinux.mmap* text data bss dec hex filename 3303749 521524 186564 4011837 3d373d vmlinux.mmapinline 3303557 521524 186564 4011645 3d367d vmlinux.mmapnoinline The new sys_mmap2() has also one function call overhead removed, now. (probably it was already optimized to a jmp before, but anyway...) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Optimise page_count compound page test and make it consistent with similar functions. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Put a few more checks under CONFIG_DEBUG_VM Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andrew Morton authored
prep_zero_page() uses KM_USER0 and hence may not be used from IRQ context, at least for highmem pages. Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <christoph@lameter.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Move the prep_ stuff into prep_new_page. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
set_page_count usage outside mm/ is limited to setting the refcount to 1. Remove set_page_count from outside mm/, and replace those users with init_page_count() and set_page_refcounted(). This allows more debug checking, and tighter control on how code is allowed to play around with page->_count. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
A couple of places set_page_count(page, 1) that don't need to. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Now that compound page handling is properly fixed in the VM, move nommu over to using compound pages rather than rolling their own refcounting. nommu vm page refcounting is broken anyway, but there is no need to have divergent code in the core VM now, nor when it gets fixed. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: David Howells <dhowells@redhat.com> (Needs testing, please). Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Remove __put_page from outside the core mm/. It is dangerous because it does not handle compound pages nicely, and misses 1->0 transitions. If a user later appears that really needs the extra speed we can reevaluate. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Remove page_count and __put_page from x86-64 pageattr Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Use page->lru.next to implement the singly linked list of pages rather than the struct deferred_page which needs to be allocated and freed for each page. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Stop using __put_page and page_count in i386 pageattr.c Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
sg increments the refcount of constituent pages in its higher order memory allocations when they are about to be mapped by userspace. This is done so the subsequent get_page/put_page when doing the mapping and unmapping does not free the page. Move over to the preferred way, that is, using compound pages instead. This fixes a whole class of possible obscure bugs where a get_user_pages on a constituent page may outlast the user mappings or even the driver. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Cc: Douglas Gilbert <dougg@torque.net> Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Hugh Dickins authored
Now that it's madvisable, remove two pieces of VM_DONTCOPY bogosity: 1. There was and is no logical reason why VM_DONTCOPY should be in the list of flags which forbid vma merging (and those drivers which set it are also setting VM_IO, which itself forbids the merge). 2. It's hard to understand the purpose of the VM_HUGETLB, VM_DONTCOPY block in vm_stat_account: but never mind, it's under CONFIG_HUGETLB, which (unlike CONFIG_HUGETLB_PAGE or CONFIG_HUGETLBFS) has never been defined. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Wu Fengguang authored
In shrink_inactive_list(), nr_scan is not accounted when nr_taken is 0. But 0 pages taken does not mean 0 pages scanned. Move the goto statement below the accounting code to fix it. Signed-off-by: Wu Fengguang <wfg@mail.ustc.edu.cn> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Wu Fengguang authored
In isolate_lru_pages(), *scanned reports one more scan because the scan counter is increased one more time on exit of the while-loop. Change the while-loop to for-loop to fix it. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Wu Fengguang <wfg@mail.ustc.edu.cn> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
Add some comments to explain how zone reclaim works. And it fixes the following issues: - PF_SWAPWRITE needs to be set for RECLAIM_SWAP to be able to write out pages to swap. Currently RECLAIM_SWAP may not do that. - remove setting nr_reclaimed pages after slab reclaim since the slab shrinking code does not use that and the nr_reclaimed pages is just right for the intended follow up action. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andrew Morton authored
We have: try_to_free_pages ->shrink_caches(struct zone **zones, ..) ->shrink_zone(struct zone *, ...) ->shrink_cache(struct zone *, ...) ->shrink_list(struct list_head *, ...) ->refill_inactive_list((struct zone *, ...) which is fairly irrational. Rename things so that we have try_to_free_pages ->shrink_zones(struct zone **zones, ..) ->shrink_zone(struct zone *, ...) ->shrink_inactive_list(struct zone *, ...) ->shrink_page_list(struct list_head *, ...) ->shrink_active_list(struct zone *, ...) Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <christoph@lameter.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andrew Morton authored
Change all the vmscan functions to retunr the number-of-reclaimed pages and remove scan_conrtol.nr_reclaimed. Saves ten-odd bytes of text and makes things clearer and more consistent. The patch also changes the behaviour of zone_reclaim() when it falls back to slab shrinking. Christoph says "Setting this to one means that we will rescan and shrink the slab for each allocation if we are out of zone memory and RECLAIM_SLAB is set. Plus if we do an order 0 allocation we do not go off node as intended. "We better set this to zero. This means the allocation will go offnode despite us having potentially freed lots of memory on the zone. Future allocations can then again be done from this zone." Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <christoph@lameter.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andrew Morton authored
Turn basically everything in vmscan.c into `unsigned long'. This is to avoid the possibility that some piece of code in there might decide to operate upon more than 4G (or even 2G) of pages in one hit. This might be silly, but we'll need it one day. Cc: Christoph Lameter <clameter@sgi.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andrew Morton authored
Initialise as much of scan_control as possible at the declaration site. This tidies things up a bit and assures us that all unmentioned fields are zeroed out. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
Make nr_to_scan and priority a parameter instead of putting it into scan control. This allows various small optimizations and IMHO makes the code easier to read. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andrew Morton authored
Slab duplicates on_each_cpu(). Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andrew Morton authored
When on_each_cpu() runs the callback on other CPUs, it runs with local interrupts disabled. So we should run the function with local interrupts disabled on this CPU, too. And do the same for UP, so the callback is run in the same environment on both UP and SMP. (strictly it should do preempt_disable() too, but I think local_irq_disable is sufficiently equivalent). Also uninlines on_each_cpu(). softirq.c was the most appropriate file I could find, but it doesn't seem to justify creating a new file. Oh, and fix up that comment over (under?) x86's smp_call_function(). It drives me nuts. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
SLAB_NO_REAP is documented as an option that will cause this slab not to be reaped under memory pressure. However, that is not what happens. The only thing that SLAB_NO_REAP controls at the moment is the reclaim of the unused slab elements that were allocated in batch in cache_reap(). Cache_reap() is run every few seconds independently of memory pressure. Could we remove the whole thing? Its only used by three slabs anyways and I cannot find a reason for having this option. There is an additional problem with SLAB_NO_REAP. If set then the recovery of objects from alien caches is switched off. Objects not freed on the same node where they were initially allocated will only be reused if a certain amount of objects accumulates from one alien node (not very likely) or if the cache is explicitly shrunk. (Strangely __cache_shrink does not check for SLAB_NO_REAP) Getting rid of SLAB_NO_REAP fixes the problems with alien cache freeing. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Randy Dunlap authored
Fix kernel-doc warnings in mm/slab.c. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Pekka Enberg authored
We have struct kmem_cache now so use it instead of the old typedef. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Ravikiran G Thirumalai authored
Remove cachep->spinlock. Locking has moved to the kmem_list3 and most of the structures protected earlier by cachep->spinlock is now protected by the l3->list_lock. slab cache tunables like batchcount are accessed always with the cache_chain_mutex held. Patch tested on SMP and NUMA kernels with dbench processes running, constant onlining/offlining, and constant cache tuning, all at the same time. Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Cc: Christoph Lameter <christoph@lameter.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andrew Morton authored
slab.c has become a bit revolting again. Try to repair it. - Coding style fixes - Don't do assignments-in-if-statements. - Don't typecast assignments to/from void* Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Pekka Enberg authored
Extract setup_cpu_cache() function from kmem_cache_create() to make the latter a little less complex. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Pekka Enberg authored
Clean up the object to index mapping that has been spread around mm/slab.c. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Adrian Bunk authored
Since size_t has the same size as a long on all architectures, it's enough for overflow checks to check against ULONG_MAX. This change could allow a compiler better optimization (especially in the n=1 case). The practical effect seems to be positive, but quite small: text data bss dec hex filename 21762380 5859870 1848928 29471178 1c1b1ca vmlinux-old 21762211 5859870 1848928 29471009 1c1b121 vmlinux-patched Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Insert "fresh" huge pages into the hugepage allocator by the same means as they are freed back into it. This reduces code size and allows enqueue_huge_page to be inlined into the hugepage free fastpath. Eliminate occurances of hugepages on the free list with non-zero refcount. This can allow stricter refcount checks in future. Also required for lockless pagecache. Signed-off-by: Nick Piggin <npiggin@suse.de> "This patch also eliminates a leak "cleaned up" by re-clobbering the refcount on every allocation from the hugepage freelists. With respect to the lockless pagecache, the crucial aspect is to eliminate unconditional set_page_count() to 0 on pages with potentially nonzero refcounts, though closer inspection suggests the assignments removed are entirely spurious." Acked-by: William Irwin <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
The bootmem code added to page_alloc.c duplicated some page freeing code that it really doesn't need to because it is not so performance critical. While we're here, make prefetching work properly by actually prefetching the page we're about to use before prefetching ahead to the next one (ie. get the most important transaction started first). Also prefetch just a single page ahead rather than leaving a gap of 16. Jack Steiner reported no problems with SGI's ia64 simulator. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Clarify that preemption needs to be guarded against with the __xxx_page_state functions. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Have an explicit mm call to split higher order pages into individual pages. Should help to avoid bugs and be more explicit about the code's intention. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
- Don't return uninitialised stack values in case of allocation failure - Don't bother clearing PageCompound because __GFP_COMP wasn't specified Increment over the pte page rather than one pte entry in pte_alloc_one_kernel - Actually increment the page pointer in pte_alloc_one - Compile fixes, typos. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-