Commit 3b98b087 authored by Nishanth Aravamudan's avatar Nishanth Aravamudan Committed by Linus Torvalds

[PATCH] fix NUMA interleaving for huge pages

Since vma->vm_pgoff is in units of smallpages, VMAs for huge pages have the
lower HPAGE_SHIFT - PAGE_SHIFT bits always cleared, which results in badd
offsets to the interleave functions.  Take this difference from small pages
into account when calculating the offset.  This does add a 0-bit shift into
the small-page path (via alloc_page_vma()), but I think that is negligible.
 Also add a BUG_ON to prevent the offset from growing due to a negative
right-shift, which probably shouldn't be allowed anyways.

Tested on an 8-memory node ppc64 NUMA box and got the interleaving I
expected.
Signed-off-by: default avatarNishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: default avatarAdam Litke <agl@us.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: default avatarChristoph Lameter <clameter@engr.sgi.com>
Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
parent 1678df37
...@@ -1176,7 +1176,15 @@ static inline unsigned interleave_nid(struct mempolicy *pol, ...@@ -1176,7 +1176,15 @@ static inline unsigned interleave_nid(struct mempolicy *pol,
if (vma) { if (vma) {
unsigned long off; unsigned long off;
off = vma->vm_pgoff; /*
* for small pages, there is no difference between
* shift and PAGE_SHIFT, so the bit-shift is safe.
* for huge pages, since vm_pgoff is in units of small
* pages, we need to shift off the always 0 bits to get
* a useful offset.
*/
BUG_ON(shift < PAGE_SHIFT);
off = vma->vm_pgoff >> (shift - PAGE_SHIFT);
off += (addr - vma->vm_start) >> shift; off += (addr - vma->vm_start) >> shift;
return offset_il_node(pol, vma, off); return offset_il_node(pol, vma, off);
} else } else
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment