Commit 7d24f0b8 authored by David Gibson's avatar David Gibson Committed by Linus Torvalds

[PATCH] ppc64: Fix bug in SLB miss handler for hugepages

This patch, however, should be applied on top of the 64k-page-size patch to
fix some problems with hugepage (some pre-existing, another introduced by
this patch).

The patch fixes a bug in the SLB miss handler for hugepages on ppc64
introduced by the dynamic hugepage patch (commit id
c594adad) due to a misunderstanding of the
srd instruction's behaviour (mea culpa).  The problem arises when a 64-bit
process maps some hugepages in the low 4GB of the address space (unusual).
In this case, as well as the 256M segment in question being marked for
hugepages, other segments at 32G intervals will be incorrectly marked for
hugepages.

In the process, this patch tweaks the semantics of the hugepage bitmaps to
be more sensible.  Previously, an address below 4G was marked for hugepages
if the appropriate segment bit in the "low areas" bitmask was set *or* if
the low bit in the "high areas" bitmap was set (which would mark all
addresses below 1TB for hugepage).  With this patch, any given address is
governed by a single bitmap.  Addresses below 4GB are marked for hugepage
if and only if their bit is set in the "low areas" bitmap (256M
granularity).  Addresses between 4GB and 1TB are marked for hugepage iff
the low bit in the "high areas" bitmap is set.  Higher addresses are marked
for hugepage iff their bit in the "high areas" bitmap is set (1TB
granularity).

To avoid conflicts, this patch must be applied on top of BenH's pending
patch for 64k base page size [0].  As such, this patch also addresses a
hugepage problem introduced by that patch.  That patch allows hugepages of
1MB in size on hardware which supports it, however, that won't work when
using 4k pages (4 level pagetable), because in that case hugepage PTEs are
stored at the PMD level, and each PMD entry maps 2MB.  This patch simply
disallows hugepages in that case (we can do something cleverer to re-enable
them some other day).

Built, booted, and a handful of hugepage related tests passed on POWER5
LPAR (both ARCH=powerpc and ARCH=ppc64).

[0] http://gate.crashing.org/~benh/ppc64-64k-pages.diffSigned-off-by: default avatarDavid Gibson <david@gibson.dropbear.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
parent 0b154bb7
...@@ -329,12 +329,14 @@ static void __init htab_init_page_sizes(void) ...@@ -329,12 +329,14 @@ static void __init htab_init_page_sizes(void)
*/ */
if (mmu_psize_defs[MMU_PAGE_16M].shift) if (mmu_psize_defs[MMU_PAGE_16M].shift)
mmu_huge_psize = MMU_PAGE_16M; mmu_huge_psize = MMU_PAGE_16M;
/* With 4k/4level pagetables, we can't (for now) cope with a
* huge page size < PMD_SIZE */
else if (mmu_psize_defs[MMU_PAGE_1M].shift) else if (mmu_psize_defs[MMU_PAGE_1M].shift)
mmu_huge_psize = MMU_PAGE_1M; mmu_huge_psize = MMU_PAGE_1M;
/* Calculate HPAGE_SHIFT and sanity check it */ /* Calculate HPAGE_SHIFT and sanity check it */
if (mmu_psize_defs[mmu_huge_psize].shift > 16 && if (mmu_psize_defs[mmu_huge_psize].shift > MIN_HUGEPTE_SHIFT &&
mmu_psize_defs[mmu_huge_psize].shift < 28) mmu_psize_defs[mmu_huge_psize].shift < SID_SHIFT)
HPAGE_SHIFT = mmu_psize_defs[mmu_huge_psize].shift; HPAGE_SHIFT = mmu_psize_defs[mmu_huge_psize].shift;
else else
HPAGE_SHIFT = 0; /* No huge pages dude ! */ HPAGE_SHIFT = 0; /* No huge pages dude ! */
......
...@@ -212,6 +212,12 @@ static int prepare_high_area_for_htlb(struct mm_struct *mm, unsigned long area) ...@@ -212,6 +212,12 @@ static int prepare_high_area_for_htlb(struct mm_struct *mm, unsigned long area)
BUG_ON(area >= NUM_HIGH_AREAS); BUG_ON(area >= NUM_HIGH_AREAS);
/* Hack, so that each addresses is controlled by exactly one
* of the high or low area bitmaps, the first high area starts
* at 4GB, not 0 */
if (start == 0)
start = 0x100000000UL;
/* Check no VMAs are in the region */ /* Check no VMAs are in the region */
vma = find_vma(mm, start); vma = find_vma(mm, start);
if (vma && (vma->vm_start < end)) if (vma && (vma->vm_start < end))
......
...@@ -80,12 +80,17 @@ _GLOBAL(slb_miss_kernel_load_virtual) ...@@ -80,12 +80,17 @@ _GLOBAL(slb_miss_kernel_load_virtual)
BEGIN_FTR_SECTION BEGIN_FTR_SECTION
b 1f b 1f
END_FTR_SECTION_IFCLR(CPU_FTR_16M_PAGE) END_FTR_SECTION_IFCLR(CPU_FTR_16M_PAGE)
cmpldi r10,16
lhz r9,PACALOWHTLBAREAS(r13)
mr r11,r10
blt 5f
lhz r9,PACAHIGHHTLBAREAS(r13) lhz r9,PACAHIGHHTLBAREAS(r13)
srdi r11,r10,(HTLB_AREA_SHIFT-SID_SHIFT) srdi r11,r10,(HTLB_AREA_SHIFT-SID_SHIFT)
srd r9,r9,r11
lhz r11,PACALOWHTLBAREAS(r13) 5: srd r9,r9,r11
srd r11,r11,r10 andi. r9,r9,1
or. r9,r9,r11
beq 1f beq 1f
_GLOBAL(slb_miss_user_load_huge) _GLOBAL(slb_miss_user_load_huge)
li r11,0 li r11,0
......
...@@ -23,6 +23,9 @@ ...@@ -23,6 +23,9 @@
#define PMD_SIZE (1UL << PMD_SHIFT) #define PMD_SIZE (1UL << PMD_SHIFT)
#define PMD_MASK (~(PMD_SIZE-1)) #define PMD_MASK (~(PMD_SIZE-1))
/* With 4k base page size, hugepage PTEs go at the PMD level */
#define MIN_HUGEPTE_SHIFT PMD_SHIFT
/* PUD_SHIFT determines what a third-level page table entry can map */ /* PUD_SHIFT determines what a third-level page table entry can map */
#define PUD_SHIFT (PMD_SHIFT + PMD_INDEX_SIZE) #define PUD_SHIFT (PMD_SHIFT + PMD_INDEX_SIZE)
#define PUD_SIZE (1UL << PUD_SHIFT) #define PUD_SIZE (1UL << PUD_SHIFT)
......
...@@ -14,6 +14,9 @@ ...@@ -14,6 +14,9 @@
#define PTRS_PER_PMD (1 << PMD_INDEX_SIZE) #define PTRS_PER_PMD (1 << PMD_INDEX_SIZE)
#define PTRS_PER_PGD (1 << PGD_INDEX_SIZE) #define PTRS_PER_PGD (1 << PGD_INDEX_SIZE)
/* With 4k base page size, hugepage PTEs go at the PMD level */
#define MIN_HUGEPTE_SHIFT PAGE_SHIFT
/* PMD_SHIFT determines what a second-level page table entry can map */ /* PMD_SHIFT determines what a second-level page table entry can map */
#define PMD_SHIFT (PAGE_SHIFT + PTE_INDEX_SIZE) #define PMD_SHIFT (PAGE_SHIFT + PTE_INDEX_SIZE)
#define PMD_SIZE (1UL << PMD_SHIFT) #define PMD_SIZE (1UL << PMD_SHIFT)
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment