15 Sep, 2008 (10 commits)
      powerpc: Use sys_pause for 32-bit pause entry point · d6c93adb
      Christoph Hellwig authored
      sys32_pause is a useless copy of the generic sys_pause.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      powerpc: Make the 64-bit kernel as a position-independent executable · 549e8152
      Paul Mackerras authored
      This implements CONFIG_RELOCATABLE for 64-bit by building the kernel as
      a position-independent executable (PIE) when that option is set.  This involves
      processing the dynamic relocations in the image in the early stages of
      booting, even if the kernel is being run at the address it is linked at,
      since the linker does not necessarily fill in words in the image for
      which there are dynamic relocations.  (In fact the linker does fill in
      such words for 64-bit executables, though not for 32-bit executables,
      so in principle we could avoid calling relocate() entirely when we're
      running a 64-bit kernel at the linked address.)
      
      The dynamic relocations are processed by a new function relocate(addr),
      where the addr parameter is the virtual address where the image will be
      run.  In fact we call it twice; once before calling prom_init, and again
      when starting the main kernel.  This means that reloc_offset() returns
      0 in prom_init (since it has been relocated to the address it is running
      at), which necessitated a few adjustments.
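
      Roughly, processing the dynamic relocations amounts to the following
      (an illustrative C sketch with invented names, not the kernel's
      relocate(), which is hand-written assembly):

        #include <elf.h>
        #include <stdint.h>

        /* Apply R_PPC64_RELATIVE relocations; 'delta' is how far the image
         * is running from the address it was linked at. */
        static void apply_relas(Elf64_Rela *rela, unsigned long count,
                                unsigned long delta)
        {
                unsigned long i;

                for (i = 0; i < count; i++) {
                        if (ELF64_R_TYPE(rela[i].r_info) != R_PPC64_RELATIVE)
                                continue;
                        /* the addend holds the link-time value; the word at
                         * r_offset becomes that value plus delta */
                        *(uint64_t *)(rela[i].r_offset + delta) =
                                rela[i].r_addend + delta;
                }
        }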
      
      This also changes __va and __pa to use an equivalent definition that is
      simpler.  With the relocatable kernel, PAGE_OFFSET and MEMORY_START are
      constants (for 64-bit) whereas PHYSICAL_START is a variable (and
      KERNELBASE ideally should be too, but isn't yet).
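
      For illustration, the simpler form described here is roughly the
      following (the exact definitions in the kernel's page.h may differ
      in detail):

        /* virtual/physical conversion in terms of the two constants only */
        #define __va(x) ((void *)((unsigned long)(x) + PAGE_OFFSET - MEMORY_START))
        #define __pa(x) ((unsigned long)(x) - PAGE_OFFSET + MEMORY_START)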
      
      With this, relocatable kernels still copy themselves down to physical
      address 0 and run there.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      powerpc: Use LOAD_REG_IMMEDIATE only for constants on 64-bit · e31aa453
      Paul Mackerras authored
      Using LOAD_REG_IMMEDIATE to get the address of kernel symbols
      generates 5 instructions where LOAD_REG_ADDR can do it in one,
      and will generate R_PPC64_ADDR16_* relocations in the output when
      we get to making the kernel as a position-independent executable,
      which we'd rather not have to handle.  This changes various bits
      of assembly code to use LOAD_REG_ADDR when we need to get the
      address of a symbol, or to use suitable position-independent code
      for cases where we can't access the TOC for various reasons, or
      if we're not running at the address we were linked at.
      
      It also cleans up a few minor things; there's no reason to save and
      restore SRR0/1 around RTAS calls, __mmu_off can get the return
      address from LR more conveniently than the caller can supply it in
      R4 (and we already assume elsewhere that EA == RA if the MMU is on
      in early boot), and enable_64b_mode was using 5 instructions where
      2 would do.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      powerpc: Make it possible to move the interrupt handlers away from the kernel · 1f6a93e4
      Paul Mackerras authored
      This changes the way that the exception prologs transfer control to
      the handlers in 64-bit kernels with the aim of making it possible to
      have the prologs separate from the main body of the kernel.  Now,
      instead of computing the address of the handler by taking the top
      32 bits of the paca address (to get the 0xc0000000........ part) and
      ORing in something in the bottom 16 bits, we get the base address of
      the kernel by doing a load from the paca and adding an offset.
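
      Sketched in C with invented names (the real code is in the exception
      prolog asm macros), the change in the address computation is roughly:

        /* old scheme: keep the top 32 bits of the paca address (the
         * 0xc0000000... part) and OR in a low 16-bit handler value */
        unsigned long old_handler(unsigned long paca, unsigned long low16)
        {
                return (paca & 0xffffffff00000000UL) | low16;
        }

        /* new scheme: load the kernel base address from the paca and
         * add the handler's offset from that base */
        unsigned long new_handler(unsigned long kernel_base, unsigned long offset)
        {
                return kernel_base + offset;
        }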
      
      This also replaces an mfmsr and an ori to compute the MSR value for
      the handler with a load from the paca.  That makes it unnecessary to
      have a separate version of EXCEPTION_PROLOG_PSERIES that forces 64-bit
      mode.
      
      We can no longer use direct branches in the exception prolog code,
      which means that the SLB miss handlers can't branch directly to
      .slb_miss_realmode any more.  Instead we have to compute the address
      and do an indirect branch.  This is conditional on CONFIG_RELOCATABLE;
      for non-relocatable kernels we use a direct branch as before.  (A later
      change will allow CONFIG_RELOCATABLE to be set on 64-bit powerpc.)
      
      Since the secondary CPUs on pSeries start execution in the first 0x100
      bytes of real memory and then have to get to wherever the kernel is,
      we can't use a direct branch to get there.  Instead this changes
      __secondary_hold_spinloop from a flag to a function pointer.  When it
      is set to a non-NULL value, the secondary CPUs jump to the function
      pointed to by that value.
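
      The idea, as an illustrative C sketch (the real spinloop is assembly
      in head_64.S, and __secondary_hold_spinloop there is a plain
      doubleword rather than a C function pointer):

        /* illustration only, not the head_64.S code */
        extern void (*volatile __secondary_hold_spinloop)(void);

        static void secondary_hold(void)
        {
                void (*entry)(void);

                /* spin until the boot CPU publishes an entry point... */
                while (!(entry = __secondary_hold_spinloop))
                        ;
                /* ...then jump to it, wherever the kernel ended up */
                entry();
        }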
      
      Finally this eliminates one code difference between 32-bit and 64-bit
      by making __secondary_hold be the text address of the secondary CPU
      spinloop rather than a function descriptor for it.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      powerpc: Rearrange head_64.S to move interrupt handler code to the beginning · 9a955167
      Paul Mackerras authored
      This rearranges head_64.S so that we have all the first-level exception
      prologs together starting at 0x100, followed by all the second-level
      handlers that are invoked from the first-level prologs, followed by
      other code.  This doesn't make any functional change but will make
      following changes for relocatable kernel support easier.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      powerpc: Add support for dynamic reconfiguration memory in kexec/kdump kernels · cf00085d
      Chandru authored
      The kdump kernel needs to use only those memory regions that it is
      allowed to use (crashkernel, rtas, tce, etc.).  Each of these regions
      has its own size and is currently added under the 'linux,usable-memory'
      property under each memory@xxx node of the device tree.
      
      The ibm,dynamic-memory property of the ibm,dynamic-reconfiguration-memory
      node (on POWER6) now stores the representation of most of the logical
      memory blocks, with the size of each memory block being a constant
      (lmb_size).  If one or more of the above-mentioned regions (or parts
      of them) lie within one of the LMBs described by the ibm,dynamic-memory
      property, those regions need to be identified within the given LMB.
      
      This makes the kernel recognize a new 'linux,drconf-usable-memory'
      property added by kexec-tools.  Each entry in this property is of the
      form of a count followed by that many (base, size) pairs for the above
      mentioned regions.  The number of cells in the count value is given by
      the #size-cells property of the root node.
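
      An illustrative sketch of walking one such entry (helper names and
      types are invented here; this is not the kernel's parsing code):

        #include <stdint.h>

        /* read 'ncells' 32-bit device-tree cells as one value (cells are
         * assumed to be in host byte order for this sketch) */
        static uint64_t read_cells(const uint32_t **p, int ncells)
        {
                uint64_t v = 0;

                while (ncells--)
                        v = (v << 32) | *(*p)++;
                return v;
        }

        /* one LMB's slice of linux,drconf-usable-memory: a count (whose
         * width comes from the root node's #size-cells) followed by that
         * many (base, size) pairs of usable ranges within the LMB */
        static void walk_usable_ranges(const uint32_t **usm,
                                       int addr_cells, int size_cells)
        {
                uint64_t n = read_cells(usm, size_cells);

                while (n--) {
                        uint64_t base = read_cells(usm, addr_cells);
                        uint64_t size = read_cells(usm, size_cells);

                        /* treat [base, base + size) as usable memory */
                        (void)base;
                        (void)size;
                }
        }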
      Signed-off-by: Chandru Siddalingappa <chandru@in.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      powerpc: Check rc of notifier chain for memory remove · 525c411d
      Nathan Fontenot authored
      The return code from invocation of the notifier for
      pSeries_reconfig_chain during update of the device tree is not
      checked.  This causes writes to /proc/ppc64/ofdt to update memory
      properties (i.e. ibm,dynamic-reconfiguration-memory) to always
      return success, instead of the result of the notifier chain.
      
      This happens specifically when we remove/add memory from the
      device tree on machines using memory specified in the
      ibm,dynamic-reconfiguration-memory property of the device tree.
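
      The general shape of the fix, sketched with placeholder names (not
      necessarily the exact code):

        #include <linux/notifier.h>

        static int notify_reconfig(struct blocking_notifier_head *chain,
                                   unsigned long action, void *node)
        {
                int rc = blocking_notifier_call_chain(chain, action, node);

                /* propagate the chain's verdict instead of dropping it:
                 * NOTIFY_OK/NOTIFY_DONE become 0, NOTIFY_BAD an error */
                return notifier_to_errno(rc);
        }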
      Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      powerpc: New copy_4K_page() · 57dda6ef
      Mark Nelson authored
      This new copy_4K_page() function was originally tuned for the best
      performance on the Cell processor, but after testing on more 64-bit
      powerpc chips it was found that with a small modification it either
      matched the performance offered by the current mainline version or
      bettered it by a small amount.
      
      It was found that on a Cell-based QS22 blade the amount of system
      time measured when compiling a 2.6.26 pseries_defconfig decreased
      by 4%. Using the same test, a 4-way 970MP machine saw a decrease of
      2% in system time. No noticeable change was seen on Power4, Power5
      or Power6.
      
      The 4096 byte page is copied in thirty-two 128 byte strides. An
      initial setup loop executes dcbt instructions for the whole source
      page and dcbz instructions for the whole destination page. To do
      this, the cache line size is retrieved from ppc64_caches.
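
      A rough C rendering of that setup idea (the real copy_4K_page is
      hand-written assembly; the helper below is purely illustrative and
      only builds for powerpc):

        /* issue dcbt for every source cache line and dcbz for every
         * destination cache line of the 4K page before copying */
        static void warm_caches(void *dst, const void *src,
                                unsigned long line_size)
        {
                unsigned long off;

                for (off = 0; off < 4096; off += line_size) {
                        /* dcbt: touch the source line into the cache */
                        asm volatile("dcbt 0,%0"
                                     : : "r"((const char *)src + off));
                        /* dcbz: establish the destination line as zeros
                         * without first fetching it from memory */
                        asm volatile("dcbz 0,%0"
                                     : : "r"((char *)dst + off) : "memory");
                }
        }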
      
      A new CPU feature bit, CPU_FTR_CP_USE_DCBTZ (introduced in the
      previous patch), is used to make the modification to this new copy
      routine: on Power4, 970 and Cell the feature bit is set, so the
      setup loop is executed, but on all other 64-bit chips the setup
      loop is nop'ed out.
      Signed-off-by: Mark Nelson <markn@au1.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      powerpc: Add new CPU feature: CPU_FTR_CP_USE_DCBTZ · 2a929436
      Mark Nelson authored
      Add a new CPU feature bit, CPU_FTR_CP_USE_DCBTZ, to be set on the
      64-bit powerpc chips that benefit from having dcbt and dcbz
      instructions used in their memory copy routines.
      
      This will be used in a subsequent patch that updates copy_4K_page().
      The new bit is added to Cell, PPC970 and Power4 because they show
      better performance with the new copy_4K_page() when dcbt and dcbz
      instructions are used.
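
      From C the new bit can be tested with cpu_has_feature(); the copy
      routine itself applies it in assembly so the setup loop can be
      nop'ed out on other chips. A purely illustrative fragment
      (dcbt_dcbz_setup is a made-up helper):

        #include <asm/cputable.h>

        /* hypothetical helper, declared only for the example */
        extern void dcbt_dcbz_setup(void *dst, const void *src);

        static void copy_page_prep(void *dst, const void *src)
        {
                /* run the cache-priming loop only on CPUs with the bit set */
                if (cpu_has_feature(CPU_FTR_CP_USE_DCBTZ))
                        dcbt_dcbz_setup(dst, src);
        }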
      Signed-off-by: Mark Nelson <markn@au1.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      powerpc: Fix duplicate test of MACIO_FLAG_SCCB_ON · 1b3c83e6
      roel kluin authored
      Evidently MACIO_FLAG_SCCA_ON was meant.
      Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Paul Mackerras <paulus@samba.org>