  1. 19 Sep, 2007 1 commit
    • x86-64: page faults from user mode are always user faults · dbe3ed1c
      Linus Torvalds authored
      Randy Dunlap noticed an interesting "crashme" behaviour on his dual
      Prescott Xeon setup, where he gets page faults with the error code
      having a zero "user" bit, but the register state points back to user
      mode.
      
      This may be a CPU microcode buglet triggered by some strange instruction
      pattern that crashme generates, and loading a microcode update seems to
      possibly have fixed it.
      
      Regardless, we really should trust the register state more than the
      error code, since it's really the register state that determines whether
      we can actually send a signal, or whether we're in kernel mode and need
      to oops/kill the process in the case of a page fault.
      
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
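
      A minimal user-space sketch of the decision this commit describes (not the actual kernel diff): treat the fault as a user fault when the saved register state says CPL 3, even if the error code's "user" bit is clear. The struct fake_regs type and fault_from_user() helper are illustrative stand-ins, not kernel symbols.

      /* Hedged sketch: trust the saved CS selector (register state) over the
       * page-fault error code's "user" bit when deciding how to handle a fault. */
      #include <stdbool.h>
      #include <stdio.h>

      #define PF_USER 0x4                       /* error-code bit 2: fault taken at CPL 3 */

      struct fake_regs { unsigned long cs; };   /* stand-in for struct pt_regs */

      static bool fault_from_user(const struct fake_regs *regs, unsigned long error_code)
      {
              (void)error_code;                 /* deliberately ignored: registers win */
              return (regs->cs & 3) == 3;       /* CPL 3 means the fault came from user mode */
      }

      int main(void)
      {
              struct fake_regs regs = { .cs = 0x33 };   /* a typical 64-bit user code segment */
              unsigned long error_code = 0;             /* "user" bit bogusly clear */

              printf("treat as user fault: %s\n",
                     fault_from_user(&regs, error_code) ? "yes" : "no");
              return 0;
      }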
  2. 22 Jul, 2007 4 commits
  3. 19 Jul, 2007 1 commit
    • mm: fault feedback #2 · 83c54070
      Nick Piggin authored
      This patch completes Linus's wish that the fault return codes be made into
      bit flags, which I agree makes everything nicer.  This requires
      all handle_mm_fault callers to be modified (possibly the modifications
      should go further and do things like fault accounting in handle_mm_fault --
      however that would be for another patch).
      
      [akpm@linux-foundation.org: fix alpha build]
      [akpm@linux-foundation.org: fix s390 build]
      [akpm@linux-foundation.org: fix sparc build]
      [akpm@linux-foundation.org: fix sparc64 build]
      [akpm@linux-foundation.org: fix ia64 build]
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ian Molton <spyro@f2s.com>
      Cc: Bryan Wu <bryan.wu@analog.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: Greg Ungerer <gerg@uclinux.org>
      Cc: Matthew Wilcox <willy@debian.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Richard Curnow <rc@rc0.org.uk>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
      Cc: Chris Zankel <chris@zankel.net>
      Acked-by: Kyle McMartin <kyle@mcmartin.ca>
      Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Acked-by: Ralf Baechle <ralf@linux-mips.org>
      Acked-by: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      [ Still apparently needs some ARM and PPC loving - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
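
      A small, self-contained illustration of the "bit flags" idea (the flag names match the kernel's VM_FAULT_* constants, but the values and the fake_handle_mm_fault() helper are illustrative only): callers can test individual conditions with a bitwise AND instead of comparing against mutually exclusive return values.

      #include <stdio.h>

      #define VM_FAULT_OOM    0x0001
      #define VM_FAULT_SIGBUS 0x0002
      #define VM_FAULT_MAJOR  0x0004            /* the fault needed I/O */

      static unsigned int fake_handle_mm_fault(void)
      {
              return VM_FAULT_MAJOR;            /* pretend the fault succeeded but hit disk */
      }

      int main(void)
      {
              unsigned int fault = fake_handle_mm_fault();

              if (fault & (VM_FAULT_OOM | VM_FAULT_SIGBUS))
                      printf("hard failure\n");
              if (fault & VM_FAULT_MAJOR)
                      printf("account a major fault\n");   /* e.g. tsk->maj_flt++ in a caller */
              return 0;
      }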
  4. 08 Jun, 2007 1 commit
    • enable interrupts in user path of page fault. · e5e3c84b
      Steven Rostedt authored
      This is a minor fix, but what is currently there is essentially wrong.
      In do_page_fault, if the faulting address from user code happens to be
      in kernel address space (e.g. int *p = (int *)-1; *p = 0xbed;), then the
      do_page_fault handler will jump over the local_irq_enable with the
      
        goto bad_area_nosemaphore;
      
      But the first line there sees this is user code and goes through the
      process of sending SIGSEGV to the user task. This whole
      time interrupts are disabled and the task cannot be preempted by a
      higher-priority task.
      
      This patch always enables interrupts in the user path of the
      bad_area_nosemaphore.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
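
      A user-space model of the control flow described above (local_irq_enable(), bad_area_nosemaphore and the irq state are simulated here, not the real kernel code): the point is simply that the user-mode branch of the error path re-enables interrupts before the comparatively slow signal-delivery work.

      #include <stdbool.h>
      #include <stdio.h>

      static bool irqs_enabled;

      static void local_irq_enable(void) { irqs_enabled = true; }

      static void bad_area_nosemaphore(bool user_fault)
      {
              if (user_fault) {
                      local_irq_enable();   /* the added call: don't deliver SIGSEGV with IRQs off */
                      printf("send SIGSEGV (irqs %s)\n", irqs_enabled ? "on" : "off");
                      return;
              }
              printf("kernel-mode fault: oops/kill path\n");
      }

      int main(void)
      {
              irqs_enabled = false;         /* state on entry to the fault handler */
              bad_area_nosemaphore(true);   /* user fault on a kernel address */
              return 0;
      }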
  5. 08 May, 2007 2 commits
  6. 02 May, 2007 1 commit
  7. 13 Feb, 2007 1 commit
  8. 11 Feb, 2007 1 commit
  9. 07 Dec, 2006 1 commit
  10. 29 Sep, 2006 2 commits
    • [PATCH] pidspace: is_init() · f400e198
      Sukadev Bhattiprolu authored
      This is an updated version of Eric Biederman's is_init() patch.
      (http://lkml.org/lkml/2006/2/6/280).  It applies cleanly to 2.6.18-rc3 and
      replaces a few more instances of ->pid == 1 with is_init().
      
      Further, is_init() checks pid and thus removes dependency on Eric's other
      patches for now.
      
      Eric's original description:
      
      	There are a lot of places in the kernel where we test for init
      	because we give it special properties.  Most  significantly init
      	must not die.  This results in code all over the kernel testing
      	->pid == 1.
      
      	Introduce is_init to capture this case.
      
      	With multiple pid spaces for all of the cases affected we are
      	looking for only the first process on the system, not some other
      	process that has pid == 1.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Serge Hallyn <serue@us.ibm.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: <lxc-devel@lists.sourceforge.net>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
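
      A minimal sketch of the helper being introduced (the struct task below is a stand-in for the kernel's task_struct): the "->pid == 1" test is encapsulated in one place so later pid-namespace work only has to change is_init() itself.

      #include <stdio.h>

      struct task { int pid; };                 /* stand-in for struct task_struct */

      static int is_init(const struct task *tsk)
      {
              return tsk->pid == 1;             /* later: "first process in the pid space" */
      }

      int main(void)
      {
              struct task init_task = { .pid = 1 }, other = { .pid = 4711 };

              printf("init_task: %d, other: %d\n", is_init(&init_task), is_init(&other));
              return 0;
      }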
    • [PATCH] make PROT_WRITE imply PROT_READ · df67b3da
      Jason Baron authored
      Make PROT_WRITE imply PROT_READ for a number of architectures which don't
      support write only in hardware.
      
      While looking at this, I noticed that some architectures which do not
      support write only mappings already take the exact same approach.  For
      example, in arch/alpha/mm/fault.c:
      
      "
              if (cause < 0) {
                      if (!(vma->vm_flags & VM_EXEC))
                              goto bad_area;
              } else if (!cause) {
                      /* Allow reads even for write-only mappings */
                      if (!(vma->vm_flags & (VM_READ | VM_WRITE)))
                              goto bad_area;
              } else {
                      if (!(vma->vm_flags & VM_WRITE))
                              goto bad_area;
              }
      "
      
      Thus, this patch brings other architectures which do not support write only
      mappings in-line and consistent with the rest.  I've verified the patch on
      ia64, x86_64 and x86.
      
      Additional discussion:
      
      Several architectures, including x86, can not support write-only mappings.
      The pte for x86 reserves a single bit for protection and its two states are
      read only or read/write.  Thus, write only is not supported in h/w.
      
      Currently, if I 'mmap' a page write-only, the first read attempt on that page
      creates a page fault and will SEGV.  That check is enforced in
      arch/blah/mm/fault.c.  However, if I first write to that page it will fault in
      and the pte will be set to read/write.  Thus, any subsequent reads of the page
      will succeed.  It is this inconsistency in behavior that this patch is
      attempting to address.  Furthermore, if the page is swapped out and then
      brought back, the first read will also cause a SEGV.  Thus, any arbitrary read
      of a page can potentially result in a SEGV.
      
      According to the SuSv3 spec, "if the application requests only PROT_WRITE, the
      implementation may also allow read access."  Also, as mentioned, some
      architectures, such as alpha (shown above), already take the approach that I am
      suggesting.
      
      The counter-argument to this, raised by Arjan, is that the kernel is enforcing
      the write-only mapping as best it can given the h/w limitations.  This is
      true; however, Alan Cox and I would argue that the inconsistency in
      behavior, i.e. that applications can sometimes work and sometimes fail, is
      highly undesirable.  If you read through the thread, I think people came to an
      agreement on the last patch I posted, as nobody has objected to it...
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Andi Kleen <ak@muc.de>
      Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Acked-by: Paul Mundt <lethal@linux-sh.org>
      Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
      Cc: Ian Molton <spyro@f2s.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
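
      A runnable demonstration of the inconsistency being addressed (behavior is architecture and kernel-version dependent, so treat it as an experiment rather than a guaranteed result): writing to a PROT_WRITE-only anonymous mapping faults it in read/write on x86, after which reads succeed, while reading a never-written write-only page first may SIGSEGV on kernels without this patch.

      #include <stdio.h>
      #include <sys/mman.h>

      int main(void)
      {
              size_t len = 4096;
              volatile char *p = mmap(NULL, len, PROT_WRITE,
                                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
              if (p == MAP_FAILED) {
                      perror("mmap");
                      return 1;
              }

              p[0] = 'x';                              /* write fault: PTE becomes read/write on x86 */
              printf("read after write: %c\n", p[0]);  /* now succeeds */

              /* Reading a different, never-written write-only page first may
               * instead SIGSEGV on kernels without this change. */
              munmap((void *)p, len);
              return 0;
      }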
  11. 26 Sep, 2006 3 commits
    • [PATCH] Standardize pxx_page macros · 46a82b2d
      Dave McCracken authored
      One of the changes necessary for shared page tables is to standardize the
      pxx_page macros.  pte_page and pmd_page have always returned the struct
      page associated with their entry, while pte_page_kernel and pmd_page_kernel
      have returned the kernel virtual address.  pud_page and pgd_page, on the
      other hand, return the kernel virtual address.
      
      Shared page tables need pud_page and pgd_page to return the actual page
      structures.  There are very few actual users of these functions, so it is
      simple to standardize their usage.
      
      Since this is basic cleanup, I am submitting these changes as a standalone
      patch.  Per Hugh Dickins' comments about it, I am also changing the
      pxx_page_kernel macros to pxx_page_vaddr to clarify their meaning.
      Signed-off-by: Dave McCracken <dmccr@us.ibm.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
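
      A toy sketch of the naming convention after this cleanup (the types, values and fake_pages array are illustrative, not the kernel's): pxx_page() yields the struct page behind an entry, while pxx_page_vaddr() (formerly pxx_page_kernel()) yields the kernel virtual address.

      #include <stdint.h>
      #include <stdio.h>

      struct page { int dummy; };
      typedef struct { uint64_t val; } pmd_t;

      static struct page fake_pages[1];

      /* pxx_page(): the struct page backing the entry */
      #define pmd_page(pmd)        (&fake_pages[0])
      /* pxx_page_vaddr(): the kernel virtual address (was pxx_page_kernel) */
      #define pmd_page_vaddr(pmd)  ((unsigned long)((pmd).val & ~0xfffUL))

      int main(void)
      {
              pmd_t pmd = { .val = 0x12345678 };

              printf("page=%p vaddr=%#lx\n", (void *)pmd_page(pmd), pmd_page_vaddr(pmd));
              return 0;
      }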
    • [PATCH] make fault notifier unconditional and export it · 273819a2
      Andi Kleen authored
      It's needed for external debuggers and overhead is very small.
      
      Also make the actual notifier chain they use static.
      
      Cc: jbeulich@novell.com
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] Add sparse annotations to quiet sparse in arch/x86_64/mm/fault.c · dd2994f6
      Andi Kleen authored
      Fixes
      
      linux/arch/x86_64/mm/fault.c:125:7: warning: incorrect type in argument 1 (different address spaces)
      linux/arch/x86_64/mm/fault.c:125:7:    expected void [noderef] *<noident><asn:1>
      linux/arch/x86_64/mm/fault.c:125:7:    got unsigned char *[assigned] instr
      linux/arch/x86_64/mm/fault.c:163:8: warning: incorrect type in argument 1 (different address spaces)
      linux/arch/x86_64/mm/fault.c:163:8:    expected void [noderef] *<noident><asn:1>
      linux/arch/x86_64/mm/fault.c:163:8:    got unsigned char *[assigned] instr
      linux/arch/x86_64/mm/fault.c:179:9: warning: incorrect type in argument 1 (different address spaces)
      linux/arch/x86_64/mm/fault.c:179:9:    expected void [noderef] *<noident><asn:1>
      linux/arch/x86_64/mm/fault.c:179:9:    got unsigned long *<noident>
      Signed-off-by: Andi Kleen <ak@suse.de>
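
      A sketch of the kind of annotation such a fix adds (probe_user_byte() is a made-up example function, not the code in fault.c): under sparse, __user marks pointers into user address space so mixing them with plain kernel pointers is flagged; without sparse the attribute expands to nothing and this compiles as ordinary C.

      #ifdef __CHECKER__
      # define __user __attribute__((noderef, address_space(1)))
      #else
      # define __user
      #endif

      #include <stdio.h>

      /* e.g. probing an instruction byte at a user-space address */
      static int probe_user_byte(const void __user *uaddr, unsigned char *out)
      {
              (void)uaddr;
              *out = 0x90;          /* pretend one byte was copied from user space */
              return 0;
      }

      int main(void)
      {
              unsigned char insn;
              const void __user *ip = (const void __user *)0x400000UL;

              probe_user_byte(ip, &insn);
              printf("insn byte: %#x\n", insn);
              return 0;
      }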
  12. 03 Jul, 2006 1 commit
  13. 30 Jun, 2006 2 commits
  14. 26 Jun, 2006 3 commits
    • [PATCH] x86_64: enlarge window for stack growth · 03fdc2c2
      Chuck Ebbert authored
      Allow stack growth so the 'enter' instruction works.  Also
      fixes a problem in compat_sys_kexec_load() which could allocate
      more than 128 bytes using compat_alloc_user_space().
      Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: Get rid of pud_offset_k / __pud_offset_k · d2ae5b5f
      Andi Kleen authored
      pud_offset_k() is equivalent to pud_offset() now.  Pointed out by Jan Beulich.
      Similarly for __pud_offset_ok, which needs a small change in the callers.
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Notify page fault call chain for x86_64 · 1bd858a5
      Anil S Keshavamurthy authored
      Currently in the do_page_fault() code path, we call notify_die(DIE_PAGE_FAULT,
      ...) to signal the page fault.  Since notify_die() is highly overloaded, this
      page fault notification is currently sent to every component registered with
      register_die_notification(); they all share the same die_chain, so looping
      over all of the registered components is unnecessary.
      
      In order to optimize the do_page_fault() code path, this critical page fault
      notification is now moved to different call chain and the test results showed
      great improvements.
      
      And kprobes, which is interested in these notifications, now registers on
      this new call chain only when it needs to, i.e. kprobes now registers for page
      fault notification only when there are active probes, and unregisters from
      this page fault notification when no probes are active.
      
      I have incorporated all the feedback given by Ananth and Keith and everyone,
      and thanks for all the review feedback.
      
      This patch:
      
      Overloading page fault notification onto notify_die() has performance
      issues (since the only components interested in page faults are kprobes and/or
      kdb), and hence this patch introduces a new notifier call chain exclusively
      for page fault notifications, thereby avoiding notifying unnecessary
      components in the do_page_fault() code path.
      Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
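
      A user-space model of the design (the pf_register/pf_unregister/notify_page_fault names are invented for this sketch; the kernel uses its notifier-chain API): page-fault consumers register on a dedicated list, and a consumer such as kprobes only stays registered while it actually has active probes.

      #include <stdio.h>

      typedef int (*notifier_fn)(unsigned long addr);

      #define MAX_NOTIFIERS 4
      static notifier_fn pf_chain[MAX_NOTIFIERS];

      static void pf_register(notifier_fn fn)
      {
              for (int i = 0; i < MAX_NOTIFIERS; i++)
                      if (!pf_chain[i]) { pf_chain[i] = fn; return; }
      }

      static void pf_unregister(notifier_fn fn)
      {
              for (int i = 0; i < MAX_NOTIFIERS; i++)
                      if (pf_chain[i] == fn)
                              pf_chain[i] = NULL;
      }

      static void notify_page_fault(unsigned long addr)
      {
              for (int i = 0; i < MAX_NOTIFIERS; i++)    /* only page-fault consumers here,  */
                      if (pf_chain[i])                   /* not every die-notifier user      */
                              pf_chain[i](addr);
      }

      static int kprobe_pf_handler(unsigned long addr)
      {
              printf("kprobes saw fault at %#lx\n", addr);
              return 0;
      }

      int main(void)
      {
              notify_page_fault(0x1000);           /* no probes armed: nobody is called */
              pf_register(kprobe_pf_handler);      /* first probe armed */
              notify_page_fault(0x2000);
              pf_unregister(kprobe_pf_handler);    /* last probe removed */
              return 0;
      }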
  15. 31 Mar, 2006 1 commit
    • [PATCH] Don't pass boot parameters to argv_init[] · 9b41046c
      OGAWA Hirofumi authored
      The boot cmdline is parsed in parse_early_param() and
      parse_args(,unknown_bootoption).
      
      And __setup() is used in obsolete_checksetup().
      
      	start_kernel()
      		-> parse_args()
      			-> unknown_bootoption()
      				-> obsolete_checksetup()
      
      If __setup()'s callback (->setup_func()) returns 1 in
      obsolete_checksetup(), obsolete_checksetup() thinks a parameter was
      handled.
      
      If ->setup_func() returns 0, obsolete_checksetup() tries the other
      ->setup_func()s.  If all ->setup_func()s that matched a parameter return 0,
      the parameter is added to argv_init[].
      
      Then, when running /sbin/init or init=app, argv_init[] is passed to the app.
      If the app doesn't ignore those arguments, it will warn and exit.
      
      This patch fixes a wrong usage of it, but fixes only the obvious one.
      Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
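
      A user-space model of the rule this patch relies on (handled_foo/leaky_bar and the argv handling are simulated, not the real init code): a __setup() handler that returns 1 consumes its boot parameter, while one that returns 0 lets the parameter fall through into init's argv.

      #include <stdio.h>

      static int handled_foo(char *arg)     /* e.g. __setup("foo=", handled_foo) */
      {
              printf("kernel handled foo=%s\n", arg);
              return 1;                     /* consumed: will NOT be passed to init */
      }

      static int leaky_bar(char *arg)       /* the buggy style this patch fixes */
      {
              printf("kernel handled bar=%s\n", arg);
              return 0;                     /* not consumed: leaks into argv_init[] */
      }

      int main(void)
      {
              const char *argv_init[4] = { "/sbin/init" };
              int argc = 1;

              if (!handled_foo("1"))
                      argv_init[argc++] = "foo=1";
              if (!leaky_bar("2"))
                      argv_init[argc++] = "bar=2";    /* init now sees an unexpected argument */

              for (int i = 0; i < argc; i++)
                      printf("argv[%d] = %s\n", i, argv_init[i]);
              return 0;
      }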
  16. 25 Mar, 2006 2 commits
    • [PATCH] x86_64: prefetch the mmap_sem in the fault path · a9ba9a3b
      Arjan van de Ven authored
      In a micro-benchmark that stresses the pagefault path, the down_read_trylock
      on the mmap_sem showed up quite high on the profile. Turns out this lock is
      bouncing between cpus quite a bit and thus is cache-cold a lot. This patch
      prefetches the lock (for write) as early as possible (and before some other
      somewhat expensive operations). With this patch, the down_read_trylock
      basically fell out of the top of the profile.
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
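
      A rough user-space sketch of the idea (pthread_rwlock_t stands in for the mmap_sem rw-semaphore, and __builtin_prefetch for the kernel's prefetchw()): issue a prefetch-for-write of the lock's cache line at the very top of the fault path so it is warm by the time down_read_trylock() touches it.

      #include <pthread.h>
      #include <stdio.h>

      struct fake_mm { pthread_rwlock_t mmap_sem; };

      static int do_page_fault(struct fake_mm *mm, unsigned long address)
      {
              /* First thing: start pulling the (often cache-cold, bouncing) lock line
               * toward this CPU; the '1' asks for a prefetch with intent to write. */
              __builtin_prefetch(&mm->mmap_sem, 1);

              /* ... exception-table lookup, vmalloc-fault checks, etc. ... */

              if (pthread_rwlock_tryrdlock(&mm->mmap_sem) != 0)
                      return -1;                     /* the real handler falls back to a slow path */
              printf("handling fault at %#lx\n", address);
              pthread_rwlock_unlock(&mm->mmap_sem);
              return 0;
      }

      int main(void)
      {
              struct fake_mm mm;

              pthread_rwlock_init(&mm.mmap_sem, NULL);
              do_page_fault(&mm, 0xdeadbeef);
              pthread_rwlock_destroy(&mm.mmap_sem);
              return 0;
      }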
    • [PATCH] x86_64: actively synchronize vmalloc area when registering certain callbacks · 8c914cb7
      Jan Beulich authored
      While the modular aspect of the respective i386 patch doesn't apply to
      x86-64 (as the top level page directory entry is shared between modules
      and the base kernel), handlers registered with register_die_notifier()
      are still under similar constraints for touching ioremap()ed or
      vmalloc()ed memory. The likelihood of this problem becoming visible is
      of course significantly lower, as the assigned virtual addresses would
      have to cross a 2**39 byte boundary. This is because the callback gets
      invoked
      (a) in the page fault path before the top level page table propagation
      gets carried out (hence a fault to propagate the top level page table
      entry/entries mapping the module's code/data would nest infinitely) and
      (b) in the NMI path, where nested faults must absolutely not happen,
      since otherwise the IRET from the nested fault re-enables NMIs,
      potentially resulting in nested NMI occurrences.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  17. 05 Feb, 2006 1 commit
  18. 12 Jan, 2006 4 commits
  19. 15 Nov, 2005 1 commit
    • [PATCH] x86_64: Remove CONFIG_CHECKING and add command line option for pagefault tracing · 9e43e1b7
      Andi Kleen authored
      CONFIG_CHECKING covered some debugging code used in the early times
      of the port. But it wasn't even SMP safe for quite some time
      and the bugs it checked for seem to be gone.
      
      This patch removes all the code to verify GS at kernel entry. There
      haven't been any new bugs in this area for a long time.
      
      Previously it also covered the sysctl for the page fault tracing.
      That didn't make much sense because that code was unconditionally
      compiled in. I made that a boot option now because it is typically
      only useful at boot.
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  20. 12 Sep, 2005 1 commit
  21. 07 Sep, 2005 1 commit
  22. 20 Aug, 2005 1 commit
  23. 04 Aug, 2005 1 commit
  24. 29 Jul, 2005 1 commit
  25. 23 Jun, 2005 1 commit
  26. 22 Jun, 2005 1 commit
    • [PATCH] x86_64: TASK_SIZE fixes for compatibility mode processes · 84929801
      Suresh Siddha authored
      The appended patch will set up the compatibility mode TASK_SIZE properly.  This
      will fix at least three known bugs that can be encountered while running
      compatibility mode apps.
      
      a) A malicious 32bit app can have an elf section at 0xffffe000.  During
         exec of this app, we will have a memory leak as insert_vm_struct() is
         not checking for return value in syscall32_setup_pages() and thus not
         freeing the vma allocated for the vsyscall page.  And instead of exec
         failing (as it has addresses > TASK_SIZE), we were allowing it to
         succeed previously.
      
      b) With a 32bit app, hugetlb_get_unmapped_area/arch_get_unmapped_area
         may return addresses beyond 32bits, ultimately causing corruption
         because of wrap-around and resulting in SEGFAULT, instead of returning
         ENOMEM.
      
      c) A 32bit app doing the mmap below will now fail.
      
        mmap((void *)(0xFFFFE000UL), 0x10000UL, PROT_READ|PROT_WRITE,
      	MAP_FIXED|MAP_PRIVATE|MAP_ANON, 0, 0);
      Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
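
      A minimal sketch of the idea behind the fix (the constants and the is_compat flag below are illustrative stand-ins for the kernel's per-task test): a compat (32-bit) task must see a 32-bit TASK_SIZE so that address-validity checks and get_unmapped_area() never hand out addresses above the 32-bit range.

      #include <stdbool.h>
      #include <stdio.h>

      #define IA32_TASK_SIZE 0xFFFFe000UL           /* illustrative 32-bit user limit */
      #define TASK_SIZE64    0x0000800000000000UL   /* illustrative 64-bit user limit */

      static unsigned long task_size(bool is_compat)
      {
              return is_compat ? IA32_TASK_SIZE : TASK_SIZE64;
      }

      int main(void)
      {
              unsigned long addr = 0x100000000UL;    /* 4GB: fine for 64-bit, not for compat */

              printf("64-bit task: %s\n", addr < task_size(false) ? "ok" : "reject");
              printf("compat task: %s\n", addr < task_size(true)  ? "ok" : "reject");
              return 0;
      }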