  1. 25 Nov, 2008 1 commit
  2. 06 Jan, 2009 2 commits
  3. 22 Nov, 2008 1 commit
  4. 05 Nov, 2008 2 commits
  5. 06 Jan, 2009 1 commit
  6. 04 Nov, 2008 1 commit
  7. 04 Jan, 2009 1 commit
  8. 17 Dec, 2008 1 commit
  9. 26 Nov, 2008 1 commit
    • jbd2: improve jbd2 fsync batching · e07f7183
      Josef Bacik authored
      This patch removes the static sleep time in favor of a more
      self-optimizing approach where we measure the average amount of time
      it takes to commit a transaction to disk and the amount of time a
      transaction has been running.  If somebody does a sync write or an
      fsync(), traditionally we would sleep for 1 jiffy, which depending on
      the value of HZ could be a significant amount of time compared to how
      long it takes to commit a transaction to the underlying storage.  With
      this patch, instead of sleeping for a jiffy, we check to see if the
      amount of time this transaction has been running is less than the
      average commit time, and if it is we sleep for the delta using
      schedule_hrtimeout to give us a higher precision sleep time.  This
      greatly benefits high end storage, where you could otherwise end up
      sleeping for longer than it takes to commit the transaction and
      therefore sitting idle instead of allowing the transaction to be
      committed.  Keeping the sleep time to a minimum ensures that you are
      always doing something.
      Signed-off-by: Josef Bacik <jbacik@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
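The batching rule described in this commit message can be sketched in userspace C as follows. This is only an illustration, not the jbd2 code: the function names, the nanosecond units, and the 3:1 weighting of the running average are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the running average commit time and the age
 * of the current transaction (both in nanoseconds), return how long a
 * sync writer should sleep before forcing the commit.
 */
static uint64_t batch_sleep_ns(uint64_t avg_commit_ns, uint64_t trans_age_ns)
{
    /* The transaction is already as old as an average commit: no wait. */
    if (trans_age_ns >= avg_commit_ns)
        return 0;
    /* Otherwise sleep only for the remaining delta. */
    return avg_commit_ns - trans_age_ns;
}

/*
 * Hypothetical running average, updated in place after each commit.
 * The 3:1 weighting is an assumption chosen for illustration.
 */
static void update_avg_commit_ns(uint64_t *avg_ns, uint64_t sample_ns)
{
    assert(avg_ns != NULL);
    if (*avg_ns == 0)
        *avg_ns = sample_ns;                   /* first sample seeds the average */
    else
        *avg_ns = (sample_ns + 3 * *avg_ns) / 4;
}
```

With an average commit time of 1000 ns, a transaction that has been open for 400 ns would wait another 600 ns, while one that has been open for 1000 ns or more would commit immediately.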
  10. 06 Jan, 2009 3 commits
    • ext4: Don't overwrite allocation_context ac_status · 032115fc
      Aneesh Kumar K.V authored
      We can call ext4_mb_check_limits even after successfully allocating
      the requested blocks.  In that case, make sure we don't overwrite
      ac_status if it already has the status AC_STATUS_FOUND.  This fixes
      the lockdep warning:
      
      =============================================
      [ INFO: possible recursive locking detected ]
      2.6.28-rc6-autokern1 #1
      ---------------------------------------------
      fsstress/11948 is trying to acquire lock:
       (&meta_group_info[i]->alloc_sem){----}, at: [<c04d9a49>] ext4_mb_load_buddy+0x9f/0x278
      .....
      
      stack backtrace:
      .....
       [<c04db974>] ext4_mb_regular_allocator+0xbb5/0xd44
      .....
      
      but task is already holding lock:
       (&meta_group_info[i]->alloc_sem){----}, at: [<c04d9a49>] ext4_mb_load_buddy+0x9f/0x278
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
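The guard described above, leaving the status alone once blocks have been found, can be sketched like this. The structure, its field names, and the numeric status values are assumptions for the example; only AC_STATUS_FOUND is named in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Status values; the numbers are assumptions for this sketch. */
enum ac_status {
    AC_STATUS_CONTINUE = 1,
    AC_STATUS_FOUND    = 2,
    AC_STATUS_BREAK    = 3
};

/* Hypothetical, heavily reduced allocation context. */
struct alloc_ctx {
    enum ac_status status;
    int found;        /* candidate extents examined so far */
    int max_to_scan;  /* scan limit */
};

/* Sketch of a limit check that must not clobber a successful allocation. */
static void check_limits(struct alloc_ctx *ac)
{
    assert(ac != NULL);

    /* Allocation already succeeded: keep AC_STATUS_FOUND untouched. */
    if (ac->status == AC_STATUS_FOUND)
        return;

    /* Scanned too many candidates: tell the caller to stop searching. */
    if (ac->found > ac->max_to_scan)
        ac->status = AC_STATUS_BREAK;
}
```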
    • ext4: remove extraneous newlines from calls to ext4_error() and ext4_warning() · fde4d95a
      Theodore Ts'o authored
      This removes annoying blank syslog entries emitted by ext4_error() or
      ext4_warning(), since these functions add their own newline.
      Signed-off-by: Nick Warne <nick@ukfsn.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • jbd2: Add barrier not supported test to journal_wait_on_commit_record · fd98496f
      Theodore Ts'o authored
      Xen doesn't report that barriers are not supported until buffer I/O is
      reported as completed, instead of when the buffer I/O is submitted.
      Add a check and a fallback codepath to journal_wait_on_commit_record()
      to detect this case, so that attempts to mount ext4 filesystems on
      LVM/devicemapper devices on Xen guests don't blow up with an "Aborting
      journal on device XXX"; "Remounting filesystem read-only" error.
      
      Thanks to Andreas Sundstrom for reporting this issue.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
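The fallback idea in this commit, noticing "barriers not supported" only when the I/O completes and then retrying without a barrier, might look roughly like the following userspace sketch. The types, the callback, and the toy device are all hypothetical; the real code works on buffer heads inside jbd2.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical journal state: only the barrier setting matters here. */
struct journal {
    bool use_barrier;
};

/* Hypothetical callback: submits the commit record, waits, returns status. */
typedef int (*submit_fn)(bool barrier);

/* Toy device that, like the Xen case above, rejects barriers at completion. */
static int xen_like_device(bool barrier)
{
    return barrier ? -EOPNOTSUPP : 0;
}

/* Write the commit record, falling back to a barrier-less write if needed. */
static int write_commit_record(struct journal *j, submit_fn submit_and_wait)
{
    int ret;

    assert(j != NULL && submit_and_wait != NULL);
    ret = submit_and_wait(j->use_barrier);
    if (ret == -EOPNOTSUPP && j->use_barrier) {
        /* Remember that barriers do not work, then retry without one. */
        j->use_barrier = false;
        ret = submit_and_wait(false);
    }
    return ret;
}
```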
  11. 07 Jan, 2009 1 commit
    • ext4: Allow ext4 to run without a journal · 0390131b
      Frank Mayhar authored
      A few weeks ago I posted a patch for discussion that allowed ext4 to run
      without a journal.  Since that time I've integrated the excellent
      comments from Andreas and fixed several serious bugs.  We're currently
      running with this patch and generating some performance numbers against
      both ext2 (with backported reservations code) and ext4 with and without
      a journal.  It just so happens that running without a journal is
      slightly faster for most everything.
      
      We did
      	iozone -T -t 4 -s 2g -r 256k -I -i0 -i1 -i2
      
      which creates 4 threads, each of which creates and does reads and writes on
      a 2G file, with a buffer size of 256K, using O_DIRECT for all file opens
      to bypass the page cache.  Results:
      
                           ext2        ext4, default   ext4, no journal
        initial writes   13.0 MB/s        15.4 MB/s          15.7 MB/s
        rewrites         13.1 MB/s        15.6 MB/s          15.9 MB/s
        reads            15.2 MB/s        16.9 MB/s          17.2 MB/s
        re-reads         15.3 MB/s        16.9 MB/s          17.2 MB/s
        random readers    5.6 MB/s         5.6 MB/s           5.7 MB/s
        random writers    5.1 MB/s         5.3 MB/s           5.4 MB/s 
      
      So it seems that, so far, this was a useful exercise.
      Signed-off-by: Frank Mayhar <fmayhar@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  12. 17 Dec, 2008 1 commit
  13. 27 Nov, 2008 1 commit
  14. 26 Nov, 2008 1 commit
  15. 25 Nov, 2008 2 commits
  16. 06 Jan, 2009 2 commits
  17. 05 Nov, 2008 1 commit
    • ext4: tone down ext4_da_writepages warnings · 2a21e37e
      Theodore Ts'o authored
      If the filesystem has errors, ext4_da_writepages() will return a *lot*
      of errors, including lots and lots of stack dumps.  While it's true
      that we are dropping user data on the floor, which is unfortunate, the
      stack dumps aren't helpful, and they tend to obscure the true original
      root cause of the problem.  So in the case where the filesystem has
      aborted, return an EROFS right away.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
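The early return described above can be illustrated with a small sketch. The structure and function here are hypothetical stand-ins, not the ext4 writeback path; the point is only that an aborted filesystem fails fast with EROFS instead of attempting the write and logging errors.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical filesystem state: set once the filesystem has aborted. */
struct fs_state {
    bool aborted;
};

/* Sketch of a writeback entry point that bails out on an aborted fs. */
static int write_pages(const struct fs_state *fs, int nr_pages, int *written)
{
    assert(fs != NULL && written != NULL);
    *written = 0;

    /* Filesystem already aborted: report EROFS right away, no stack dumps. */
    if (fs->aborted)
        return -EROFS;

    /* Normal path: pretend every page was written. */
    *written = nr_pages;
    return 0;
}
```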
  18. 12 Dec, 2008 1 commit
    • ext4: remove do_blk_alloc() · 97df5d15
      Theodore Ts'o authored
      The convenience function do_blk_alloc() is a static function with only
      one caller, so fold it into ext4_new_meta_blocks() to simplify the
      code and to make it easier to understand.
      
      To save more stack space, if count is a null pointer in
      ext4_new_meta_blocks(), assume that the caller wanted a single block (and
      if there is an error, no blocks were allocated).
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
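The "null count means one block" convention can be sketched with a toy allocator. Everything here (the bump allocator, the block limit, the function name and signature) is an assumption made for illustration and is not the ext4 interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOTAL_BLOCKS 8UL           /* toy device size, an assumption */

static unsigned long next_free;    /* toy bump-allocator state */

/*
 * Allocate *count contiguous blocks, or a single block when count is NULL.
 * Returns the first block number; on failure sets *errp and allocates nothing.
 */
static unsigned long new_meta_blocks(unsigned long *count, int *errp)
{
    unsigned long want = count ? *count : 1;   /* NULL means one block */
    unsigned long first;

    assert(errp != NULL);
    if (want == 0 || want > TOTAL_BLOCKS - next_free) {
        *errp = -ENOSPC;
        if (count)
            *count = 0;            /* on error, no blocks were allocated */
        return 0;
    }
    first = next_free;
    next_free += want;
    *errp = 0;
    return first;
}
```

A caller that needs exactly one block can pass NULL and avoid keeping a counter variable on its own stack.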
  19. 07 Dec, 2008 1 commit
    • ext4: remove ext4_new_meta_block() · cfe82c85
      Theodore Ts'o authored
      There were only two callers of the function ext4_new_meta_block(),
      which was just a very simple wrapper function around
      ext4_new_meta_blocks().  Change those two callers to call
      ext4_new_meta_blocks() directly, to save code and stack space usage.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  20. 02 Jan, 2009 1 commit
  21. 06 Jan, 2009 1 commit
  22. 06 Dec, 2008 1 commit
  23. 28 Oct, 2008 2 commits
  24. 29 Oct, 2008 1 commit
  25. 05 Jan, 2009 1 commit
  26. 04 Jan, 2009 8 commits
    • rtc: add alarm/update irq interfaces · 099e6576
      Alessandro Zummo authored
      Add standard interfaces for enabling alarm/update irqs.  Drivers are no
      longer required to implement equivalent ioctl code, as rtc-dev will
      provide it.
      
      UIE emulation should now be handled correctly and will work even for those
      RTC drivers that cannot be configured to do both UIE and AIE.
      Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
      Cc: David Brownell <david-b@pacbell.net>
      Cc: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fs: symlink write_begin allocation context fix · 54566b2c
      Nick Piggin authored
      With the write_begin/write_end aops, page_symlink was broken because it
      could no longer pass a GFP_NOFS type mask into the point where the
      allocations happened.  They are done in write_begin, which would always
      assume that the filesystem can be entered from reclaim.  This bug could
      cause filesystem deadlocks.
      
      The funny thing with having a gfp_t mask there is that it doesn't really
      allow the caller to arbitrarily tinker with the context in which it can be
      called.  It couldn't ever be GFP_ATOMIC, for example, because it needs to
      take the page lock.  The only thing any callers care about is __GFP_FS
      anyway, so turn that into a single flag.
      
      Add a new flag for write_begin, AOP_FLAG_NOFS.  Filesystems can now act on
      this flag in their write_begin function.  Change __grab_cache_page to
      accept a nofs argument as well, to honour that flag (while we're there,
      change the name to grab_cache_page_write_begin which is more instructive
      and does away with random leading underscores).
      
      This is really a more flexible way to go in the end anyway -- if a
      filesystem happens to want any extra allocations aside from the pagecache
      ones in its write_begin function, it may now use GFP_KERNEL (rather than
      GFP_NOFS) for common case allocations (e.g. ocfs2_alloc_write_ctxt, for a
      random example).
      
      [kosaki.motohiro@jp.fujitsu.com: fix ubifs]
      [kosaki.motohiro@jp.fujitsu.com: fix fuse]
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: <stable@kernel.org>		[2.6.28.x]
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      [ Cleaned up the calling convention: just pass in the AOP flags
        untouched to the grab_cache_page_write_begin() function.  That
        just simplifies everybody, and may even allow future expansion of the
        logic.   - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
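How a write_begin path can honour a "no filesystem re-entry" flag is sketched below. The bit values and the SKETCH_-prefixed names are placeholders chosen for this example (the real kernel constants differ), and the mapping structure is reduced to a single field.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values only; the real kernel constants differ. */
#define SKETCH_GFP_WAIT      0x1u
#define SKETCH_GFP_IO        0x2u
#define SKETCH_GFP_FS        0x4u   /* allocation may re-enter the filesystem */
#define SKETCH_AOP_FLAG_NOFS 0x1u   /* caller cannot tolerate fs re-entry */

/* Hypothetical, reduced address-space object carrying a default gfp mask. */
struct mapping {
    unsigned int gfp_mask;
};

/* Derive the allocation mask for a page grabbed during write_begin. */
static unsigned int write_begin_gfp(const struct mapping *mapping,
                                    unsigned int aop_flags)
{
    unsigned int gfp_notmask = 0;

    assert(mapping != NULL);
    /* The only context bit callers care about is "may enter the fs". */
    if (aop_flags & SKETCH_AOP_FLAG_NOFS)
        gfp_notmask = SKETCH_GFP_FS;

    return mapping->gfp_mask & ~gfp_notmask;
}
```

Passing the flag through untouched and masking at the allocation site keeps the caller from having to build a full gfp mask of its own.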
    • viafb: fix crashes due to 4k stack overflow · e687d691
      Bruno Prémont authored
      The function viafb_cursor() uses 2 stack variables of CURSOR_SIZE bits;
      CURSOR_SIZE is defined as (8 * 1024).  Using 1k of stack twice is too
      much for 4k stacks (though it works with 8k stacks).  Make those two
      variables kzalloc'ed to preserve stack space.
      
      Also merge the whole lot of local structs in viafb_ioctl into a union so
      the stack usage gets minimized here as well (the structs are only
      accessed in their individual IOCTL case).  This second part is only
      compile-tested as I know of no userspace app using the IOCTLs.
      Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
      Cc: <JosephChan@via.com.tw>
      Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
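The stack arithmetic in this message works out as follows: 8 * 1024 bits is 1024 bytes per buffer, so two buffers take 2 KiB, half of a 4 KiB stack. A userspace sketch of the fix, with calloc() standing in for kzalloc() and a made-up function that combines two masks, might look like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CURSOR_SIZE  (8 * 1024)          /* in bits, as in the commit message */
#define CURSOR_BYTES (CURSOR_SIZE / 8)   /* 1024 bytes per bitmap */

/*
 * Hypothetical cursor builder: both scratch bitmaps live on the heap
 * instead of the stack.  Returns 0 on success, -1 on failure.
 */
static int build_cursor(unsigned char *out, size_t out_len)
{
    unsigned char *xor_mask, *and_mask;
    int ret = -1;

    assert(out != NULL);
    if (out_len < CURSOR_BYTES)
        return -1;

    /* calloc() zero-fills, playing the role of kzalloc() in this sketch. */
    xor_mask = calloc(1, CURSOR_BYTES);
    and_mask = calloc(1, CURSOR_BYTES);
    if (!xor_mask || !and_mask)
        goto out;

    memset(and_mask, 0xff, CURSOR_BYTES);
    for (size_t i = 0; i < CURSOR_BYTES; i++)
        out[i] = xor_mask[i] ^ and_mask[i];
    ret = 0;
out:
    free(and_mask);                      /* free(NULL) is a no-op */
    free(xor_mask);
    return ret;
}
```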
    • fs: introduce bgl_lock_ptr() · c644f0e4
      Pekka Enberg authored
      As suggested by Andreas Dilger, introduce a bgl_lock_ptr() helper in
      <linux/blockgroup_lock.h> and add separate sb_bgl_lock() helpers to
      filesystem specific header files to break the hidden dependency on
      struct ext[234]_sb_info.
      
      Also, while at it, convert the macros to static inlines to try to make up
      for all the times I broke Andrew Morton's tree.
      Acked-by: Andreas Dilger <adilger@sun.com>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
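The split between a generic helper and a per-filesystem wrapper can be sketched as below. The lock type is a placeholder integer rather than a spinlock, the lock count is an assumed power of two, and "toyfs" is a made-up filesystem standing in for the ext[234] superblock info structures.

```c
#include <assert.h>
#include <stddef.h>

#define NR_BG_LOCKS 128            /* assumption: must be a power of two */

/* Placeholder lock; a kernel version would hold a spinlock here. */
struct bgl_lock {
    int locked;
};

struct blockgroup_lock {
    struct bgl_lock locks[NR_BG_LOCKS];
};

/* Generic helper: hash a block group number onto one of the locks. */
static inline struct bgl_lock *
bgl_lock_ptr(struct blockgroup_lock *bgl, unsigned int block_group)
{
    assert(bgl != NULL);
    return &bgl->locks[block_group & (NR_BG_LOCKS - 1)];
}

/* Hypothetical filesystem-private superblock info. */
struct toyfs_sb_info {
    struct blockgroup_lock *s_blockgroup_lock;
};

/* Filesystem-specific wrapper, kept in the filesystem's own header. */
static inline struct bgl_lock *
sb_bgl_lock(struct toyfs_sb_info *sbi, unsigned int block_group)
{
    return bgl_lock_ptr(sbi->s_blockgroup_lock, block_group);
}
```

Because the generic helper takes the lock table directly, the shared header no longer needs to know the layout of any filesystem's superblock info, and the inline functions get type checking that macros would not.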
    • spi.h uses/needs device.h · 0a30c5ce
      Randy Dunlap authored
      Include header files as used/needed:
      
        In file included from drivers/leds/leds-dac124s085.c:16:
        include/linux/spi/spi.h:66: error: field 'dev' has incomplete type
        include/linux/spi/spi.h: In function 'to_spi_device':
        include/linux/spi/spi.h:100: warning: type defaults to 'int' in declaration of '__mptr'
        ...
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: David Brownell <dbrownell@users.sourceforge.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • vmalloc.c: fix flushing in vmap_page_range() · 2e4e27c7
      Adam Lackorzynski authored
      The flush_cache_vmap() in vmap_page_range() is called with the end of the
      range as both its start and end arguments.  The following patch fixes
      this for me.
      Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
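This class of bug arises when a loop advances the address variable up to the end of the range and the same variable is then reused as the start of the flush. A minimal sketch of the corrected pattern, using made-up function names and a recording stub in place of a real cache flush:

```c
#include <assert.h>

#define PAGE_SZ 4096UL

/* Stub that records the last flushed range instead of touching caches. */
static unsigned long flushed_start, flushed_end;

static void flush_range(unsigned long start, unsigned long end)
{
    flushed_start = start;
    flushed_end = end;
}

/* Map [addr, end) page by page, then flush the whole range once. */
static int map_page_range(unsigned long addr, unsigned long end)
{
    unsigned long start = addr;    /* remember where the range begins */
    int nr = 0;

    assert(addr <= end);
    while (addr < end) {
        nr++;                      /* stand-in for mapping one page */
        addr += PAGE_SZ;
    }

    /* addr has reached end by now; flushing (addr, end) would cover nothing. */
    flush_range(start, end);
    return nr;
}
```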
    • cgroups: fix a race between cgroup_clone and umount · 7b574b7b
      Li Zefan authored
      The race occurs when cgroup_clone() is called while the ns cgroup subsys
      is being unmounted: cgroup_clone() might access an invalid cgroup_fs, or
      kill_sb() might be called after cgroup_clone() has created a new dir in it.
      
      The BUG I triggered is BUG_ON(root->number_of_cgroups != 1);
      
        ------------[ cut here ]------------
        kernel BUG at kernel/cgroup.c:1093!
        invalid opcode: 0000 [#1] SMP
        ...
        Process umount (pid: 5177, ti=e411e000 task=e40c4670 task.ti=e411e000)
        ...
        Call Trace:
         [<c0493df7>] ? deactivate_super+0x3f/0x51
         [<c04a3600>] ? mntput_no_expire+0xb3/0xdd
         [<c04a3ab2>] ? sys_umount+0x265/0x2ac
         [<c04a3b06>] ? sys_oldumount+0xd/0xf
         [<c0403911>] ? sysenter_do_call+0x12/0x31
        ...
        EIP: [<c0456e76>] cgroup_kill_sb+0x23/0xe0 SS:ESP 0068:e411ef2c
        ---[ end trace c766c1be3bf944ac ]---
      
      Cc: Serge E. Hallyn <serue@us.ibm.com>
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Paul Menage <menage@google.com>
      Cc: "Serge E. Hallyn" <serue@us.ibm.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • audit: validate comparison operations, store them in sane form · 5af75d8d
      Al Viro authored
      Don't store the field->op in the messy (and very inconvenient for e.g.
      audit_comparator()) form; translate to dense set of values and do full
      validation of userland-submitted value while we are at it.
      
      ->audit_init_rule() and ->audit_match_rule() get new values now; in-tree
      instances updated.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
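Translating a sparse, userland-supplied operator encoding into a dense set of values, with validation at the same time, can be sketched as follows. The enum, the RAW_ bit values, and the function names are assumptions for the example and are not claimed to match the audit subsystem's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Dense internal operator set; OP_BAD marks an invalid encoding. */
enum cmp_op { OP_EQUAL, OP_NOT_EQUAL, OP_LT, OP_LE, OP_GT, OP_GE, OP_BAD };

/* Sparse bit-flag encoding as submitted by userland (assumed values). */
#define RAW_LT 0x10000000u
#define RAW_GT 0x20000000u
#define RAW_EQ 0x40000000u

static const unsigned int raw_ops[OP_BAD] = {
    [OP_EQUAL]     = RAW_EQ,
    [OP_NOT_EQUAL] = RAW_LT | RAW_GT,
    [OP_LT]        = RAW_LT,
    [OP_LE]        = RAW_LT | RAW_EQ,
    [OP_GT]        = RAW_GT,
    [OP_GE]        = RAW_GT | RAW_EQ,
};

/* Translate and validate: anything not in the table becomes OP_BAD. */
static enum cmp_op to_op(unsigned int raw)
{
    int n;

    for (n = OP_EQUAL; n < OP_BAD; n++)
        if (raw_ops[n] == raw)
            break;
    return (enum cmp_op)n;
}

/* Comparator over the dense form; callers must pass a validated op. */
static int compare(unsigned int left, enum cmp_op op, unsigned int right)
{
    switch (op) {
    case OP_EQUAL:     return left == right;
    case OP_NOT_EQUAL: return left != right;
    case OP_LT:        return left <  right;
    case OP_LE:        return left <= right;
    case OP_GT:        return left >  right;
    case OP_GE:        return left >= right;
    default:
        assert(!"unvalidated operator");   /* rejected earlier by to_op() */
        return 0;
    }
}
```

Validating once, when the rule is submitted, lets the comparator use a plain switch over a small contiguous range instead of decoding bit flags on every comparison.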