1. 02 Mar, 2010 1 commit
    • ext4: Fix fencepost error in choosing group vs file preallocation. · cc483f10
      Tao Ma authored
      The ext4 multiblock allocator decides whether to use group or file
      preallocation based on the file size.  When the file size reaches
      s_mb_stream_request (default is 16 blocks), it changes to use a
      file-specific preallocation. This is cool, but it has a tiny problem.
      
      See a simple script:
      mkfs.ext4 -b 1024 /dev/sda8 1000000
      mount -t ext4 -o nodelalloc /dev/sda8 /mnt/ext4
      for((i=0;i<5;i++))
      do
      cat /mnt/4096>>/mnt/ext4/a	#4096 is a file with 4096 characters.
      cat /mnt/4096>>/mnt/ext4/b
      done
      debuge4fs -R 'stat a' /dev/sda8|grep BLOCKS -A 1
      
      And you get
      BLOCKS:
      (0-14):8705-8719, (15):2356, (16-19):8465-8468
      
      So there are 3 extents, which is a bit strange for the lonely 15th
      logical block.  When we write the 16th block, we choose file
      preallocation in ext4_mb_group_or_file, but in
      ext4_mb_normalize_request we hit the 16*1024 range, so no
      preallocation is carried out.  File b then reserves the space after
      '2356', so when we write block 16, we start from another part.
      
      This patch just changes the check in ext4_mb_group_or_file, so
      that for the lonely block 15 we will still use group preallocation.
      After the patch, we get:
      debuge4fs -R 'stat a' /dev/sda8|grep BLOCKS -A 1
      BLOCKS:
      (0-15):8705-8720, (16-19):8465-8468
      
      Looks more sane. Thanks.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
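The fencepost described above can be modelled in a few lines of user-space C. This is only a sketch under assumptions: the helper names and the `DEMO_STREAM_REQUEST_BLOCKS` constant are hypothetical, and the comparison operators are a simplified reading of the commit message rather than a copy of ext4_mb_group_or_file. It shows why a file of exactly 16 blocks was sent to file preallocation before the change and stays with group preallocation afterwards.

```c
#include <stdbool.h>
#include <stdint.h>

/* Default s_mb_stream_request according to the commit message (16 blocks).
 * The constant name is hypothetical. */
#define DEMO_STREAM_REQUEST_BLOCKS 16u

/* Modelled behaviour before the fix: file preallocation is chosen as soon
 * as the size reaches the threshold. */
static bool demo_use_file_prealloc_old(uint64_t size_blocks)
{
    return size_blocks >= DEMO_STREAM_REQUEST_BLOCKS;
}

/* Modelled behaviour after the fix: the boundary case (exactly 16 blocks)
 * still uses group preallocation. */
static bool demo_use_file_prealloc_new(uint64_t size_blocks)
{
    return size_blocks > DEMO_STREAM_REQUEST_BLOCKS;
}
```

With this model, a size of 16 blocks (the write that creates logical block 15) is the only input on which the two predicates disagree, which matches the single stray extent in the first debuge4fs output.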
  2. 24 Feb, 2010 1 commit
  3. 02 Mar, 2010 1 commit
  4. 24 Feb, 2010 2 commits
  5. 16 Feb, 2010 1 commit
    • ext4: Fix BUG_ON at fs/buffer.c:652 in no journal mode · 73b50c1c
      Curt Wohlgemuth authored
      Calls to ext4_handle_dirty_metadata should only pass in an inode
      pointer for inode-specific metadata, and not for shared metadata
      blocks such as inode table blocks, block group descriptors, the
      superblock, etc.
      
      The BUG_ON can get tripped when updating a special device (such as a
      block device) that is opened (so that i_mapping is set in
      fs/block_dev.c) and the file system is mounted in no journal mode.
      
      Addresses-Google-Bug: #2404870
      Signed-off-by: Curt Wohlgemuth <curtw@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  6. 15 Feb, 2010 1 commit
  7. 04 Mar, 2010 1 commit
  8. 15 Feb, 2010 2 commits
  9. 25 Jan, 2010 1 commit
  10. 24 Jan, 2010 1 commit
    • ext4: Use bitops to read/modify EXT4_I(inode)->i_state · 19f5fb7a
      Theodore Ts'o authored
      At several places we modify EXT4_I(inode)->i_state without holding
      i_mutex (ext4_release_file, ext4_bmap, ext4_journalled_writepage,
      ext4_do_update_inode, ...). These modifications are racy and we can
      lose updates to i_state. So convert handling of i_state to use bitops
      which are atomic.
      
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
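The race described above comes from plain read-modify-write updates of a flags word. As a rough user-space analogy, the sketch below uses C11 atomics in place of kernel bitops such as set_bit(), clear_bit() and test_bit(). The struct and helper names are hypothetical and are not the names used in the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-inode state word. */
struct demo_inode_info {
    atomic_ulong state_flags;
};

/* Atomically set one state bit; concurrent updates to other bits are kept. */
static void demo_set_state(struct demo_inode_info *ei, unsigned int bit)
{
    atomic_fetch_or(&ei->state_flags, 1UL << bit);
}

/* Atomically clear one state bit. */
static void demo_clear_state(struct demo_inode_info *ei, unsigned int bit)
{
    atomic_fetch_and(&ei->state_flags, ~(1UL << bit));
}

/* Read one state bit. */
static bool demo_test_state(struct demo_inode_info *ei, unsigned int bit)
{
    return (atomic_load(&ei->state_flags) >> bit) & 1UL;
}
```

A non-atomic `flags |= mask` compiles to a separate load and store, so two threads updating different bits without a common lock can overwrite each other's change; the fetch-or and fetch-and forms avoid that without taking a lock.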
  11. 07 Dec, 2009 1 commit
    • ext4: Use slab allocator for sub-page sized allocations · d2eecb03
      Theodore Ts'o authored
      Now that SLUB seems to be fixed so that it respects the requested
      alignment, use kmem_cache_alloc() for the allocation if the block
      size of the buffer heads to be allocated is less than the page size.
      Previously, we were using a 16k page on a Power system for each
      buffer, even when the file system was using a 1k or 4k block size.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  12. 01 Jan, 2010 1 commit
  13. 23 Dec, 2009 1 commit
  14. 22 Jan, 2010 1 commit
    • ext4: Add block validity check when truncating indirect block mapped inodes · 1f2acb60
      Theodore Ts'o authored
      Add checks to ext4_free_branches() to make sure a block number found
      in an indirect block is valid before trying to free it.  If a bad
      block number is found, stop freeing the indirect block immediately,
      since the file system is corrupt and we will need to run fsck anyway.
      This also avoids spamming the logs, and specifically avoids
      driver-level "attempt to access beyond end of device" errors that
      obscure what is really going on.
      
      If you get *really*, *really*, *really* unlucky, without this patch, a
      supposed indirect block containing garbage might contain a reference
      to a primary block group descriptor, in which case
      ext4_free_branches() could end up zeroing out a block group
      descriptor block.  If one of the block bitmaps for a block group
      described by that bg descriptor block is then not in memory, it is
      read in by ext4_read_block_bitmap().  This function calls
      ext4_valid_block_bitmap(), which assumes that bg_inode_table() was
      validated at mount time and hasn't been modified since.  Since this
      assumption is no longer valid, it's possible for the value
      (ext4_inode_table(sb, desc) - group_first_block) to go negative, which
      will cause ext4_find_next_zero_bit() to trigger a kernel GPF.
      
      Addresses-Google-Bug: #2220436
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
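The "validate, then free, and stop at the first bad entry" pattern from this commit can be illustrated with a small user-space model. Everything here is an assumption made for illustration: `struct demo_fs`, `demo_block_valid()` and `demo_free_branch()` are hypothetical names, and the validity test is a plain range check, which is simpler than whatever the kernel actually verifies.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, minimal description of a file system's block range. */
struct demo_fs {
    uint32_t first_data_block;
    uint32_t blocks_count;
};

/* Simplified validity test: the block must lie inside the file system. */
static bool demo_block_valid(const struct demo_fs *fs, uint32_t blk)
{
    return blk >= fs->first_data_block && blk < fs->blocks_count;
}

/* Walk the entries of one (modelled) indirect block.  Returns how many
 * entries would have been freed, and sets *corrupt if a bad block number
 * stopped the walk early. */
static size_t demo_free_branch(const struct demo_fs *fs,
                               const uint32_t *entries, size_t n,
                               bool *corrupt)
{
    size_t freed = 0;

    *corrupt = false;
    for (size_t i = 0; i < n; i++) {
        if (entries[i] == 0)
            continue;               /* hole: nothing to free */
        if (!demo_block_valid(fs, entries[i])) {
            *corrupt = true;        /* stop immediately, fsck is needed */
            break;
        }
        freed++;                    /* a real implementation frees here */
    }
    return freed;
}
```

Stopping at the first invalid entry, rather than skipping it and continuing, follows the reasoning in the message: once one entry is garbage the rest of the block cannot be trusted either.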
  15. 16 Feb, 2010 1 commit
    • ext4: Fix optional-arg mount options · 15121c18
      Eric Sandeen authored
      We have 2 mount options, "barrier" and "auto_da_alloc" which may or
      may not take a 1/0 argument.  This causes the ext4 superblock mount
      code to subtract uninitialized pointers and pass the result to
      kmalloc, which results in very noisy failures.
      
      Per Ted's suggestion, initialize the args struct so that
      we know whether match_token() found an argument for the
      option, and skip match_int() if not.
      
      Also, return error (0) from parse_options if we thought
      we found an argument, but match_int() fails.
      Reported-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
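The fix pattern described above (initialize the argument descriptor, then only parse an integer if an argument was actually found) can be sketched in standalone C. This does not use the kernel's match_token() or match_int(); `struct demo_arg` and `demo_parse_optional()` are hypothetical, strtol() stands in for the integer parsing, and treating any nonzero value as "on" is an assumption. The return convention of 1 for success and 0 for error follows the message.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a matched-substring descriptor. */
struct demo_arg {
    const char *from;
    const char *to;
};

/* Parse "name" or "name=<int>".  Returns 1 on success, 0 on error. */
static int demo_parse_optional(const char *opt, const char *name, int *value)
{
    /* Initialized up front, so "no argument given" is detectable. */
    struct demo_arg arg = { NULL, NULL };
    size_t len = strlen(name);
    char *end;
    long v;

    if (strncmp(opt, name, len) != 0)
        return 0;
    if (opt[len] == '=') {
        arg.from = opt + len + 1;
        arg.to = opt + strlen(opt);
    } else if (opt[len] != '\0') {
        return 0;                   /* e.g. "barrierx" is a different option */
    }

    if (arg.from == NULL) {         /* bare option: skip integer parsing */
        *value = 1;
        return 1;
    }
    if (arg.from == arg.to)
        return 0;                   /* "name=" with an empty argument */

    errno = 0;
    v = strtol(arg.from, &end, 10);
    if (errno != 0 || end != arg.to)
        return 0;                   /* argument present but not an integer */
    *value = (v != 0);
    return 1;
}
```

For example, `demo_parse_optional("barrier", "barrier", &v)` succeeds with `v` set to 1, while `"barrier=abc"` is rejected instead of being silently accepted.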
  16. 05 Feb, 2010 1 commit
  17. 12 Feb, 2010 12 commits
  18. 11 Feb, 2010 10 commits