Commits · f8514083cd61daef12fba5ef883ad9352c450428 · linux / linux-davinci

05 Jun, 2009 2 commits

ext4: truncate the file properly if we fail to copy data from userspace · f8514083

Aneesh Kumar K.V authored Jun 05, 2009

In generic_perform_write(), if we fail to copy the user data we don't
update inode->i_size.  We should truncate the file in that case so
that we don't have blocks allocated outside inode->i_size.  Add the
inode to the orphan list in the same transaction as the block
allocation.  This ensures that if we crash in between, recovery will
do the truncate.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC:  Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f8514083

ext4: Avoid leaking blocks after a block allocation failure · 1938a150

Aneesh Kumar K.V authored Jun 05, 2009

We should add the inode to the orphan list in the same transaction
as the block allocation.  This ensures that if we crash after a failed
block allocation and before we do a vmtruncate, we don't leak blocks
(i.e., blocks marked as used in the bitmap but not claimed by the inode).
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC:  Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1938a150

04 Jun, 2009 1 commit

ext4: Change all super.c messages to print the device · b31e1552

Eric Sandeen authored Jun 04, 2009

This patch changes ext4 super.c to include the device name with all
warning/error messages, by using a new utility function ext4_msg.
It's a rather large patch, but very mechanical. I left debug printks
alone.

This is a straightforward port of a patch which Andi Kleen did for
ext3.

Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

b31e1552

09 Jun, 2009 1 commit

ext4: Get rid of EXTEND_DISKSIZE flag of ext4_get_blocks_handle() · 03f5d8bc

Jan Kara authored Jun 09, 2009

Get rid of EXTEND_DISKSIZE flag of ext4_get_blocks_handle(). This
seems to be a relic from some old days and setting disksize in this
function does not make much sense.  Currently it was set only by
ext4_getblk().  Since the parameter has some effect only if create ==
1, it is easy to check by grepping through the sources that the three
callers which end up calling ext4_getblk() with create == 1
(ext4_append, ext4_quota_write, ext4_mkdir) do the right thing and set
disksize themselves.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

03f5d8bc

03 Jun, 2009 1 commit

ext4: super.c whitespace cleanup · 0b8e58a1

Andreas Dilger authored Jun 03, 2009

Cleanup of whitespace and formatting.  Initially driven by confusing indents
for the ext4_{block,inode}_bitmap() et al. helper routines, but figured I'd
cleanup some other 80-column wrapping and other indenting problems at the
same time.
Signed-off-by: Andreas Dilger <adilger@sun.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0b8e58a1

09 Jun, 2009 1 commit

jbd2: Fix minor typos in comments in fs/jbd2/journal.c · bfcd3555

Alberto Bertogli authored Jun 09, 2009

Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

bfcd3555

25 May, 2009 2 commits

ext4: Clean up calls to ext4_get_group_desc() · 88b6edd1

Theodore Ts'o authored May 25, 2009

If the caller isn't planning on modifying the block group descriptors,
there's no need to pass in a pointer to a struct buffer_head. Nuking
this saves a tiny amount of CPU time and stack space usage.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

88b6edd1
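
The optional-pointer pattern this cleanup relies on can be sketched in user-space C. This is only an illustration: `demo_get_group_desc`, `struct demo_desc`, and `struct demo_buf` are hypothetical stand-ins, not the kernel's types or signature.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a group descriptor and its backing buffer. */
struct demo_buf  { int id; };
struct demo_desc { unsigned int free_blocks; };

static struct demo_desc demo_descs[2] = { { 100 }, { 200 } };
static struct demo_buf  demo_bufs[2]  = { { 0 }, { 1 } };

/*
 * Look up the descriptor for 'group'.  'bh' is optional: a caller that
 * only reads the descriptor passes NULL and no buffer pointer is stored.
 * Returns NULL if 'group' is out of range.
 */
static struct demo_desc *demo_get_group_desc(unsigned int group,
                                             struct demo_buf **bh)
{
    if (group >= 2)
        return NULL;
    if (bh)
        *bh = &demo_bufs[group];
    return &demo_descs[group];
}
```

A read-only caller would write `demo_get_group_desc(1, NULL)`, while a caller that intends to modify the descriptor passes the address of a `struct demo_buf *` so it can later mark that buffer dirty.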

ext4: remove unused function __ext4_write_dirty_metadata · 759d427a

Theodore Ts'o authored May 25, 2009

The __ext4_write_dirty_metadata() function was introduced by commit
0390131b, "ext4: Allow ext4 to run without a journal", but nothing
ever used the function, either then or since.  So let's remove it and
save a bit of space.

Cc: Frank Mayhar <fmayhar@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

759d427a

18 May, 2009 4 commits

ext2: Fix memory leak in ext2_fill_super() in case of a failed mount · 0f7ee7c1

Manish Katiyar authored May 17, 2009

Signed-off-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0f7ee7c1

ext3: Fix memory leak in ext3_fill_super() in case of a failed mount · de5ce037

Manish Katiyar authored May 17, 2009

Signed-off-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

de5ce037

ext4: Fix memory leak in ext4_fill_super() in case of a failed mount · f6830165

Manish Katiyar authored May 17, 2009

Signed-off-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f6830165

ext4: down i_data_sem only for read when walking tree for fiemap · 0568c518

Theodore Ts'o authored May 17, 2009

Not sure why I put this in as down_write originally; all we are
doing is walking the tree, nothing will change under us and
concurrent reads should be no problem.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0568c518

17 May, 2009 1 commit

ext4: Add a comprehensive block validity check to ext4_get_blocks() · 6fd058f7

Theodore Ts'o authored May 17, 2009

To catch filesystem bugs or corruption which could lead to the
filesystem getting severely damaged, this patch adds a facility for
tracking all of the filesystem metadata blocks by contiguous regions
in a red-black tree. This allows quick searching of the tree to
locate extents which might overlap with filesystem metadata blocks.

This facility is also used by the multi-block allocator to assure that
it is not allocating blocks out of the system zone, as well as by the
routines used when reading indirect blocks and extents information
from disk to make sure their contents are valid.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

6fd058f7
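
The overlap query described in the message above can be sketched in user-space C. Note the substitution: this sketch keeps the metadata regions in a sorted array and uses binary search rather than a red-black tree, and `struct demo_zone` and `demo_overlaps_metadata` are hypothetical names, not the kernel's.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One contiguous run of metadata blocks: [start, start + count). */
struct demo_zone {
    uint64_t start;
    uint64_t count;
};

/*
 * Return true if the block range [blk, blk + n) intersects any zone.
 * 'zones' must be sorted by 'start' and must not overlap each other;
 * 'n' is expected to be at least 1.
 */
static bool demo_overlaps_metadata(const struct demo_zone *zones,
                                   size_t nzones, uint64_t blk, uint64_t n)
{
    size_t lo = 0, hi = nzones;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const struct demo_zone *z = &zones[mid];

        if (blk + n <= z->start)
            hi = mid;        /* query ends before this zone begins */
        else if (blk >= z->start + z->count)
            lo = mid + 1;    /* query begins after this zone ends */
        else
            return true;     /* the two ranges intersect */
    }
    return false;
}
```

An allocator-style caller would reject a candidate extent whenever this returns true. A balanced tree gives the same logarithmic lookup while also allowing cheap insertion, which a sorted array does not.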

14 May, 2009 2 commits

ext4: Clean up ext4_get_blocks() so it does not depend on bh_result->b_state · 2ac3b6e0

Theodore Ts'o authored May 14, 2009

The ext4_get_blocks() function was depending on the value of
bh_result->b_state as an input parameter to decide whether or not to
update the delalloc accounting statistics by calling
ext4_da_update_reserve_space().  We now use a separate flag,
EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE, to request this update, so that
all callers of ext4_get_blocks() can clear map_bh.b_state before
calling ext4_get_blocks() without worrying about any consistency
issues.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2ac3b6e0

ext4: Merge ext4_da_get_block_write() into mpage_da_map_blocks() · 2fa3cdfb

Theodore Ts'o authored May 14, 2009

The static function ext4_da_get_block_write() was only used by
mpage_da_map_blocks().  So to simplify the code, merge that function
into mpage_da_map_blocks().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2fa3cdfb

12 May, 2009 1 commit

ext4: Add BUG_ON debugging checks to noalloc_get_block_write() · a2dc52b5

Theodore Ts'o authored May 12, 2009

Enforce that noalloc_get_block_write() is only called to map one block
at a time, and that it always is successful in finding a mapping for
a given inode's logical block number if it is called with
create == 1.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

a2dc52b5

14 May, 2009 3 commits

ext4: Add documentation to the ext4_*get_block* functions · b920c755

Theodore Ts'o authored May 14, 2009

This adds more documentation to various internal functions in
fs/ext4/inode.c, most notably ext4_ind_get_blocks(),
ext4_da_get_block_write(), ext4_da_get_block_prep(),
ext4_normal_get_block_write().

In addition, the static function ext4_normal_get_block_write() has
been renamed noalloc_get_block_write(), since it is used in many
places far beyond ext4_normal_writepage().

Plenty of warnings have been added to the noalloc_get_block_write()
function, since the way it is used is amazingly fragile.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

b920c755

ext4: Define a new set of flags for ext4_get_blocks() · c2177057

Theodore Ts'o authored May 14, 2009

The functions ext4_get_blocks(), ext4_ext_get_blocks(), and
ext4_ind_get_blocks() took an ad-hoc set of integer arguments that
served as boolean flags. Use a single flags parameter
and a standard set of bitfield flags instead. This saves space on
the call stack, and it also makes the code a bit more understandable.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

c2177057
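
The change from several integer booleans to one flags word can be sketched as follows. The `DEMO_*` flag names and `demo_decode_flags` are hypothetical and chosen for illustration; they are not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration; not the kernel's definitions. */
#define DEMO_GET_BLOCKS_CREATE           0x1
#define DEMO_GET_BLOCKS_EXTEND_DISKSIZE  0x2
#define DEMO_GET_BLOCKS_UPDATE_RESERVE   0x4

/* What the callee ultimately wants to know about the request. */
struct demo_request {
    bool create;
    bool extend_disksize;
    bool update_reserve;
};

/*
 * Old style: f(..., int create, int extend_disksize, int update_reserve).
 * New style: a single 'flags' argument, decoded with bit tests.
 */
static struct demo_request demo_decode_flags(int flags)
{
    struct demo_request req = {
        .create          = (flags & DEMO_GET_BLOCKS_CREATE) != 0,
        .extend_disksize = (flags & DEMO_GET_BLOCKS_EXTEND_DISKSIZE) != 0,
        .update_reserve  = (flags & DEMO_GET_BLOCKS_UPDATE_RESERVE) != 0,
    };
    return req;
}
```

A call site then reads `demo_decode_flags(DEMO_GET_BLOCKS_CREATE | DEMO_GET_BLOCKS_UPDATE_RESERVE)`, which names each option explicitly instead of relying on the position of a bare `1` or `0`.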

ext4: Rename ext4_get_blocks_wrap() to be ext4_get_blocks() · 12b7ac17

Theodore Ts'o authored May 14, 2009

Another function rename for clarity's sake.  The _wrap suffix simply
confuses people, and didn't add much for people trying to follow the
code paths.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

12b7ac17

12 May, 2009 2 commits

ext4: Rename ext4_get_blocks_handle() to be ext4_ind_get_blocks() · e4d996ca

Theodore Ts'o authored May 12, 2009

The static function ext4_get_blocks_handle() is badly named.  Of
*course* it takes a handle.  Since its counterpart for extent-based
file is ext4_ext_get_blocks(), rename it to be ext4_ind_get_blocks().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e4d996ca

ext4: Simplify function signature for ext4_da_get_block_write() · f888e652

Theodore Ts'o authored May 12, 2009

The function ext4_da_get_block_write() is called in exactly one place,
and the last argument, create, is always 1.  Remove it to simplify the
code slightly.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f888e652

15 May, 2009 1 commit

ext4: Fix spinlock assertions on UP systems · bc8e6740

Vincent Minet authored May 15, 2009

On UP systems without DEBUG_SPINLOCK, ext4_is_group_locked always fails
which triggers a BUG_ON() call.
This patch fixes it by using assert_spin_locked instead.
Signed-off-by: Vincent Minet <vincent@vincent-minet.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

bc8e6740

03 May, 2009 1 commit

ext4: Convert ext4_lock_group to use sb_bgl_lock · 955ce5f5

Aneesh Kumar K.V authored May 02, 2009

We have sb_bgl_lock() and the ext4_group_info.bb_state
bit spinlock to protect group information. The latter is only
used within the mballoc code. Consolidate them to use sb_bgl_lock().
This makes the mballoc.c code much simpler and also avoids
confusion from having two locks protecting the same info.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

955ce5f5

02 May, 2009 2 commits

ext4: fix the length returned by fiemap for an unallocated extent · eefd7f03

Theodore Ts'o authored May 02, 2009

If the file's blocks have not yet been allocated because of delayed
allocation, the length of the extent returned by fiemap is incorrect.
This commit fixes this bug.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

eefd7f03

ext4: fix for fiemap last-block test · c9877b20

Eric Sandeen authored May 01, 2009

Carl Henrik Lunde reported and debugged this; the test for the
last allocated block was comparing bytes to blocks in this test:

	if (logical + length - 1 == EXT_MAX_BLOCK ||
	    ext4_ext_next_allocated_block(path) == EXT_MAX_BLOCK)
		flags |= FIEMAP_EXTENT_LAST;

so any extent which ended right at 4G was stopping the extent
walk.  Just replacing these values with the extent block &
length should fix it.

Also give blksize_bits a saner type, and reverse the order 
of the tests to make the more likely case tested first.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reported-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Tested-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

c9877b20
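
The unit mismatch behind this bug can be reproduced in user space. The sketch below models only the first half of the test, assumes 4 KiB blocks and a sentinel value of 0xffffffff for EXT_MAX_BLOCK, and uses hypothetical function names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of the "no more blocks" sentinel; illustrative only. */
#define DEMO_MAX_BLOCK 0xffffffffULL

/*
 * Buggy form: the operands are byte offsets, but the sentinel is a
 * block number, so an extent ending at byte 4 GiB matches by accident.
 */
static bool demo_is_last_buggy(uint64_t logical_bytes, uint64_t length_bytes)
{
    return logical_bytes + length_bytes - 1 == DEMO_MAX_BLOCK;
}

/* Fixed form: block numbers are compared against the block sentinel. */
static bool demo_is_last_fixed(uint64_t first_block, uint64_t block_count)
{
    return first_block + block_count - 1 == DEMO_MAX_BLOCK;
}
```

For an 8-block extent that ends exactly at the 4 GiB byte offset, the byte arguments are `(1ULL << 32) - 32768` and `32768`, and the buggy test fires. The same extent expressed in blocks starts at block 1048568 with a count of 8, and the fixed test correctly does not fire.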

13 May, 2009 1 commit

vfs: Enable FS_IOC_FIEMAP and FIGETBSZ for all filetypes · 19ba0559

Aneesh Kumar K.V authored May 13, 2009

The fiemap and get_blk_size ioctls should be enabled even for
directories.  So move them outside file_ioctl.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

19ba0559

03 May, 2009 1 commit

ext4: hook fiemap operation for directories · abc8746e

Aneesh Kumar K.V authored May 02, 2009

Add a fiemap callback for directories.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

abc8746e

02 May, 2009 1 commit

ext4: Make the length of the mb_history file tunable · f4033903

Curt Wohlgemuth authored May 01, 2009

In memory-constrained systems with many partitions, the ~68K for each
partition for the mb_history buffer can be excessive.

This patch adds a new mount option, mb_history_length, as well as a
way of setting the default via a module parameter (or via a sysfs
parameter in /sys/module/ext4/parameters/default_mb_history_length).
If the mb_history_length is set to zero, the mb_history facility is
disabled entirely.
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f4033903

01 May, 2009 2 commits

ext4: Move fs/ext4/group.h into ext4.h · bb23c20a

Theodore Ts'o authored May 01, 2009

Move the function prototypes in group.h into ext4.h so they are all
defined in one place.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

bb23c20a

ext4: Move fs/ext4/namei.h into ext4.h · 596397b7

Theodore Ts'o authored May 01, 2009

The fs/ext4/namei.h header file had only a single function
declaration, and should have never been a standalone file.  Move it
into ext4.h, where it should have been from the beginning.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

596397b7

03 May, 2009 1 commit

ext4: Move the ext4_sb.h header file into ext4.h · ca0faba0

Theodore Ts'o authored May 03, 2009

There is no longer a reason for a separate ext4_sb.h header file, so
move it into ext4.h just to make life easier for developers to find
the relevant data structures and typedefs.  Should also speed up
compiles slightly, too.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ca0faba0

01 May, 2009 2 commits

ext4: Move the ext4_i.h header file into ext4.h · d444c3c3

Theodore Ts'o authored May 01, 2009

There is no longer a reason for a separate ext4_i.h header file, so
move it into ext4.h just to make life easier for developers to find
the relevant data structures and typedefs.  Should also speed up
compiles slightly, too.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

d444c3c3

ext4: Don't avoid using BLOCK_UNINIT block groups in mballoc · 75507efb

Theodore Ts'o authored May 01, 2009

By avoiding the use of not-yet-used block groups (i.e., block groups
with the BLOCK_UNINIT flag), mballoc had a tendency to create large
files with large non-contiguous gaps.  In addition avoiding the use of
new block groups had a tendency to push regular file data into the
first block group in a flex_bg group, which slows down the speed of
e2fsck pass 2, since it has a tendency to seek much more.  For
example:

               Before Patch                       After Patch
              Time in seconds                   Time in seconds
            Real /  User/  Sys   MB/s      Real /  User/  Sys    MB/s
Pass 1      8.52 / 2.21 / 0.46  20.43      8.84 / 4.97 / 1.11   19.68
Pass 2     21.16 / 1.02 / 1.86  11.30      6.54 / 1.77 / 1.78   36.39
Pass 3      0.01 / 0.00 / 0.00 139.00      0.01 / 0.01 / 0.00  128.90
Pass 4      0.16 / 0.15 / 0.00   0.00      0.17 / 0.17 / 0.00    0.00
Pass 5      2.52 / 1.99 / 0.09   0.79      2.31 / 1.78 / 0.06    0.86
Total      32.40 / 5.11 / 2.49  12.81     17.99 / 8.75 / 2.98   23.01

This was on a sample 80 gig root filesystem which was approximately
50% full.  Note the improved e2fsck pass 2 performance, by over a
factor of 3, due to a decreased number of seeks.  (The total amount of
I/O in pass 2 was unchanged; the layout of the directory blocks was
simply much better from e2fsck's perspective.)

Other changes as a result of this patch on this sample filesystem:

                             Before Patch    After Patch
# of non-contig files           762             779
# of non-contig directories     571             570
# of BLOCK_UNINIT bg's          307             293
# of INODE_UNINIT bg's          503             503

Out of 640 block groups, of which 333 were in use, this patch caused
an extra 14 block groups to be utilized.  The number of non-contiguous
files did go up slightly, but when measured against the 99.9% of the
files (603,154) which were contiguously allocated, this is pretty
insignificant.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andreas Dilger <adilger@sun.com>

75507efb

26 Apr, 2009 2 commits

ext4: Replace lock/unlock_super() with an explicit lock for resizing · 32ed5058

Theodore Ts'o authored Apr 25, 2009

Use a separate lock to protect s_groups_count and the other block
group descriptors which get changed via an on-line resize operation,
so we can stop overloading the use of lock_super().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

32ed5058

ext4: Replace lock/unlock_super() with an explicit lock for the orphan list · 3b9d4ed2

Theodore Ts'o authored Apr 25, 2009

Use a separate lock to protect the orphan list, so we can stop
overloading the use of lock_super().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

3b9d4ed2

01 May, 2009 1 commit

ext4: ext4_mark_recovery_complete() doesn't need to use lock_super · a63c9eb2

Theodore Ts'o authored May 01, 2009

The function ext4_mark_recovery_complete() is called from two call
paths: either (a) while mounting the filesystem, in which case there's
no danger of any other CPU calling write_super() until the mount is
completed, or (b) while remounting the filesystem read-write, in
which case the fs core has already locked the superblock.  This also
allows us to take out a very vile unlock_super()/lock_super() pair in
ext4_remount().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

a63c9eb2

25 Apr, 2009 1 commit

ext4: Remove outdated comment about lock_super() · 114e9fc9

Theodore Ts'o authored Apr 25, 2009

ext4_fill_super() is no longer called by read_super(), and it is no
longer called with the superblock locked.  The
unlock_super()/lock_super() is no longer present, so this comment is
entirely superfluous.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

114e9fc9

01 May, 2009 3 commits

ext4: Avoid races caused by on-line resizing and SMP memory reordering · 8df9675f

Theodore Ts'o authored May 01, 2009

Ext4's on-line resizing adds a new block group and then, only at the
last step, adjusts s_groups_count.  However, it's possible on SMP
systems that another CPU could see the updated s_groups_count and
not see the newly initialized data structures for the just-added block
group.  For this reason, it's important to insert an SMP read barrier
after reading s_groups_count and before reading (for example) any of
the new block group descriptors allowed by the increased value of
s_groups_count.

Unfortunately, we rather blatantly violate this locking protocol
documented in fs/ext4/resize.c.  Fortunately, (1) on-line resizes
happen relatively rarely, and (2) it seems rare that the filesystem
code will immediately try to use a just-added block group before any
memory ordering issues resolve themselves.  So apparently problems
here are relatively hard to hit, since ext3 has been vulnerable to the
same issue for years with no one apparently complaining.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

8df9675f
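
The publish-then-read ordering described in the message above can be modeled in user space. This sketch substitutes C11 release/acquire atomics for the kernel's barrier primitives, assumes a single resizer thread, and uses hypothetical names (`demo_add_group`, `demo_read_group`); it is not the patch's code.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_MAX_GROUPS 8

static uint32_t    demo_group_desc[DEMO_MAX_GROUPS]; /* per-group data   */
static atomic_uint demo_groups_count;                /* published count  */

/*
 * Resizer (single writer assumed): initialize the new group first, then
 * publish the larger count.  The release store orders the descriptor
 * write before the count becomes visible.  Returns -1 when full.
 */
static int demo_add_group(uint32_t desc)
{
    unsigned int n = atomic_load_explicit(&demo_groups_count,
                                          memory_order_relaxed);

    if (n >= DEMO_MAX_GROUPS)
        return -1;
    demo_group_desc[n] = desc;
    atomic_store_explicit(&demo_groups_count, n + 1, memory_order_release);
    return 0;
}

/*
 * Reader: the acquire load stands in for "read barrier after reading the
 * count".  Any group index below the value read is guaranteed to refer
 * to initialized data.  Returns -1 if 'group' is not yet published.
 */
static int demo_read_group(unsigned int group, uint32_t *out)
{
    unsigned int n = atomic_load_explicit(&demo_groups_count,
                                          memory_order_acquire);

    if (group >= n)
        return -1;
    *out = demo_group_desc[group];
    return 0;
}
```

If the acquire load were relaxed instead, a reader on a weakly ordered CPU could observe the new count and still read a stale descriptor, which is the failure mode the message warns about.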

ext4: Use separate super_operations structure for no_journal filesystems · 9ca92389

Theodore Ts'o authored May 01, 2009

By using a separate super_operations structure for filesystems that
have and don't have journals, we can simplify ext4_write_super() ---
which is only needed when no journal is present --- and ext4_freeze(),
ext4_unfreeze(), and ext4_sync_fs(), which are only needed when the
journal is present.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9ca92389
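
Selecting one of two operation tables at mount time can be sketched like this. `struct demo_super_ops` and the `demo_*` functions are hypothetical, and the split of callbacks simply mirrors the description in the message above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced operations table; unset members stay NULL. */
struct demo_super_ops {
    int (*write_super)(void);
    int (*sync_fs)(void);
    int (*freeze)(void);
    int (*unfreeze)(void);
};

static int demo_write_super(void) { return 0; }
static int demo_sync_fs(void)     { return 0; }
static int demo_freeze(void)      { return 0; }
static int demo_unfreeze(void)    { return 0; }

/* Journaled mounts: no write_super, but sync/freeze/unfreeze hooks. */
static const struct demo_super_ops demo_journal_ops = {
    .sync_fs  = demo_sync_fs,
    .freeze   = demo_freeze,
    .unfreeze = demo_unfreeze,
};

/* Journal-less mounts: only write_super is needed. */
static const struct demo_super_ops demo_nojournal_ops = {
    .write_super = demo_write_super,
};

/* Chosen once at mount time, so each callback can skip the journal check. */
static const struct demo_super_ops *demo_pick_ops(bool has_journal)
{
    return has_journal ? &demo_journal_ops : &demo_nojournal_ops;
}
```

Because the table is picked once, each callback body can assume its mode instead of testing for a journal on every call.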

ext4: Fix and simplify s_dirt handling · 7234ab2a

Theodore Ts'o authored Apr 30, 2009

The s_dirt flag wasn't completely handled correctly, but it didn't
really matter when journalling was enabled. It turns out that when
ext4 runs without a journal, we don't clear s_dirt in places where we
should have, with the result that the high-level write_super()
function was writing the superblock when it wasn't necessary.

So we fix this by making ext4_commit_super() clear the s_dirt flag,
and removing many of the other places where s_dirt is manipulated.
When journalling is enabled, the s_dirt flag might be left set more
often, but s_dirt really doesn't matter when journalling is enabled.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7234ab2a