- 10 Oct, 2008 4 commits
-
-
Frederic Bohe authored
This fixes a bug which caused on-line resizing of filesystems with a 1k blocksize to fail. The root cause of this bug was the fact that if an uninitalized bitmap block gets read in by userspace (which e2fsprogs does try to avoid, but can happen when the blocksize is less than the pagesize and an adjacent blocks is read into memory) ext4_read_block_bitmap() was erroneously depending on the buffer uptodate flag to decide whether it needed to initialize the bitmap block in memory --- i.e., to set the standard set of blocks in use by a block group (superblock, bitmaps, inode table, etc.). Essentially, ext4_read_block_bitmap() assumed it was the only routine that might try to read a block containing a block bitmap, which is simply not true. To fix this, ext4_read_block_bitmap() and ext4_read_inode_bitmap() must always initialize uninitialized bitmap blocks. Once a block or inode is allocated out of that bitmap, it will be marked as initialized in the block group descriptor, so in general this won't result any extra unnecessary work. Signed-off-by: Frederic Bohe <frederic.bohe@bull.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
With modern hard drives, reading 64k takes roughly the same time as reading a 4k block. So request readahead for adjacent inode table blocks to reduce the time it takes when iterating over directories (especially when doing this in htree sort order) in a cold cache case. With this patch, the time it takes to run "git status" on a kernel tree after flushing the caches via "echo 3 > /proc/sys/vm/drop_caches" is reduced by 21%. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Alex Tomas <bzzz@sun.com> Cc: Andreas Dilger <adilger@sun.com>
-
- 23 Sep, 2008 2 commits
-
-
Theodore Ts'o authored
Previously mballoc created a separate set of functions for each proc file. This combines the tunables into a single set of functions which gets used for all of the per-superblock proc files, saving approximately 2k of compiled object code. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
...and into the core setup/teardown code in fs/ext4/super.c so that other parts of ext4 can define tuning parameters. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 22 Sep, 2008 1 commit
-
-
Theodore Ts'o authored
This is a port of a patch from Linus which fixes a 200+ byte stack usage problem in ext4_get_parent(). It's more efficient to pass down only the actual parts of the dentry that matter: the parent inode and the name, instead of allocating a struct dentry on the stack. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 07 Oct, 2008 1 commit
-
-
Theodore Ts'o authored
This fixes some very common warnings reported by kerneloops.org Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 13 Sep, 2008 2 commits
-
-
Eric Sandeen authored
lg_prealloc_list seems to cry out for a per-cpu data structure; on a large smp system I think this should be better. I've lightly tested this change on a 4-cpu system. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
Pick an ioctl number for EXT4_IOC_MIGRATE that won't conflict with other ext4 ioctl's. Since there haven't been any major userspace users of this ioctl, we can afford to change this now, to avoid potential problems later. Also, reorder the ioctl numbers in ext4.h to avoid this sort of mistake in the future. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 09 Oct, 2008 1 commit
-
-
Aneesh Kumar K.V authored
This patch hooks the ext3 to ext4 migrate interface to EXT4_IOC_SETFLAGS ioctl. The userspace interface is via chattr +e. We only allow setting extent flags. Clearing extent flag (migrating from ext4 to ext3) is not supported. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 13 Sep, 2008 1 commit
-
-
Aneesh Kumar K.V authored
The migrate ioctl writes to the filsystem, so we need to elevate the write count. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 08 Sep, 2008 1 commit
-
-
Li Zefan authored
If there group descriptors are corrupted we need unlock the block group lock before returning from the function; else we will oops when freeing a spinlock which is still being held. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 16 Sep, 2008 1 commit
-
-
Theodore Ts'o authored
Calculate the journal device name once and stash it away in the journal_s structure. This avoids needing to call bdevname() everywhere and reduces stack usage by not needing to allocate an on-stack buffer. In addition, we eliminate the '/' that can appear in device names (e.g. "cciss/c0d0p9" --- see kernel bugzilla #11321) that can cause problems when creating proc directory names, and include the inode number to support ocfs2 which creates multiple journals with different inode numbers. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 14 Sep, 2008 1 commit
-
-
Alexey Dobriyan authored
ext4 creates per-suberblock directory in /proc/ext4/ . Name used as basis is taken from bdevname, which, surprise, can contain slash. However, proc while allowing to use proc_create("a/b", parent) form of PDE creation, assumes that parent/a was already created. bdevname in question is 'cciss/c0d0p9', directory is not created and all this stuff goes directly into /proc (which is real bug). Warning comes when _second_ partition is mounted. http://bugzilla.kernel.org/show_bug.cgi?id=11321Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 08 Sep, 2008 1 commit
-
-
Frederic Bohe authored
This fixes a bug which prevented the newly created inodes after a resize from being used on filesystems with flex_bg. Signed-off-by: Frederic Bohe <frederic.bohe@bull.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 09 Oct, 2008 1 commit
-
-
Eric Sandeen authored
Note: some people thinks this represents a security bug, since it might make the system go away while it is printing a large number of console messages, especially if a serial console is involved. Hence, it has been assigned CVE-2008-3528, but it requires that the attacker either has physical access to your machine to insert a USB disk with a corrupted filesystem image (at which point why not just hit the power button), or is otherwise able to convince the system administrator to mount an arbitrary filesystem image (at which point why not just include a setuid shell or world-writable hard disk device file or some such). Me, I think they're just being silly. --tytso Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-ext4@vger.kernel.org Cc: Eugene Teo <eugeneteo@kernel.sg>
-
- 13 Sep, 2008 2 commits
-
-
Aneesh Kumar K.V authored
With delayed allocation we use i_data_sem to update i_disksize. We need to update i_disksize only if the new size specified is greater than the current value and we need to make sure we don't race with other i_disksize update. With delayed allocation we will switch to the write_begin function for non-delayed allocation if we are low on free blocks. This means the write_begin function for non-delayed allocation also needs to use the same locking. We also need to check and update i_disksize even if the new size is less that inode.i_size because of delayed allocation. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Aneesh Kumar K.V authored
For blocksize < pagesize we need to remove blocks that got allocated in block_write_begin() if we fail with ENOSPC for later blocks. block_write_begin() internally does this if it allocated pages locally. This makes sure we don't have blocks outside inode.i_size during ENOSPC. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 09 Sep, 2008 3 commits
-
-
Aneesh Kumar K.V authored
When we truncate files, the meta-data blocks released are not reused untill we commit the truncate transaction. That means delayed get_block request will return ENOSPC even if we have free blocks left. Force a journal commit and retry block allocation if we get ENOSPC with free blocks left. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Aneesh Kumar K.V authored
Make sure we don't add the inode to the journal handle until after the block allocation, so that a journal commit will not include the inode in case of block allocation failure. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Aneesh Kumar K.V authored
We run into ENOSPC error on nonmballoc ext4, even when there is free blocks on the filesystem. The patch includes two changes: a) Set reservation to NULL if we trying to allocate near group_target_block from the goal group if the free block in the group is less than windows. This should give us a better chance to allocate near group_target_block. This also ensures that if we are not allocating near group_target_block then we don't trun off reservation. This should enable us to allocate with reservation from other groups that have large free blocks count. b) we don't need to check the window size if the block reservation is off. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 09 Oct, 2008 2 commits
-
-
Aneesh Kumar K.V authored
This patch converts some usage of ext4_fsblk_t to s64. This is needed so that some of the sign conversion works as expected in if loops. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Aneesh Kumar K.V authored
The delayed allocation code allocates blocks during writepages(), which can not handle block allocation failures. To deal with this, we switch away from delayed allocation mode when we are running low on free blocks. This also allows us to avoid needing to reserve a large number of meta-data blocks in case all of the requested blocks are discontiguous. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 10 Oct, 2008 1 commit
-
-
Aneesh Kumar K.V authored
This patch adds dirty block accounting using percpu_counters. Delayed allocation block reservation is now done by updating dirty block counter. In a later patch we switch to non delalloc mode if the filesystem free blocks is greater than 150% of total filesystem dirty blocks Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao<cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 09 Sep, 2008 1 commit
-
-
Aneesh Kumar K.V authored
During block reservation if we don't have enough blocks left, retry block reservation with smaller block counts. This makes sure we try fallocate and DIO with smaller request size and don't fail early. The delayed allocation reservation cannot try with smaller block count. So retry block reservation to handle temporary disk full conditions. Also print free blocks details if we fail block allocation during writepages. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 09 Oct, 2008 1 commit
-
-
Aneesh Kumar K.V authored
With delayed allocation we need to make sure block are reserved before we attempt to allocate them. Otherwise we get block allocation failure (ENOSPC) during writepages which cannot be handled. This would mean silent data loss (We do a printk stating data will be lost). This patch updates the DIO and fallocate code path to do block reservation before block allocation. This is needed to make sure parallel DIO and fallocate request doesn't take block out of delayed reserve space. When free blocks count go below a threshold we switch to a slow patch which looks at other CPU's accumulated percpu counter values. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 20 Aug, 2008 1 commit
-
-
Aneesh Kumar K.V authored
We are a bit agressive in invalidating all the pages. But it is ok because we really don't know why the block allocation failed and it is better to come of the writeback path so that user can look for more info. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-
- 09 Sep, 2008 3 commits
-
-
Theodore Ts'o authored
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 09 Oct, 2008 9 commits
-
-
Mingming Cao authored
percpu_counter_sum_and_set() and percpu_counter_sum() is the same except the former updates the global counter after accounting. Since we are taking the fbc->lock to calculate the precise value of the counter in percpu_counter_sum() anyway, it should simply set fbc->count too, as the percpu_counter_sum_and_set() does. This patch merges these two interfaces into one. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Linus Torvalds authored
-
Linus Torvalds authored
This is debatable, but while we're debating it, let's disallow the combination of splice and an O_APPEND destination. It's not entirely clear what the semantics of O_APPEND should be, and POSIX apparently expects pwrite() to ignore O_APPEND, for example. So we could make up any semantics we want, including the old ones. But Miklos convinced me that we should at least give it some thought, and that accepting writes at arbitrary offsets is wrong at least for IS_APPEND() files (which always have O_APPEND set, even if the reverse isn't true: you can obviously have O_APPEND set on a regular file). So disallow O_APPEND entirely for now. I doubt anybody cares, and this way we have one less gray area to worry about. Reported-and-argued-for-by: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Jens Axboe <ens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://jdelvare.pck.nerim.net/jdelvare-2.6Linus Torvalds authored
* 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: hwmon: (abituguru3) Enable DMI probing feature on Abit AT8 32X hwmon: (abituguru3) Enable reading from AUX3 fan on Abit AT8 32X hwmon: (adt7473) Fix some bogosity in documentation file hwmon: Define sysfs interface for energy consumption register hwmon: (it87) Prevent power-off on Shuttle SN68PT eeepc-laptop: Fix hwmon interface
-
git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreqLinus Torvalds authored
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] correct broken links and email addresses
-
Matt Mackall authored
This fixes the previous fix, which was completely wrong on closer inspection. This version has been manually tested with a user-space test harness and generates sane values. A nearly identical patch has been boot-tested. The problem arose from changing how kmalloc/kfree handled alignment padding without updating ksize to match. This brings it in sync. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Németh Márton authored
Replace the no longer working links and email address in the documentation and in source code. Signed-off-by: Márton Németh <nm127@freemail.hu> Signed-off-by: Dave Jones <davej@redhat.com>
-
Alistair John Strachan authored
Enable driver checking of the DMI product name (when enabled) on an Abit AT8 32X, instead of falling back to a manual probe. This eliminates false negatives and eventually will help avoid unnecessary bus probes on unsupported mainboards. Signed-off-by: Alistair John Strachan <alistair@devzero.co.uk> Tested-by: Daniel Exner <dex@dragonslave.de> Acked-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
-
Alistair John Strachan authored
The table for the Abit AT8 32X was incorrectly missing an entry for the sixth ("AUX3") fan. Add this entry, exporting the fan reading to userspace. Closes lm-sensors.org ticket #2339. Signed-off-by: Alistair John Strachan <alistair@devzero.co.uk> Tested-by: Daniel Exner <dex@dragonslave.de> Acked-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
-