    reiserfs: don't drop PG_dirty when releasing sub-page-sized dirty file · a47a72d3
    Fengguang Wu authored
    patch c06a018f in mainline.
    
    This is not a new problem in 2.6.23-git17.  2.6.22 and 2.6.23 are buggy in
    the same way.
    
    Reiserfs could accumulate dirty sub-page-size files until umount time. 
    They cannot be synced to disk by pdflush routines or explicit `sync'
    commands.  Only `umount' can do the trick.
    
    The direct cause is: the dirty page's PG_dirty is wrongly _cleared_.
    Call trace:
    	 [<ffffffff8027e920>] cancel_dirty_page+0xd0/0xf0
    	 [<ffffffff8816d470>] :reiserfs:reiserfs_cut_from_item+0x660/0x710
    	 [<ffffffff8816d791>] :reiserfs:reiserfs_do_truncate+0x271/0x530
    	 [<ffffffff8815872d>] :reiserfs:reiserfs_truncate_file+0xfd/0x3b0
    	 [<ffffffff8815d3d0>] :reiserfs:reiserfs_file_release+0x1e0/0x340
    	 [<ffffffff802a187c>] __fput+0xcc/0x1b0
    	 [<ffffffff802a1ba6>] fput+0x16/0x20
    	 [<ffffffff8029e676>] filp_close+0x56/0x90
    	 [<ffffffff8029fe0d>] sys_close+0xad/0x110
    	 [<ffffffff8020c41e>] system_call+0x7e/0x83
    
    Fix the bug by removing the cancel_dirty_page() call.  Tests with various
    write sizes show that the removal causes no bad behavior.
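
To make the failure mode concrete, here is a toy user-space model, not kernel code. It assumes that writeback locates candidate pages through the dirty tag and then writes only those pages whose dirty flag is still set. All names (toy_page, toy_write, toy_cancel_dirty, toy_writeback) are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: one page with the two dirty indicators discussed above. */
struct toy_page {
	bool flag_dirty;	/* stands in for PG_dirty */
	bool tag_dirty;		/* stands in for PAGECACHE_TAG_DIRTY */
	bool on_disk;		/* set once the toy writeback "wrote" the page */
};

/* Models a write to the page: both indicators get set. */
static void toy_write(struct toy_page *p)
{
	p->flag_dirty = true;
	p->tag_dirty = true;
	p->on_disk = false;
}

/* Models the problematic call: the flag is cleared, the tag is left behind. */
static void toy_cancel_dirty(struct toy_page *p)
{
	p->flag_dirty = false;
}

/*
 * Models one writeback pass (assumption: lookup by tag, then a
 * test-and-clear of the flag). Returns the number of pages written.
 */
static size_t toy_writeback(struct toy_page *pages, size_t n)
{
	size_t written = 0;

	for (size_t i = 0; i < n; i++) {
		struct toy_page *p = &pages[i];

		if (!p->tag_dirty)
			continue;	/* never found by the tagged lookup */
		if (!p->flag_dirty)
			continue;	/* found, but flag already cleared: skipped */

		p->flag_dirty = false;
		p->tag_dirty = false;
		p->on_disk = true;
		written++;
	}
	return written;
}
```

In this model, a page that went through toy_write() and then toy_cancel_dirty() keeps its tag but is skipped by every toy_writeback() pass, which mirrors the "never synced until umount" symptom under the stated assumption.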
    
    === for the patient ===
    Here are more detailed demonstrations of the problem.
    
    1) the page has both PG_dirty(D)/PAGECACHE_TAG_DIRTY(d) after being written to;
       and then only PAGECACHE_TAG_DIRTY(d) remains after the file is closed.
    
    ------------------------------ screen 0 ------------------------------
    [T0] root /home/wfg# cat > /test/tiny
    [T1] hi
    [T2] root /home/wfg#
    
    ------------------------------ screen 1 ------------------------------
    [T1] root /home/wfg# echo /test/tiny > /proc/filecache
    [T1] root /home/wfg# cat /proc/filecache
         # file /test/tiny
         # flags R:referenced A:active M:mmap U:uptodate D:dirty W:writeback O:owner B:buffer d:dirty w:writeback
         # idx   len     state   refcnt
         0       1       ___UD__Bd_      2
    [T2] root /home/wfg# cat /proc/filecache
         # file /test/tiny
         # flags R:referenced A:active M:mmap U:uptodate D:dirty W:writeback O:owner B:buffer d:dirty w:writeback
         # idx   len     state   refcnt
         0       1       ___U___Bd_      2
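
The state column can be decoded mechanically. The sketch below assumes that each position of the state string corresponds to the same position in the "# flags" legend printed above (R A M U D W O B d w), with '_' meaning "not set". The helper names state_has and state_is_stranded are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Column order copied from the "# flags" legend in the output above. */
static const char filecache_legend[] = "RAMUDWOBdw";

/*
 * Returns true if `flag` is set in a state string such as "___UD__Bd_".
 * Unknown flag letters and too-short strings yield false.
 */
static bool state_has(const char *state, char flag)
{
	const char *pos = flag ? strchr(filecache_legend, flag) : NULL;
	size_t idx;

	if (pos == NULL)
		return false;
	idx = (size_t)(pos - filecache_legend);
	if (strlen(state) <= idx)
		return false;
	return state[idx] == flag;
}

/* The inconsistent state from 1): tag 'd' set while flag 'D' is clear. */
static bool state_is_stranded(const char *state)
{
	return state_has(state, 'd') && !state_has(state, 'D');
}
```

With these helpers, the [T1] state "___UD__Bd_" is consistent (both D and d set), while the [T2] state "___U___Bd_" is reported as stranded.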
    
    2) note the non-zero 'cancelled_write_bytes' after /tmp/hi is copied.
    
    ------------------------------ screen 0 ------------------------------
    [T0] root /home/wfg# echo hi > /tmp/hi
    [T1] root /home/wfg# cp /tmp/hi /dev/stdin /test
    [T2] hi
    [T3] root /home/wfg#
    
    ------------------------------ screen 1 ------------------------------
    [T1] root /proc/4397# cd /proc/`pidof cp`
    [T1] root /proc/4713# cat io
         rchar: 8396
         wchar: 3
         syscr: 20
         syscw: 1
         read_bytes: 0
         write_bytes: 20480
         cancelled_write_bytes: 4096
    [T2] root /proc/4713# cat io
         rchar: 8399
         wchar: 6
         syscr: 21
         syscw: 2
         read_bytes: 0
         write_bytes: 24576
         cancelled_write_bytes: 4096
    
    //Question: the 'write_bytes' is a bit more than expected ;-)
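
The two `cat io' snapshots can be compared field by field. The sketch below parses text in the "key: value" layout shown above and computes the difference between two snapshots; io_field and io_delta are hypothetical helper names, and the only format assumption is one "key: value" pair per line. For the snapshots above, wchar grows by 3 between [T1] and [T2], write_bytes grows by 4096, and cancelled_write_bytes stays at 4096.

```c
#include <stdio.h>
#include <string.h>

/*
 * Looks up "key: value" in text laid out like the `cat io' output above.
 * Returns 0 and stores the value on success, -1 if the key is not found.
 */
static int io_field(const char *text, const char *key, unsigned long long *out)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		/* Skip indentation in front of the key. */
		while (*line == ' ' || *line == '\t')
			line++;
		/* Require an exact key match followed by ':'. */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':')
			return sscanf(line + klen + 1, "%llu", out) == 1 ? 0 : -1;
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Stores after[key] - before[key] in *delta; returns -1 if either is missing. */
static int io_delta(const char *before, const char *after, const char *key,
		    unsigned long long *delta)
{
	unsigned long long a, b;

	if (io_field(before, key, &b) != 0 || io_field(after, key, &a) != 0)
		return -1;
	*delta = a - b;
	return 0;
}
```

A caller would read /proc/<pid>/io into a buffer twice and pass both buffers to io_delta() with a key such as "write_bytes".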
    Tested-by: Maxim Levitsky <maximlevitsky@gmail.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Jeff Mahoney <jeffm@suse.com>
    Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
    Reviewed-by: Chris Mason <chris.mason@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>