  1. 04 Sep, 2009 2 commits
  2. 22 Jun, 2009 1 commit
    • ocfs2: Add lockdep annotations · cb25797d
      Jan Kara authored
      Add lockdep support to OCFS2. The support also covers all of the cluster
      locks except for open locks, journal locks, and local quotafile locks. These
      are special because they are acquired for a node, not for a particular process,
      and lockdep cannot handle that type of locking.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
  3. 03 Apr, 2009 4 commits
    • ocfs2: fix rare stale inode errors when exporting via nfs · 6ca497a8
      wengang wang authored
      For NFS exporting, ocfs2_get_dentry() returns the dentry for a file handle.
      ocfs2_get_dentry() may read from disk when the inode is not in memory,
      without taking any cross-cluster lock. This can lead to the file system
      loading a stale inode.
      
      This patch fixes that problem.
      
      When the inode is not in memory, we take the cluster lock (PR) on the
      alloc inode from which the inode in question was allocated before reading
      the inode itself. This forces the node on which the deletion was done to
      sync the alloc inode. We then check the bitmap in the group the inode was
      allocated from to see whether its bit is clear. If the bit is clear, the
      inode is stale. If the bit is set, we check the generation as the existing
      code does.
      
      We have to read the inode in question from disk first to learn its alloc
      slot and alloc bit. If it is not stale, we read it again using ocfs2_iget().
      The second read should then come from cache.
      
      We also have to add a per-superblock nfs_sync_lock to cover the lock on the
      alloc inode and the lock on the inode in question, because
      ocfs2_get_dentry() and ocfs2_delete_inode() take them in reverse order.
      nfs_sync_lock is taken in EX mode in ocfs2_get_dentry() and in PR mode in
      ocfs2_delete_inode(), so that multiple ocfs2_delete_inode() calls can run
      concurrently in the normal case.
      
      [mfasheh@suse.com: build warning fixes and comment cleanups]
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Acked-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
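The staleness test described in this commit message can be sketched as follows. This is a minimal illustration, not the kernel code: the structures, field names, and `handle_is_stale()` are hypothetical, and the cluster locking (the PR lock on the alloc inode and nfs_sync_lock) is left out entirely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the kernel's structures. */
struct group_bitmap {
    const uint8_t *bits;     /* one bit per inode slot in the allocation group */
};

struct disk_inode {
    uint32_t generation;     /* generation stored in the on-disk inode */
    unsigned int alloc_bit;  /* bit index within its allocation group */
};

/* Return true if bit n is set in the group's bitmap. */
bool bit_is_set(const struct group_bitmap *g, unsigned int n)
{
    return (g->bits[n / 8] >> (n % 8)) & 1u;
}

/*
 * Staleness test in the order the message describes: a clear allocation
 * bit means the inode was freed; otherwise the generation carried in the
 * file handle must match the generation on disk.
 */
bool handle_is_stale(const struct group_bitmap *g,
                     const struct disk_inode *di,
                     uint32_t fh_generation)
{
    if (!bit_is_set(g, di->alloc_bit))
        return true;
    return di->generation != fh_generation;
}
```

The bitmap check comes first because a freed slot may still hold an old inode image whose generation happens to match.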
    • ocfs2: Optimize inode allocation by remembering last group · 13821151
      Tao Ma authored
      In ocfs2, the inode block search looks for the "emptiest" inode
      group to allocate from. So if an inode alloc file has many equally
      (or almost equally) empty groups, new inodes will tend to get
      spread out amongst them, which in turn can put them all over the
      disk. This is undesirable because directory operations on conceptually
      "nearby" inodes force a large number of seeks.
      
      So we add ip_last_used_group to in-core directory inodes, which records
      the last used allocation group. Another field, ip_last_used_slot, is
      also added in case inode stealing happens. When claiming a new inode,
      we pass in the directory's inode so that the allocation can use this
      information.
      For more details, please see
      http://oss.oracle.com/osswiki/OCFS2/DesignDocs/InodeAllocationStrategy.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
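A rough sketch of the policy change described above, under simplifying assumptions: the allocator is modeled as an array of groups with a free-inode count, and `struct dir_hint` stands in for the directory's remembered group. All names here are hypothetical, and inode stealing (the slot field) is not modeled.

```c
#include <stddef.h>

/* Hypothetical, simplified model of an inode allocator. */
struct alloc_group {
    unsigned int free_inodes;
};

struct dir_hint {
    int last_used_group;     /* -1 when no group has been recorded yet */
};

/* Old policy from the message: pick the group with the most free inodes. */
int emptiest_group(const struct alloc_group *groups, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (groups[i].free_inodes == 0)
            continue;
        if (best < 0 || groups[i].free_inodes > groups[best].free_inodes)
            best = (int)i;
    }
    return best;
}

/* New policy: prefer the directory's remembered group while it has room. */
int pick_group(const struct alloc_group *groups, size_t n, struct dir_hint *dir)
{
    int g = dir->last_used_group;
    if (g < 0 || (size_t)g >= n || groups[g].free_inodes == 0)
        g = emptiest_group(groups, n);
    if (g >= 0)
        dir->last_used_group = g;   /* remember the choice for the next file */
    return g;
}
```

With the hint in place, files created in the same directory keep landing in one group until it fills, instead of being spread across equally empty groups.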
    • ocfs2: Increase max links count · 198a1ca3
      Mark Fasheh authored
      Since we've now got a directory format capable of handling a large number of
      entries, we can increase the maximum link count supported. This only gets
      increased if the directory indexing feature is turned on.
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
      Acked-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: Add a name indexed b-tree to directory inodes · 9b7895ef
      Mark Fasheh authored
      This patch makes use of Ocfs2's flexible btree code to add an additional
      tree to directory inodes. The new tree stores an array of small,
      fixed-length records in each leaf block. Each record stores a hash value
      and a pointer to a block in the traditional (unindexed) directory tree where a
      dirent with the given name hash resides. Lookup exclusively uses this tree
      to find dirents, thus providing us with constant time name lookups.
      
      Some of the hashing code was copied from ext3. Unfortunately, it has lots of
      unfixed checkpatch errors. I left that as-is so that tracking changes would
      be easier.
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
      Acked-by: Joel Becker <joel.becker@oracle.com>
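To illustrate the record layout described above, here is a small sketch of searching one leaf's worth of records. The `struct index_rec` layout and `find_first()` are hypothetical, and the sketch assumes the records are kept sorted by hash, which the commit message does not state.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-length index record: name hash -> directory block. */
struct index_rec {
    uint32_t hash;
    uint64_t dir_block;
};

/*
 * Return the index of the first record whose hash equals `hash`, or -1.
 * Assumes `recs` is sorted by hash (an assumption of this sketch).
 */
long find_first(const struct index_rec *recs, size_t n, uint32_t hash)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (recs[mid].hash < hash)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo < n && recs[lo].hash == hash) ? (long)lo : -1;
}
```

Because different names can share a hash, a caller would walk forward from the returned index and compare the actual name in each referenced directory block.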
  4. 05 Jan, 2009 7 commits
    • ocfs2: Use metadata-specific ocfs2_journal_access_*() functions. · 13723d00
      Joel Becker authored
      The per-metadata-type ocfs2_journal_access_*() functions hook up jbd2
      commit triggers and allow us to compute metadata ecc right before the
      buffers are written out.  This commit provides ecc for inodes, extent
      blocks, group descriptors, and quota blocks.  It is not safe to use
      extended attributes and metaecc at the same time yet.
      
      The ocfs2_extent_tree and ocfs2_path abstractions in alloc.c both hide
      the type of block at their root.  Before, it didn't matter, but now the
      root block must use the appropriate ocfs2_journal_access_*() function.
      To keep this abstract, the structures now have a pointer to the matching
      journal_access function and a wrapper call to call it.
      
      A few places use naked ocfs2_write_block() calls instead of adding the
      blocks to the journal.  We make sure to calculate their checksum and ecc
      before the write.
      
      Since we pass around the journal_access functions, let's typedef them
      in ocfs2.h.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
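The "pointer to the matching journal_access function plus a wrapper" pattern can be sketched as below. Every type and function name is a simplified, hypothetical stand-in; the real kernel signatures take more arguments (a handle, an access type, and so on).

```c
/* Hypothetical, simplified types; the real kernel signatures differ. */
struct buffer {
    int journaled_as;        /* records which access routine ran */
};

enum { ROOT_DINODE = 1, ROOT_OTHER = 2 };

/* One typedef for all per-metadata-type access functions. */
typedef int (*journal_access_fn)(struct buffer *bh);

int access_dinode(struct buffer *bh) { bh->journaled_as = ROOT_DINODE; return 0; }
int access_other(struct buffer *bh)  { bh->journaled_as = ROOT_OTHER;  return 0; }

/* The tree abstraction hides the root block's type behind a pointer. */
struct extent_tree {
    struct buffer *root_bh;
    journal_access_fn root_access;
};

/* Wrapper: callers never need to know what kind of block the root is. */
int tree_root_journal_access(struct extent_tree *et)
{
    return et->root_access(et->root_bh);
}
```

Generic tree code calls only the wrapper, so supporting a new root type means supplying one more function of the same typedef.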
    • ocfs2: block read meta ecc. · d6b32bbb
      Joel Becker authored
      Add block check calls to the read_block validate functions.  This is
      almost all of the read-side checking of metaecc.  xattr buckets are not checked
      yet.  Writes are also unchecked, and so a read-write mount will quickly fail.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2: Add quota calls for allocation and freeing of inodes and space · a90714c1
      Jan Kara authored
      Add quota calls for allocation and freeing of inodes and space, also update
      estimates on number of needed credits for a transaction. Move out inode
      allocation from ocfs2_mknod_locked() because vfs_dq_init() must be called
      outside of a transaction.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2: Mark system files as not subject to quota accounting · bbbd0eb3
      Jan Kara authored
      Mark system files as not subject to quota accounting. This prevents
      possible recursions into quota code and thus deadlocks.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2: Validate metadata only when it's read from disk. · 970e4936
      Joel Becker authored
      Add an optional validation hook to ocfs2_read_blocks().  Now the
      validation function is only called when a block was actually read off of
      disk.  It is not called when the buffer was in cache.
      
      We add a buffer state bit BH_NeedsValidate to flag these buffers.  It
      must always be one higher than the last JBD2 buffer state bit.
      
      The dinode, dirblock, extent_block, and xattr_block validators are
      lifted to this scheme directly.  The group_descriptor validator needs to
      be split into two pieces.  The first part only needs the gd buffer and
      is passed to ocfs2_read_block().  The second part requires the dinode as
      well, and is called every time.  It's only 3 compares, so it's tiny.
      This also allows us to clean up the non-fatal gd check used by resize.c.
      It now has no magic argument.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
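The idea of an optional validation hook that runs only on a real disk read might look roughly like this. It is a sketch under heavy simplification: `struct block_buf`, `read_block()`, and `count_and_check()` are hypothetical, the "disk" is just a value passed in, and the BH_NeedsValidate state bit is not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cached block: `uptodate` means the contents are in memory. */
struct block_buf {
    bool uptodate;
    int data;
};

typedef int (*validate_fn)(const struct block_buf *bh);

/* Example validator that also counts how often it is invoked. */
int validate_calls;
int count_and_check(const struct block_buf *bh)
{
    validate_calls++;
    return bh->data < 0 ? -1 : 0;
}

/*
 * Read a block. `on_disk_value` stands in for real I/O. The validator,
 * if any, runs only when the block actually came from "disk".
 */
int read_block(struct block_buf *bh, int on_disk_value, validate_fn validate)
{
    if (bh->uptodate)
        return 0;                    /* cache hit: skip validation */
    bh->data = on_disk_value;
    if (validate != NULL) {
        int rc = validate(bh);
        if (rc != 0)
            return rc;               /* do not mark bad data as cached */
    }
    bh->uptodate = true;
    return 0;
}
```

Reading the same buffer twice invokes the validator once, which is the saving the commit is after.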
    • ocfs2: Wrap inode block reads in a dedicated function. · b657c95c
      Joel Becker authored
      The ocfs2 code currently reads inodes off disk with a simple
      ocfs2_read_block() call.  Each place that does this has a different set
      of sanity checks it performs.  Some check only the signature.  A couple
      validate the block number (the block read vs di->i_blkno).  A couple
      others check for VALID_FL.  Only one place validates i_fs_generation.  A
      couple check nothing.  Even when an error is found, they don't all do
      the same thing.
      
      We wrap inode reading into ocfs2_read_inode_block().  This will validate
      all the above fields, going readonly if they are invalid (they never
      should be).  ocfs2_read_inode_block_full() is provided for the places
      that want to pass read_block flags.  Every caller is passing a struct
      inode with a valid ip_blkno, so we don't need a separate blkno argument
      either.
      
      We will remove the validation checks from the rest of the code in a
      later commit, as they are no longer necessary.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
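As an illustration of gathering the scattered sanity checks into one place, the following sketch validates the four things the message lists: signature, block number, the valid flag, and the file system generation. The structure layout, constants, and function name are hypothetical placeholders, and the sketch only returns an error rather than switching the file system to read-only.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical placeholders, not the on-disk format. */
#define SKETCH_SIGNATURE "INODE01"
#define SKETCH_VALID_FL  0x1u

struct sketch_dinode {
    char     signature[8];
    uint64_t blkno;          /* block number the inode believes it lives at */
    uint32_t flags;
    uint32_t fs_generation;
};

/* Return 0 if every check passes, -1 otherwise. */
int validate_inode_block(const struct sketch_dinode *di,
                         uint64_t blkno_read, uint32_t mounted_generation)
{
    if (memcmp(di->signature, SKETCH_SIGNATURE, sizeof(di->signature)) != 0)
        return -1;           /* wrong signature */
    if (di->blkno != blkno_read)
        return -1;           /* block read does not match the inode's blkno */
    if (!(di->flags & SKETCH_VALID_FL))
        return -1;           /* inode not marked valid */
    if (di->fs_generation != mounted_generation)
        return -1;           /* belongs to a different file system generation */
    return 0;
}
```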
  5. 10 Nov, 2008 1 commit
  6. 14 Oct, 2008 7 commits
  7. 13 Oct, 2008 2 commits
    • ocfs2: Add extended attribute support · cf1d6c76
      Tiger Yang authored
      This patch implements storing extended attributes either in the inode or in a
      single external block. We only store EAs in-inode when blocksize > 512 or that
      inode block has free space for them. When an EA's value is larger than 80
      bytes, we store the value in a b-tree outside the inode or block.
      Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
    • ocfs2: POSIX file locks support · 53da4939
      Mark Fasheh authored
      This is actually pretty easy since fs/dlm already handles the bulk of the
      work. The Ocfs2 userspace cluster stack module already uses fs/dlm as the
      underlying lock manager, so I only had to add the right calls.
      
      Cluster-aware POSIX locks ("plocks") can be turned off by the same means as
      UNIX locks: mount with 'noflocks', or create a local-only Ocfs2 volume.
      Internally, the file system uses two sets of file_operations, depending on
      whether cluster-aware plocks are required. This turns out to be easier than
      implementing local-only versions of ->lock.
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
  8. 25 Jan, 2008 5 commits
  9. 28 Nov, 2007 2 commits
  10. 12 Oct, 2007 2 commits
    • ocfs2: Write support for inline data · 1afc32b9
      Mark Fasheh authored
      This fixes up write, truncate, mmap, and RESVSP/UNRESVSP to understand inline
      inode data.
      
      For the most part, the changes to the core write code can be relied on to do
      the heavy lifting. Any code calling ocfs2_write_begin (including shared
      writeable mmap) can count on it doing the right thing with respect to
      growing inline data to an extent tree.
      
      Size-reducing truncates, including UNRESVSP, can simply zero that portion of
      the inode block being removed. Size-increasing truncates, including RESVSP,
      have to be a little bit smarter and grow the inode to an extent tree if
      necessary.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
      Reviewed-by: Joel Becker <joel.becker@oracle.com>
    • ocfs2: Structure updates for inline data · 15b1e36b
      Mark Fasheh authored
      Add the disk, network and memory structures needed to support data in inode.
      
      Struct ocfs2_inline_data is defined and embedded in ocfs2_dinode for storing
      inline data.
      
      A new inode field, i_dyn_features, is added to facilitate tracking of
      dynamic inode state. Since it will be used often, we want to mirror it on
      ocfs2_inode_info, and transfer it via the meta data lvb.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
      Reviewed-by: Joel Becker <joel.becker@oracle.com>
  11. 08 May, 2007 1 commit
  12. 02 May, 2007 3 commits
  13. 26 Apr, 2007 3 commits
    • ocfs2: Cache extent records · 83418978
      Mark Fasheh authored
      The extent map code was ripped out earlier because of an inability to deal
      with holes. This patch adds back a simpler caching scheme requiring far less
      code.
      
      Our old extent map caching was designed back when metadata block caching in
      Ocfs2 didn't work very well, resulting in many disk reads. These days our
      metadata caching is much better, resulting in no unnecessary disk reads. As
      a result, extent caching doesn't have to be as fancy, nor does it have to
      cache as many extents. Keeping the last 3 extents seen should be sufficient
      to give us a small performance boost on some streaming workloads.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
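A cache that keeps only the last three extents seen can be very small. The sketch below is one possible shape, assuming a most-recent-first array; the structure and function names are hypothetical and the real code's locking, merging, and invalidation are omitted.

```c
#include <stdbool.h>
#include <stdint.h>

#define CACHE_SLOTS 3

/* Hypothetical extent: `clusters` logical clusters starting at `cpos`. */
struct extent {
    uint32_t cpos;
    uint32_t clusters;
    uint64_t blkno;
};

struct extent_cache {
    struct extent slot[CACHE_SLOTS];   /* slot[0] is the most recent */
    int count;
};

/* Insert at the front, dropping the oldest entry when the cache is full. */
void cache_insert(struct extent_cache *c, struct extent e)
{
    int n = c->count < CACHE_SLOTS ? c->count : CACHE_SLOTS - 1;
    for (int i = n; i > 0; i--)
        c->slot[i] = c->slot[i - 1];
    c->slot[0] = e;
    if (c->count < CACHE_SLOTS)
        c->count++;
}

/* Find the cached extent covering logical cluster `cpos`, if any. */
bool cache_lookup(const struct extent_cache *c, uint32_t cpos,
                  struct extent *out)
{
    for (int i = 0; i < c->count; i++) {
        const struct extent *e = &c->slot[i];
        if (cpos >= e->cpos && cpos - e->cpos < e->clusters) {
            *out = *e;
            return true;
        }
    }
    return false;
}
```

A streaming reader mostly hits slot 0 or its neighbor, which is why so few slots can still help.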
    • ocfs2: Fix up i_blocks calculation to know about holes · 8110b073
      Mark Fasheh authored
      Older file systems which didn't support holes did a dumb calculation of
      i_blocks based on i_size. This is no longer accurate, so fix things up to
      take actual allocation into account.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
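To show why a size-based i_blocks goes wrong once files can have holes, here is a small comparison. Both helpers are hypothetical; the size-based formula (bytes rounded up to 512-byte sectors) is an assumption of this sketch rather than a quote of the old code, and the allocation-based one assumes a cluster size of at least 512 bytes.

```c
#include <stdint.h>

/* Size-based estimate: bytes rounded up to 512-byte sectors (assumed). */
uint64_t blocks_from_size(uint64_t i_size)
{
    return (i_size + 511) >> 9;
}

/*
 * Allocation-based count: allocated clusters converted to 512-byte units.
 * `cluster_bits` is log2 of the cluster size and must be >= 9.
 */
uint64_t blocks_from_allocation(uint32_t allocated_clusters,
                                unsigned int cluster_bits)
{
    return (uint64_t)allocated_clusters << (cluster_bits - 9);
}
```

For a 1 MiB sparse file with a single 4 KiB cluster allocated, the first helper reports 2048 blocks while the second reports 8.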
    • ocfs2: Read from an unwritten extent returns zeros · 49cb8d2d
      Mark Fasheh authored
      Return an optional extent flags field from our lookup functions and wire up
      callers to treat unwritten regions as holes for the purpose of returning
      zeros to the user.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
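The "treat unwritten regions as holes" behavior can be sketched as a reader that consults a flags field returned by the lookup. The flag name, `struct lookup_result`, and `read_cluster()` are hypothetical and stand in for the real lookup functions and their callers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_EXT_UNWRITTEN 0x1u   /* hypothetical extent flag */

/* Hypothetical lookup result: `data` is NULL when the range is a hole. */
struct lookup_result {
    const uint8_t *data;
    unsigned int flags;
};

/* Copy `len` bytes to `dst`, returning zeros for holes and unwritten extents. */
void read_cluster(const struct lookup_result *r, uint8_t *dst, size_t len)
{
    if (r->data == NULL || (r->flags & SKETCH_EXT_UNWRITTEN))
        memset(dst, 0, len);        /* never expose uninitialized disk contents */
    else
        memcpy(dst, r->data, len);
}
```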