Commit ccd979bd authored by Mark Fasheh's avatar Mark Fasheh Committed by Joel Becker

[PATCH] OCFS2: The Second Oracle Cluster Filesystem

The OCFS2 file system module.
Signed-off-by: default avatarMark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: default avatarKurt Hackel <kurt.hackel@oracle.com>
parent 8df08c89
......@@ -36,6 +36,8 @@ ntfs.txt
- info and mount options for the NTFS filesystem (Windows NT).
proc.txt
- info on Linux's /proc filesystem.
ocfs2.txt
- info and mount options for the OCFS2 clustered filesystem.
romfs.txt
- Description of the ROMFS filesystem.
smbfs.txt
......
OCFS2 filesystem
==================
OCFS2 is a general purpose extent based shared disk cluster file
system with many similarities to ext3. It supports 64 bit inode
numbers, and has automatically extending metadata groups which may
also make it attractive for non-clustered use.
You'll want to install the ocfs2-tools package in order to at least
get "mount.ocfs2" and "ocfs2_hb_ctl".
Project web page: http://oss.oracle.com/projects/ocfs2
Tools web page: http://oss.oracle.com/projects/ocfs2-tools
OCFS2 mailing lists: http://oss.oracle.com/projects/ocfs2/mailman/
All code copyright 2005 Oracle except when otherwise noted.
CREDITS:
Lots of code taken from ext3 and other projects.
Authors in alphabetical order:
Joel Becker <joel.becker@oracle.com>
Zach Brown <zach.brown@oracle.com>
Mark Fasheh <mark.fasheh@oracle.com>
Kurt Hackel <kurt.hackel@oracle.com>
Sunil Mushran <sunil.mushran@oracle.com>
Manish Singh <manish.singh@oracle.com>
Caveats
=======
Features which OCFS2 does not support yet:
- sparse files
- extended attributes
- shared writeable mmap
- loopback is supported, but data written will not
be cluster coherent.
- quotas
- cluster aware flock
- Directory change notification (F_NOTIFY)
- Distributed Caching (F_SETLEASE/F_GETLEASE/break_lease)
- POSIX ACLs
- readpages / writepages (not user visible)
Mount options
=============
OCFS2 supports the following mount options:
(*) == default
barrier=1 This enables/disables barriers. barrier=0 disables it,
barrier=1 enables it.
errors=remount-ro(*) Remount the filesystem read-only on an error.
errors=panic Panic and halt the machine if an error occurs.
intr (*) Allow signals to interrupt cluster operations.
nointr Do not allow signals to interrupt cluster
operations.
......@@ -1905,6 +1905,15 @@ M: ajoshi@shell.unixbox.com
L: linux-nvidia@lists.surfsouth.com
S: Maintained
ORACLE CLUSTER FILESYSTEM 2 (OCFS2)
P: Mark Fasheh
M: mark.fasheh@oracle.com
P: Kurt Hackel
M: kurt.hackel@oracle.com
L: ocfs2-devel@oss.oracle.com
W: http://oss.oracle.com/projects/ocfs2/
S: Supported
OLYMPIC NETWORK DRIVER
P: Peter De Shrijver
M: p2@ace.ulyssis.student.kuleuven.ac.be
......
EXTRA_CFLAGS += -Ifs/ocfs2
EXTRA_CFLAGS += -DCATCH_BH_JBD_RACES
obj-$(CONFIG_OCFS2_FS) += ocfs2.o
ocfs2-objs := \
alloc.o \
aops.o \
buffer_head_io.o \
dcache.o \
dir.o \
dlmglue.o \
export.o \
extent_map.o \
file.o \
heartbeat.o \
inode.o \
journal.o \
localalloc.o \
mmap.o \
namei.o \
slot_map.o \
suballoc.o \
super.o \
symlink.o \
sysfile.o \
uptodate.o \
ver.o \
vote.o
obj-$(CONFIG_OCFS2_FS) += cluster/
obj-$(CONFIG_OCFS2_FS) += dlm/
This diff is collapsed.
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* alloc.h
*
* Function prototypes
*
* Copyright (C) 2002, 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef OCFS2_ALLOC_H
#define OCFS2_ALLOC_H
struct ocfs2_alloc_context;
int ocfs2_insert_extent(struct ocfs2_super *osb,
struct ocfs2_journal_handle *handle,
struct inode *inode,
struct buffer_head *fe_bh,
u64 blkno,
u32 new_clusters,
struct ocfs2_alloc_context *meta_ac);
int ocfs2_num_free_extents(struct ocfs2_super *osb,
struct inode *inode,
struct ocfs2_dinode *fe);
/* how many new metadata chunks would an allocation need at maximum? */
static inline int ocfs2_extend_meta_needed(struct ocfs2_dinode *fe)
{
/*
* Rather than do all the work of determining how much we need
* (involves a ton of reads and locks), just ask for the
* maximal limit. That's a tree depth shift. So, one block for
* level of the tree (current l_tree_depth), one block for the
* new tree_depth==0 extent_block, and one block at the new
* top-of-the tree.
*/
return le16_to_cpu(fe->id2.i_list.l_tree_depth) + 2;
}
int ocfs2_truncate_log_init(struct ocfs2_super *osb);
void ocfs2_truncate_log_shutdown(struct ocfs2_super *osb);
void ocfs2_schedule_truncate_log_flush(struct ocfs2_super *osb,
int cancel);
int ocfs2_flush_truncate_log(struct ocfs2_super *osb);
int ocfs2_begin_truncate_log_recovery(struct ocfs2_super *osb,
int slot_num,
struct ocfs2_dinode **tl_copy);
int ocfs2_complete_truncate_log_recovery(struct ocfs2_super *osb,
struct ocfs2_dinode *tl_copy);
struct ocfs2_truncate_context {
struct inode *tc_ext_alloc_inode;
struct buffer_head *tc_ext_alloc_bh;
int tc_ext_alloc_locked; /* is it cluster locked? */
/* these get destroyed once it's passed to ocfs2_commit_truncate. */
struct buffer_head *tc_last_eb_bh;
};
int ocfs2_prepare_truncate(struct ocfs2_super *osb,
struct inode *inode,
struct buffer_head *fe_bh,
struct ocfs2_truncate_context **tc);
int ocfs2_commit_truncate(struct ocfs2_super *osb,
struct inode *inode,
struct buffer_head *fe_bh,
struct ocfs2_truncate_context *tc);
#endif /* OCFS2_ALLOC_H */
This diff is collapsed.
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* Copyright (C) 2002, 2004, 2005 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef OCFS2_AOPS_H
#define OCFS2_AOPS_H
int ocfs2_prepare_write(struct file *file, struct page *page,
unsigned from, unsigned to);
struct ocfs2_journal_handle *ocfs2_start_walk_page_trans(struct inode *inode,
struct page *page,
unsigned from,
unsigned to);
/* all ocfs2_dio_end_io()'s fault */
#define ocfs2_iocb_is_rw_locked(iocb) \
test_bit(0, (unsigned long *)&iocb->private)
#define ocfs2_iocb_set_rw_locked(iocb) \
set_bit(0, (unsigned long *)&iocb->private)
#define ocfs2_iocb_clear_rw_locked(iocb) \
clear_bit(0, (unsigned long *)&iocb->private)
#endif /* OCFS2_FILE_H */
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* io.c
*
* Buffer cache handling
*
* Copyright (C) 2002, 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#include <linux/fs.h>
#include <linux/types.h>
#include <linux/slab.h>
#include <linux/highmem.h>
#include <cluster/masklog.h>
#include "ocfs2.h"
#include "alloc.h"
#include "inode.h"
#include "journal.h"
#include "uptodate.h"
#include "buffer_head_io.h"
int ocfs2_write_block(struct ocfs2_super *osb, struct buffer_head *bh,
struct inode *inode)
{
int ret = 0;
mlog_entry("(bh->b_blocknr = %llu, inode=%p)\n",
(unsigned long long)bh->b_blocknr, inode);
BUG_ON(bh->b_blocknr < OCFS2_SUPER_BLOCK_BLKNO);
BUG_ON(buffer_jbd(bh));
/* No need to check for a soft readonly file system here. non
* journalled writes are only ever done on system files which
* can get modified during recovery even if read-only. */
if (ocfs2_is_hard_readonly(osb)) {
ret = -EROFS;
goto out;
}
down(&OCFS2_I(inode)->ip_io_sem);
lock_buffer(bh);
set_buffer_uptodate(bh);
/* remove from dirty list before I/O. */
clear_buffer_dirty(bh);
get_bh(bh); /* for end_buffer_write_sync() */
bh->b_end_io = end_buffer_write_sync;
submit_bh(WRITE, bh);
wait_on_buffer(bh);
if (buffer_uptodate(bh)) {
ocfs2_set_buffer_uptodate(inode, bh);
} else {
/* We don't need to remove the clustered uptodate
* information for this bh as it's not marked locally
* uptodate. */
ret = -EIO;
brelse(bh);
}
up(&OCFS2_I(inode)->ip_io_sem);
out:
mlog_exit(ret);
return ret;
}
int ocfs2_read_blocks(struct ocfs2_super *osb, u64 block, int nr,
struct buffer_head *bhs[], int flags,
struct inode *inode)
{
int status = 0;
struct super_block *sb;
int i, ignore_cache = 0;
struct buffer_head *bh;
mlog_entry("(block=(%"MLFu64"), nr=(%d), flags=%d, inode=%p)\n",
block, nr, flags, inode);
if (osb == NULL || osb->sb == NULL || bhs == NULL) {
status = -EINVAL;
mlog_errno(status);
goto bail;
}
if (nr < 0) {
mlog(ML_ERROR, "asked to read %d blocks!\n", nr);
status = -EINVAL;
mlog_errno(status);
goto bail;
}
if (nr == 0) {
mlog(ML_BH_IO, "No buffers will be read!\n");
status = 0;
goto bail;
}
sb = osb->sb;
if (flags & OCFS2_BH_CACHED && !inode)
flags &= ~OCFS2_BH_CACHED;
if (inode)
down(&OCFS2_I(inode)->ip_io_sem);
for (i = 0 ; i < nr ; i++) {
if (bhs[i] == NULL) {
bhs[i] = sb_getblk(sb, block++);
if (bhs[i] == NULL) {
if (inode)
up(&OCFS2_I(inode)->ip_io_sem);
status = -EIO;
mlog_errno(status);
goto bail;
}
}
bh = bhs[i];
ignore_cache = 0;
if (flags & OCFS2_BH_CACHED &&
!ocfs2_buffer_uptodate(inode, bh)) {
mlog(ML_UPTODATE,
"bh (%llu), inode %"MLFu64" not uptodate\n",
(unsigned long long)bh->b_blocknr,
OCFS2_I(inode)->ip_blkno);
ignore_cache = 1;
}
/* XXX: Can we ever get this and *not* have the cached
* flag set? */
if (buffer_jbd(bh)) {
if (!(flags & OCFS2_BH_CACHED) || ignore_cache)
mlog(ML_BH_IO, "trying to sync read a jbd "
"managed bh (blocknr = %llu)\n",
(unsigned long long)bh->b_blocknr);
continue;
}
if (!(flags & OCFS2_BH_CACHED) || ignore_cache) {
if (buffer_dirty(bh)) {
/* This should probably be a BUG, or
* at least return an error. */
mlog(ML_BH_IO, "asking me to sync read a dirty "
"buffer! (blocknr = %llu)\n",
(unsigned long long)bh->b_blocknr);
continue;
}
lock_buffer(bh);
if (buffer_jbd(bh)) {
#ifdef CATCH_BH_JBD_RACES
mlog(ML_ERROR, "block %llu had the JBD bit set "
"while I was in lock_buffer!",
(unsigned long long)bh->b_blocknr);
BUG();
#else
unlock_buffer(bh);
continue;
#endif
}
clear_buffer_uptodate(bh);
get_bh(bh); /* for end_buffer_read_sync() */
bh->b_end_io = end_buffer_read_sync;
if (flags & OCFS2_BH_READAHEAD)
submit_bh(READA, bh);
else
submit_bh(READ, bh);
continue;
}
}
status = 0;
for (i = (nr - 1); i >= 0; i--) {
bh = bhs[i];
/* We know this can't have changed as we hold the
* inode sem. Avoid doing any work on the bh if the
* journal has it. */
if (!buffer_jbd(bh))
wait_on_buffer(bh);
if (!buffer_uptodate(bh)) {
/* Status won't be cleared from here on out,
* so we can safely record this and loop back
* to cleanup the other buffers. Don't need to
* remove the clustered uptodate information
* for this bh as it's not marked locally
* uptodate. */
status = -EIO;
brelse(bh);
bhs[i] = NULL;
continue;
}
if (inode)
ocfs2_set_buffer_uptodate(inode, bh);
}
if (inode)
up(&OCFS2_I(inode)->ip_io_sem);
mlog(ML_BH_IO, "block=(%"MLFu64"), nr=(%d), cached=%s\n", block, nr,
(!(flags & OCFS2_BH_CACHED) || ignore_cache) ? "no" : "yes");
bail:
mlog_exit(status);
return status;
}
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* ocfs2_buffer_head.h
*
* Buffer cache handling functions defined
*
* Copyright (C) 2002, 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef OCFS2_BUFFER_HEAD_IO_H
#define OCFS2_BUFFER_HEAD_IO_H
#include <linux/buffer_head.h>
void ocfs2_end_buffer_io_sync(struct buffer_head *bh,
int uptodate);
static inline int ocfs2_read_block(struct ocfs2_super *osb,
u64 off,
struct buffer_head **bh,
int flags,
struct inode *inode);
int ocfs2_write_block(struct ocfs2_super *osb,
struct buffer_head *bh,
struct inode *inode);
int ocfs2_read_blocks(struct ocfs2_super *osb,
u64 block,
int nr,
struct buffer_head *bhs[],
int flags,
struct inode *inode);
#define OCFS2_BH_CACHED 1
#define OCFS2_BH_READAHEAD 8 /* use this to pass READA down to submit_bh */
static inline int ocfs2_read_block(struct ocfs2_super * osb, u64 off,
struct buffer_head **bh, int flags,
struct inode *inode)
{
int status = 0;
if (bh == NULL) {
printk("ocfs2: bh == NULL\n");
status = -EINVAL;
goto bail;
}
status = ocfs2_read_blocks(osb, off, 1, bh,
flags, inode);
bail:
return status;
}
#endif /* OCFS2_BUFFER_HEAD_IO_H */
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* dcache.c
*
* dentry cache handling code
*
* Copyright (C) 2002, 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#include <linux/fs.h>
#include <linux/types.h>
#include <linux/slab.h>
#include <linux/namei.h>
#define MLOG_MASK_PREFIX ML_DCACHE
#include <cluster/masklog.h>
#include "ocfs2.h"
#include "alloc.h"
#include "dcache.h"
#include "file.h"
#include "inode.h"
static int ocfs2_dentry_revalidate(struct dentry *dentry,
struct nameidata *nd)
{
struct inode *inode = dentry->d_inode;
int ret = 0; /* if all else fails, just return false */
struct ocfs2_super *osb;
mlog_entry("(0x%p, '%.*s')\n", dentry,
dentry->d_name.len, dentry->d_name.name);
/* Never trust a negative dentry - force a new lookup. */
if (inode == NULL) {
mlog(0, "negative dentry: %.*s\n", dentry->d_name.len,
dentry->d_name.name);
goto bail;
}
osb = OCFS2_SB(inode->i_sb);
BUG_ON(!osb);
if (inode != osb->root_inode) {
spin_lock(&OCFS2_I(inode)->ip_lock);
/* did we or someone else delete this inode? */
if (OCFS2_I(inode)->ip_flags & OCFS2_INODE_DELETED) {
spin_unlock(&OCFS2_I(inode)->ip_lock);
mlog(0, "inode (%"MLFu64") deleted, returning false\n",
OCFS2_I(inode)->ip_blkno);
goto bail;
}
spin_unlock(&OCFS2_I(inode)->ip_lock);
if (!inode->i_nlink) {
mlog(0, "Inode %"MLFu64" orphaned, returning false "
"dir = %d\n", OCFS2_I(inode)->ip_blkno,
S_ISDIR(inode->i_mode));
goto bail;
}
}
ret = 1;
bail:
mlog_exit(ret);
return ret;
}
struct dentry_operations ocfs2_dentry_ops = {
.d_revalidate = ocfs2_dentry_revalidate,
};
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* dcache.h
*
* Function prototypes
*
* Copyright (C) 2002, 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef OCFS2_DCACHE_H
#define OCFS2_DCACHE_H
extern struct dentry_operations ocfs2_dentry_ops;
#endif /* OCFS2_DCACHE_H */
This diff is collapsed.
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* dir.h
*
* Function prototypes
*
* Copyright (C) 2002, 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef OCFS2_DIR_H
#define OCFS2_DIR_H
int ocfs2_check_dir_for_entry(struct inode *dir,
const char *name,
int namelen);
int ocfs2_empty_dir(struct inode *inode); /* FIXME: to namei.c */
int ocfs2_find_files_on_disk(const char *name,
int namelen,
u64 *blkno,
struct inode *inode,
struct buffer_head **dirent_bh,
struct ocfs2_dir_entry **dirent);
int ocfs2_readdir(struct file *filp, void *dirent, filldir_t filldir);
int ocfs2_prepare_dir_for_insert(struct ocfs2_super *osb,
struct inode *dir,
struct buffer_head *parent_fe_bh,
const char *name,
int namelen,
struct buffer_head **ret_de_bh);
struct ocfs2_alloc_context;
int ocfs2_do_extend_dir(struct super_block *sb,
struct ocfs2_journal_handle *handle,
struct inode *dir,
struct buffer_head *parent_fe_bh,
struct ocfs2_alloc_context *data_ac,
struct ocfs2_alloc_context *meta_ac,
struct buffer_head **new_bh);
#endif /* OCFS2_DIR_H */
This diff is collapsed.
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* dlmglue.h
*
* description here
*
* Copyright (C) 2002, 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef DLMGLUE_H
#define DLMGLUE_H
#define OCFS2_LVB_VERSION 2
struct ocfs2_meta_lvb {
__be32 lvb_version;
__be32 lvb_iclusters;
__be32 lvb_iuid;
__be32 lvb_igid;
__be64 lvb_iatime_packed;
__be64 lvb_ictime_packed;
__be64 lvb_imtime_packed;
__be64 lvb_isize;
__be16 lvb_imode;
__be16 lvb_inlink;
__be32 lvb_reserved[3];
};
/* ocfs2_meta_lock_full() and ocfs2_data_lock_full() 'arg_flags' flags */
/* don't wait on recovery. */
#define OCFS2_META_LOCK_RECOVERY (0x01)
/* Instruct the dlm not to queue ourselves on the other node. */
#define OCFS2_META_LOCK_NOQUEUE (0x02)
/* don't block waiting for the vote thread, instead return -EAGAIN */
#define OCFS2_LOCK_NONBLOCK (0x04)
int ocfs2_dlm_init(struct ocfs2_super *osb);
void ocfs2_dlm_shutdown(struct ocfs2_super *osb);
void ocfs2_lock_res_init_once(struct ocfs2_lock_res *res);
void ocfs2_inode_lock_res_init(struct ocfs2_lock_res *res,
enum ocfs2_lock_type type,
struct inode *inode);
void ocfs2_lock_res_free(struct ocfs2_lock_res *res);
int ocfs2_create_new_inode_locks(struct inode *inode);
int ocfs2_drop_inode_locks(struct inode *inode);
int ocfs2_data_lock_full(struct inode *inode,
int write,
int arg_flags);
#define ocfs2_data_lock(inode, write) ocfs2_data_lock_full(inode, write, 0)
int ocfs2_data_lock_with_page(struct inode *inode,
int write,
struct page *page);
void ocfs2_data_unlock(struct inode *inode,
int write);
int ocfs2_rw_lock(struct inode *inode, int write);
void ocfs2_rw_unlock(struct inode *inode, int write);
int ocfs2_meta_lock_full(struct inode *inode,
struct ocfs2_journal_handle *handle,
struct buffer_head **ret_bh,
int ex,
int arg_flags);
int ocfs2_meta_lock_with_page(struct inode *inode,
struct ocfs2_journal_handle *handle,
struct buffer_head **ret_bh,
int ex,
struct page *page);
/* 99% of the time we don't want to supply any additional flags --
* those are for very specific cases only. */
#define ocfs2_meta_lock(i, h, b, e) ocfs2_meta_lock_full(i, h, b, e, 0)
void ocfs2_meta_unlock(struct inode *inode,
int ex);
int ocfs2_super_lock(struct ocfs2_super *osb,
int ex);
void ocfs2_super_unlock(struct ocfs2_super *osb,
int ex);
int ocfs2_rename_lock(struct ocfs2_super *osb);
void ocfs2_rename_unlock(struct ocfs2_super *osb);
void ocfs2_mark_lockres_freeing(struct ocfs2_lock_res *lockres);
/* for the vote thread */
void ocfs2_process_blocked_lock(struct ocfs2_super *osb,
struct ocfs2_lock_res *lockres);
struct ocfs2_dlm_debug *ocfs2_new_dlm_debug(void);
void ocfs2_put_dlm_debug(struct ocfs2_dlm_debug *dlm_debug);
/* aids in debugging and tracking lvbs */
void ocfs2_dump_meta_lvb_info(u64 level,
const char *function,
unsigned int line,
struct ocfs2_lock_res *lockres);
#define mlog_meta_lvb(__level, __lockres) ocfs2_dump_meta_lvb_info(__level, __PRETTY_FUNCTION__, __LINE__, __lockres)
#endif /* DLMGLUE_H */
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* Copyright (C) 2005 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef OCFS2_ENDIAN_H
#define OCFS2_ENDIAN_H
static inline void le16_add_cpu(__le16 *var, u16 val)
{
*var = cpu_to_le16(le16_to_cpu(*var) + val);
}
static inline void le32_add_cpu(__le32 *var, u32 val)
{
*var = cpu_to_le32(le32_to_cpu(*var) + val);
}
static inline void le32_and_cpu(__le32 *var, u32 val)
{
*var = cpu_to_le32(le32_to_cpu(*var) & val);
}
static inline void be32_add_cpu(__be32 *var, u32 val)
{
*var = cpu_to_be32(be32_to_cpu(*var) + val);
}
#endif /* OCFS2_ENDIAN_H */
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* export.c
*
* Functions to facilitate NFS exporting
*
* Copyright (C) 2002, 2005 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#include <linux/fs.h>
#include <linux/types.h>
#define MLOG_MASK_PREFIX ML_EXPORT
#include <cluster/masklog.h>
#include "ocfs2.h"
#include "dir.h"
#include "dlmglue.h"
#include "export.h"
#include "inode.h"
#include "buffer_head_io.h"
struct ocfs2_inode_handle
{
u64 ih_blkno;
u32 ih_generation;
};
static struct dentry *ocfs2_get_dentry(struct super_block *sb, void *vobjp)
{
struct ocfs2_inode_handle *handle = vobjp;
struct inode *inode;
struct dentry *result;
mlog_entry("(0x%p, 0x%p)\n", sb, handle);
if (handle->ih_blkno == 0) {
mlog_errno(-ESTALE);
return ERR_PTR(-ESTALE);
}
inode = ocfs2_iget(OCFS2_SB(sb), handle->ih_blkno);
if (IS_ERR(inode)) {
mlog_errno(PTR_ERR(inode));
return (void *)inode;
}
if (handle->ih_generation != inode->i_generation) {
iput(inode);
mlog_errno(-ESTALE);
return ERR_PTR(-ESTALE);
}
result = d_alloc_anon(inode);
if (!result) {
iput(inode);
mlog_errno(-ENOMEM);
return ERR_PTR(-ENOMEM);
}
mlog_exit_ptr(result);
return result;
}
static struct dentry *ocfs2_get_parent(struct dentry *child)
{
int status;
u64 blkno;
struct dentry *parent;
struct inode *inode;
struct inode *dir = child->d_inode;
struct buffer_head *dirent_bh = NULL;
struct ocfs2_dir_entry *dirent;
mlog_entry("(0x%p, '%.*s')\n", child,
child->d_name.len, child->d_name.name);
mlog(0, "find parent of directory %"MLFu64"\n",
OCFS2_I(dir)->ip_blkno);
status = ocfs2_meta_lock(dir, NULL, NULL, 0);
if (status < 0) {
if (status != -ENOENT)
mlog_errno(status);
parent = ERR_PTR(status);
goto bail;
}
status = ocfs2_find_files_on_disk("..", 2, &blkno, dir, &dirent_bh,
&dirent);
if (status < 0) {
parent = ERR_PTR(-ENOENT);
goto bail_unlock;
}
inode = ocfs2_iget(OCFS2_SB(dir->i_sb), blkno);
if (IS_ERR(inode)) {
mlog(ML_ERROR, "Unable to create inode %"MLFu64"\n", blkno);
parent = ERR_PTR(-EACCES);
goto bail_unlock;
}
parent = d_alloc_anon(inode);
if (!parent) {
iput(inode);
parent = ERR_PTR(-ENOMEM);
}
bail_unlock:
ocfs2_meta_unlock(dir, 0);
if (dirent_bh)
brelse(dirent_bh);
bail:
mlog_exit_ptr(parent);
return parent;
}
static int ocfs2_encode_fh(struct dentry *dentry, __be32 *fh, int *max_len,
int connectable)
{
struct inode *inode = dentry->d_inode;
int len = *max_len;
int type = 1;
u64 blkno;
u32 generation;
mlog_entry("(0x%p, '%.*s', 0x%p, %d, %d)\n", dentry,
dentry->d_name.len, dentry->d_name.name,
fh, len, connectable);
if (len < 3 || (connectable && len < 6)) {
mlog(ML_ERROR, "fh buffer is too small for encoding\n");
type = 255;
goto bail;
}
blkno = OCFS2_I(inode)->ip_blkno;
generation = inode->i_generation;
mlog(0, "Encoding fh: blkno: %"MLFu64", generation: %u\n",
blkno, generation);
len = 3;
fh[0] = cpu_to_le32((u32)(blkno >> 32));
fh[1] = cpu_to_le32((u32)(blkno & 0xffffffff));
fh[2] = cpu_to_le32(generation);
if (connectable && !S_ISDIR(inode->i_mode)) {
struct inode *parent;
spin_lock(&dentry->d_lock);
parent = dentry->d_parent->d_inode;
blkno = OCFS2_I(parent)->ip_blkno;
generation = parent->i_generation;
fh[3] = cpu_to_le32((u32)(blkno >> 32));
fh[4] = cpu_to_le32((u32)(blkno & 0xffffffff));
fh[5] = cpu_to_le32(generation);
spin_unlock(&dentry->d_lock);
len = 6;
type = 2;
mlog(0, "Encoding parent: blkno: %"MLFu64", generation: %u\n",
blkno, generation);
}
*max_len = len;
bail:
mlog_exit(type);
return type;
}
static struct dentry *ocfs2_decode_fh(struct super_block *sb, __be32 *fh,
int fh_len, int fileid_type,
int (*acceptable)(void *context,
struct dentry *de),
void *context)
{
struct ocfs2_inode_handle handle, parent;
struct dentry *ret = NULL;
mlog_entry("(0x%p, 0x%p, %d, %d, 0x%p, 0x%p)\n",
sb, fh, fh_len, fileid_type, acceptable, context);
if (fh_len < 3 || fileid_type > 2)
goto bail;
if (fileid_type == 2) {
if (fh_len < 6)
goto bail;
parent.ih_blkno = (u64)le32_to_cpu(fh[3]) << 32;
parent.ih_blkno |= (u64)le32_to_cpu(fh[4]);
parent.ih_generation = le32_to_cpu(fh[5]);
mlog(0, "Decoding parent: blkno: %"MLFu64", generation: %u\n",
parent.ih_blkno, parent.ih_generation);
}
handle.ih_blkno = (u64)le32_to_cpu(fh[0]) << 32;
handle.ih_blkno |= (u64)le32_to_cpu(fh[1]);
handle.ih_generation = le32_to_cpu(fh[2]);
mlog(0, "Encoding fh: blkno: %"MLFu64", generation: %u\n",
handle.ih_blkno, handle.ih_generation);
ret = ocfs2_export_ops.find_exported_dentry(sb, &handle, &parent,
acceptable, context);
bail:
mlog_exit_ptr(ret);
return ret;
}
struct export_operations ocfs2_export_ops = {
.decode_fh = ocfs2_decode_fh,
.encode_fh = ocfs2_encode_fh,
.get_parent = ocfs2_get_parent,
.get_dentry = ocfs2_get_dentry,
};
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* export.h
*
* Function prototypes
*
* Copyright (C) 2002, 2005 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef OCFS2_EXPORT_H
#define OCFS2_EXPORT_H
extern struct export_operations ocfs2_export_ops;
#endif /* OCFS2_EXPORT_H */
This diff is collapsed.
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* extent_map.h
*
* In-memory file extent mappings for OCFS2.
*
* Copyright (C) 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License, version 2, as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef _EXTENT_MAP_H
#define _EXTENT_MAP_H
int init_ocfs2_extent_maps(void);
void exit_ocfs2_extent_maps(void);
/*
* EVERY CALL here except _init, _trunc, and _drop expects alloc_sem
* to be held. The allocation cannot change at all while the map is
* in the process of being updated.
*/
int ocfs2_extent_map_init(struct inode *inode);
int ocfs2_extent_map_append(struct inode *inode,
struct ocfs2_extent_rec *rec,
u32 new_clusters);
int ocfs2_extent_map_get_blocks(struct inode *inode,
u64 v_blkno, int count,
u64 *p_blkno, int *ret_count);
int ocfs2_extent_map_drop(struct inode *inode, u32 new_clusters);
int ocfs2_extent_map_trunc(struct inode *inode, u32 new_clusters);
#endif /* _EXTENT_MAP_H */
This diff is collapsed.
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* file.h
*
* Function prototypes
*
* Copyright (C) 2002, 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef OCFS2_FILE_H
#define OCFS2_FILE_H
extern struct file_operations ocfs2_fops;
extern struct file_operations ocfs2_dops;
extern struct inode_operations ocfs2_file_iops;
extern struct inode_operations ocfs2_special_file_iops;
struct ocfs2_alloc_context;
enum ocfs2_alloc_restarted {
RESTART_NONE = 0,
RESTART_TRANS,
RESTART_META
};
int ocfs2_do_extend_allocation(struct ocfs2_super *osb,
struct inode *inode,
u32 clusters_to_add,
struct buffer_head *fe_bh,
struct ocfs2_journal_handle *handle,
struct ocfs2_alloc_context *data_ac,
struct ocfs2_alloc_context *meta_ac,
enum ocfs2_alloc_restarted *reason);
int ocfs2_setattr(struct dentry *dentry, struct iattr *attr);
int ocfs2_getattr(struct vfsmount *mnt, struct dentry *dentry,
struct kstat *stat);
int ocfs2_set_inode_size(struct ocfs2_journal_handle *handle,
struct inode *inode,
struct buffer_head *fe_bh,
u64 new_i_size);
#endif /* OCFS2_FILE_H */
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* heartbeat.c
*
* Register ourselves with the heartbaet service, keep our node maps
* up to date, and fire off recovery when needed.
*
* Copyright (C) 2002, 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#include <linux/fs.h>
#include <linux/types.h>
#include <linux/slab.h>
#include <linux/highmem.h>
#include <linux/kmod.h>
#include <cluster/heartbeat.h>
#include <cluster/nodemanager.h>
#include <dlm/dlmapi.h>
#define MLOG_MASK_PREFIX ML_SUPER
#include <cluster/masklog.h>
#include "ocfs2.h"
#include "alloc.h"
#include "heartbeat.h"
#include "inode.h"
#include "journal.h"
#include "vote.h"
#include "buffer_head_io.h"
#define OCFS2_HB_NODE_DOWN_PRI (0x0000002)
#define OCFS2_HB_NODE_UP_PRI OCFS2_HB_NODE_DOWN_PRI
static inline void __ocfs2_node_map_set_bit(struct ocfs2_node_map *map,
int bit);
static inline void __ocfs2_node_map_clear_bit(struct ocfs2_node_map *map,
int bit);
static inline int __ocfs2_node_map_is_empty(struct ocfs2_node_map *map);
static void __ocfs2_node_map_dup(struct ocfs2_node_map *target,
struct ocfs2_node_map *from);
static void __ocfs2_node_map_set(struct ocfs2_node_map *target,
struct ocfs2_node_map *from);
void ocfs2_init_node_maps(struct ocfs2_super *osb)
{
spin_lock_init(&osb->node_map_lock);
ocfs2_node_map_init(&osb->mounted_map);
ocfs2_node_map_init(&osb->recovery_map);
ocfs2_node_map_init(&osb->umount_map);
}
static void ocfs2_do_node_down(int node_num,
struct ocfs2_super *osb)
{
BUG_ON(osb->node_num == node_num);
mlog(0, "ocfs2: node down event for %d\n", node_num);
if (!osb->dlm) {
/*
* No DLM means we're not even ready to participate yet.
* We check the slots after the DLM comes up, so we will
* notice the node death then. We can safely ignore it
* here.
*/
return;
}
if (ocfs2_node_map_test_bit(osb, &osb->umount_map, node_num)) {
/* If a node is in the umount map, then we've been
* expecting him to go down and we know ahead of time
* that recovery is not necessary. */
ocfs2_node_map_clear_bit(osb, &osb->umount_map, node_num);
return;
}
ocfs2_recovery_thread(osb, node_num);
ocfs2_remove_node_from_vote_queues(osb, node_num);
}
static void ocfs2_hb_node_down_cb(struct o2nm_node *node,
int node_num,
void *data)
{
ocfs2_do_node_down(node_num, (struct ocfs2_super *) data);
}
/* Called from the dlm when it's about to evict a node. We may also
* get a heartbeat callback later. */
static void ocfs2_dlm_eviction_cb(int node_num,
void *data)
{
struct ocfs2_super *osb = (struct ocfs2_super *) data;
struct super_block *sb = osb->sb;
mlog(ML_NOTICE, "device (%u,%u): dlm has evicted node %d\n",
MAJOR(sb->s_dev), MINOR(sb->s_dev), node_num);
ocfs2_do_node_down(node_num, osb);
}
static void ocfs2_hb_node_up_cb(struct o2nm_node *node,
int node_num,
void *data)
{
struct ocfs2_super *osb = data;
BUG_ON(osb->node_num == node_num);
mlog(0, "node up event for %d\n", node_num);
ocfs2_node_map_clear_bit(osb, &osb->umount_map, node_num);
}
void ocfs2_setup_hb_callbacks(struct ocfs2_super *osb)
{
o2hb_setup_callback(&osb->osb_hb_down, O2HB_NODE_DOWN_CB,
ocfs2_hb_node_down_cb, osb,
OCFS2_HB_NODE_DOWN_PRI);
o2hb_setup_callback(&osb->osb_hb_up, O2HB_NODE_UP_CB,
ocfs2_hb_node_up_cb, osb, OCFS2_HB_NODE_UP_PRI);
/* Not exactly a heartbeat callback, but leads to essentially
* the same path so we set it up here. */
dlm_setup_eviction_cb(&osb->osb_eviction_cb,
ocfs2_dlm_eviction_cb,
osb);
}
/* Most functions here are just stubs for now... */
int ocfs2_register_hb_callbacks(struct ocfs2_super *osb)
{
int status;
status = o2hb_register_callback(&osb->osb_hb_down);
if (status < 0) {
mlog_errno(status);
goto bail;
}
status = o2hb_register_callback(&osb->osb_hb_up);
if (status < 0)
mlog_errno(status);
bail:
return status;
}
void ocfs2_clear_hb_callbacks(struct ocfs2_super *osb)
{
int status;
status = o2hb_unregister_callback(&osb->osb_hb_down);
if (status < 0)
mlog_errno(status);
status = o2hb_unregister_callback(&osb->osb_hb_up);
if (status < 0)
mlog_errno(status);
}
void ocfs2_stop_heartbeat(struct ocfs2_super *osb)
{
int ret;
char *argv[5], *envp[3];
if (!osb->uuid_str) {
/* This can happen if we don't get far enough in mount... */
mlog(0, "No UUID with which to stop heartbeat!\n\n");
return;
}
argv[0] = (char *)o2nm_get_hb_ctl_path();
argv[1] = "-K";
argv[2] = "-u";
argv[3] = osb->uuid_str;
argv[4] = NULL;
mlog(0, "Run: %s %s %s %s\n", argv[0], argv[1], argv[2], argv[3]);
/* minimal command environment taken from cpu_run_sbin_hotplug */
envp[0] = "HOME=/";
envp[1] = "PATH=/sbin:/bin:/usr/sbin:/usr/bin";
envp[2] = NULL;
ret = call_usermodehelper(argv[0], argv, envp, 1);
if (ret < 0)
mlog_errno(ret);
}
/* special case -1 for now
* TODO: should *really* make sure the calling func never passes -1!! */
void ocfs2_node_map_init(struct ocfs2_node_map *map)
{
map->num_nodes = OCFS2_NODE_MAP_MAX_NODES;
memset(map->map, 0, BITS_TO_LONGS(OCFS2_NODE_MAP_MAX_NODES) *
sizeof(unsigned long));
}
static inline void __ocfs2_node_map_set_bit(struct ocfs2_node_map *map,
int bit)
{
set_bit(bit, map->map);
}
void ocfs2_node_map_set_bit(struct ocfs2_super *osb,
struct ocfs2_node_map *map,
int bit)
{
if (bit==-1)
return;
BUG_ON(bit >= map->num_nodes);
spin_lock(&osb->node_map_lock);
__ocfs2_node_map_set_bit(map, bit);
spin_unlock(&osb->node_map_lock);
}
static inline void __ocfs2_node_map_clear_bit(struct ocfs2_node_map *map,
int bit)
{
clear_bit(bit, map->map);
}
void ocfs2_node_map_clear_bit(struct ocfs2_super *osb,
struct ocfs2_node_map *map,
int bit)
{
if (bit==-1)
return;
BUG_ON(bit >= map->num_nodes);
spin_lock(&osb->node_map_lock);
__ocfs2_node_map_clear_bit(map, bit);
spin_unlock(&osb->node_map_lock);
}
int ocfs2_node_map_test_bit(struct ocfs2_super *osb,
struct ocfs2_node_map *map,
int bit)
{
int ret;
if (bit >= map->num_nodes) {
mlog(ML_ERROR, "bit=%d map->num_nodes=%d\n", bit, map->num_nodes);
BUG();
}
spin_lock(&osb->node_map_lock);
ret = test_bit(bit, map->map);
spin_unlock(&osb->node_map_lock);
return ret;
}
static inline int __ocfs2_node_map_is_empty(struct ocfs2_node_map *map)
{
int bit;
bit = find_next_bit(map->map, map->num_nodes, 0);
if (bit < map->num_nodes)
return 0;
return 1;
}
int ocfs2_node_map_is_empty(struct ocfs2_super *osb,
struct ocfs2_node_map *map)
{
int ret;
BUG_ON(map->num_nodes == 0);
spin_lock(&osb->node_map_lock);
ret = __ocfs2_node_map_is_empty(map);
spin_unlock(&osb->node_map_lock);
return ret;
}
static void __ocfs2_node_map_dup(struct ocfs2_node_map *target,
struct ocfs2_node_map *from)
{
BUG_ON(from->num_nodes == 0);
ocfs2_node_map_init(target);
__ocfs2_node_map_set(target, from);
}
/* returns 1 if bit is the only bit set in target, 0 otherwise */
int ocfs2_node_map_is_only(struct ocfs2_super *osb,
struct ocfs2_node_map *target,
int bit)
{
struct ocfs2_node_map temp;
int ret;
spin_lock(&osb->node_map_lock);
__ocfs2_node_map_dup(&temp, target);
__ocfs2_node_map_clear_bit(&temp, bit);
ret = __ocfs2_node_map_is_empty(&temp);
spin_unlock(&osb->node_map_lock);
return ret;
}
static void __ocfs2_node_map_set(struct ocfs2_node_map *target,
struct ocfs2_node_map *from)
{
int num_longs, i;
BUG_ON(target->num_nodes != from->num_nodes);
BUG_ON(target->num_nodes == 0);
num_longs = BITS_TO_LONGS(target->num_nodes);
for (i = 0; i < num_longs; i++)
target->map[i] = from->map[i];
}
/* Returns whether the recovery bit was actually set - it may not be
* if a node is still marked as needing recovery */
int ocfs2_recovery_map_set(struct ocfs2_super *osb,
int num)
{
int set = 0;
spin_lock(&osb->node_map_lock);
__ocfs2_node_map_clear_bit(&osb->mounted_map, num);
if (!test_bit(num, osb->recovery_map.map)) {
__ocfs2_node_map_set_bit(&osb->recovery_map, num);
set = 1;
}
spin_unlock(&osb->node_map_lock);
return set;
}
void ocfs2_recovery_map_clear(struct ocfs2_super *osb,
int num)
{
ocfs2_node_map_clear_bit(osb, &osb->recovery_map, num);
}
int ocfs2_node_map_iterate(struct ocfs2_super *osb,
struct ocfs2_node_map *map,
int idx)
{
int i = idx;
idx = O2NM_INVALID_NODE_NUM;
spin_lock(&osb->node_map_lock);
if ((i != O2NM_INVALID_NODE_NUM) &&
(i >= 0) &&
(i < map->num_nodes)) {
while(i < map->num_nodes) {
if (test_bit(i, map->map)) {
idx = i;
break;
}
i++;
}
}
spin_unlock(&osb->node_map_lock);
return idx;
}
/* -*- mode: c; c-basic-offset: 8; -*-
* vim: noexpandtab sw=8 ts=8 sts=0:
*
* heartbeat.h
*
* Function prototypes
*
* Copyright (C) 2002, 2004 Oracle. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/
#ifndef OCFS2_HEARTBEAT_H
#define OCFS2_HEARTBEAT_H
void ocfs2_init_node_maps(struct ocfs2_super *osb);
void ocfs2_setup_hb_callbacks(struct ocfs2_super *osb);
int ocfs2_register_hb_callbacks(struct ocfs2_super *osb);
void ocfs2_clear_hb_callbacks(struct ocfs2_super *osb);
void ocfs2_stop_heartbeat(struct ocfs2_super *osb);
/* node map functions - used to keep track of mounted and in-recovery
* nodes. */
void ocfs2_node_map_init(struct ocfs2_node_map *map);
int ocfs2_node_map_is_empty(struct ocfs2_super *osb,
struct ocfs2_node_map *map);
void ocfs2_node_map_set_bit(struct ocfs2_super *osb,
struct ocfs2_node_map *map,
int bit);
void ocfs2_node_map_clear_bit(struct ocfs2_super *osb,
struct ocfs2_node_map *map,
int bit);
int ocfs2_node_map_test_bit(struct ocfs2_super *osb,
struct ocfs2_node_map *map,
int bit);
int ocfs2_node_map_iterate(struct ocfs2_super *osb,
struct ocfs2_node_map *map,
int idx);
static inline int ocfs2_node_map_first_set_bit(struct ocfs2_super *osb,
struct ocfs2_node_map *map)
{
return ocfs2_node_map_iterate(osb, map, 0);
}
int ocfs2_recovery_map_set(struct ocfs2_super *osb,
int num);
void ocfs2_recovery_map_clear(struct ocfs2_super *osb,
int num);
/* returns 1 if bit is the only bit set in target, 0 otherwise */
int ocfs2_node_map_is_only(struct ocfs2_super *osb,
struct ocfs2_node_map *target,
int bit);
#endif /* OCFS2_HEARTBEAT_H */
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment