Commits · c27069e6cfa242a3b84eb3442934c6fe51ee9066 · linux / linux-davinci-2.6.23

26 Jun, 2006 40 commits

ocfs2: continue recovery when a dead node is encountered · c27069e6

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

c27069e6

ocfs2: remove unneccesary spin_unlock() in dlm_remaster_locks() · 67a18741

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

67a18741

ocfs2: dlm_remaster_locks() should never exit without completing · 6a413211

Kurt Hackel authored May 01, 2006

We cannot restart recovery. Once we begin to recover a node, keep the state
of the recovery intact and follow through, regardless of any other node
deaths that may occur.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

6a413211

ocfs2: special case recovery lock in dlmlock_remote() · c8df412e

Kurt Hackel authored May 01, 2006

If the previous master of the recovery lock dies, let calc_usage take it
down completely and let the caller completely redo the dlmlock() call.
Otherwise, there will never be an opportunity to re-master the lockres and
recovery won't be able to progress.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

c8df412e

ocfs2: pending mastery asserts and migrations should block each other · 36407488

Kurt Hackel authored May 01, 2006

Use the existing structure for blocking migrations when ASTs are pending to
achieve the same result. If we can catch the assert before it goes on the
wire, just cancel it and let the migration continue.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

36407488

ocfs2: temporarily disable automatic lock migration · c87a9ae7

Kurt Hackel authored May 01, 2006

Now we never change the owner of a lock resource until unmount or node
death. This will be re-enabled once some issues in the algorithm used have
been resolved.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

c87a9ae7

ocfs2: do not unconditionally purge the lockres in dlmlock_remote() · 2abaf97e

Kurt Hackel authored May 01, 2006

In dlmlock_remote(), do not call purge_lockres until the lock resource
actually changes. Otherwise, the mastery info on the lockres will go away
underneath the caller.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

2abaf97e

ocfs2: increase backoff before waiting for recovery · aa087b84

Kurt Hackel authored May 01, 2006

When mastering non-recovery lock resources, additional time was frequently
needed to allow the disk heartbeat to catch up with the network timeout. The
recovery lock resource is time critical and avoids this path.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

aa087b84

ocfs2: have dlm_pre_master_reco_lockres() ignore dead nodes · f42a100b

Kurt Hackel authored May 01, 2006

Recovery will spin in dlm_pre_master_reco_lockres if we do not ignore
timed-out network responses from dead nodes.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

f42a100b

ocfs2: give the dlm dirty list a reference on the lockres · 6ff06a93

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

6ff06a93

ocfs2: teach dlm_restart_lock_mastery() to wait on recovery · e7e69eb3

Kurt Hackel authored May 01, 2006

Change behavior of dlm_restart_lock_mastery() when a node goes down.  Dump
all responses that have been collected and start over.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

e7e69eb3

ocfs2: gracefully handle stale create_lock messages. · e4eb0368

Kurt Hackel authored May 01, 2006

This is an error on the sending side, so gracefully error out on the
receiving end.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

e4eb0368

ocfs2: update lvb immediately during recovery · ccd8b1f9

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ccd8b1f9

ocfs2: do not send master requests to localhost · 588e0090

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

588e0090

ocfs2: purge lockres' sooner · 8b219809

Kurt Hackel authored May 01, 2006

Immediately purge a lockres that the local node is not the master of.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

8b219809

ocfs2: dump mismatching migrated lvbs before BUG() · 343e26a4

Kurt Hackel authored May 01, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

343e26a4

ocfs2: make dlm recovery finalization 2 stage · 466d1a45

Kurt Hackel authored May 01, 2006

Makes it easier for the recovery process to deal with node death.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

466d1a45

ocfs2: dlm recovery / lockres reference count fix · 69d72b06

Kurt Hackel authored May 01, 2006

Take a reference on lockres structures while they are on the recovery list.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

69d72b06

ocfs2: better error handling during assert master message · a9ee4c8a

Kurt Hackel authored Apr 27, 2006

Handle errors during lock assert master by killing either self or the other node.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

a9ee4c8a

ocfs2: dump lockres info before we BUG() on a bad reference · a7f90d83

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

a7f90d83

ocfs2: do LVB puts in place · c0a8520c

Mark Fasheh authored Apr 27, 2006

Don't wait until the AST will be fired to do the LVB copy into the lock
resource.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

c0a8520c

ocfs2: mle ref count debugging · aa852354

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

aa852354

ocfs2: allow for an assert message during lock mastery · dc2ed195

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

dc2ed195

ocfs2: take mle reference during migration · 2d1a868c

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

2d1a868c

ocfs2: properly initialize the mle structure · 41b8c8a1

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

41b8c8a1

ocfs2: detach mle from heartbeat events · da01ad05

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

da01ad05

ocfs2: mle ref counting fixes · a2bf0477

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

a2bf0477

ocfs2: better mle debugging · 95883719

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

95883719

ocfs2: clean up recovery related messages · d6dea6e9

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

d6dea6e9

ocfs2: handle network errors during recovery · 29c0fa0f

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

29c0fa0f

ocfs2: only recover one dead node at a time · c3187ce5

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

c3187ce5

ocfs2: Better tracking for recovery state changes · ab27eb6f

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ab27eb6f

ocfs2: Fix empty lvb check · 8bc674cb

Kurt Hackel authored Apr 27, 2006

The check for an empty lvb should check the entire buffer, not just the first
byte.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

8bc674cb
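
The difference described in 8bc674cb can be illustrated with a minimal sketch. This is not the code from the commit: the function names and the `EXAMPLE_LVB_LEN` value are assumptions chosen for illustration. It only contrasts a first-byte check with a whole-buffer check.

```c
#include <stddef.h>

/* Assumed LVB size, for illustration only; not taken from the commit. */
#define EXAMPLE_LVB_LEN 64

/* Hypothetical buggy variant: inspects only the first byte. */
static int lvb_is_empty_first_byte(const char *lvb)
{
	return lvb[0] == 0;
}

/* Hypothetical fixed variant: empty only if every byte is zero. */
static int lvb_is_empty(const char *lvb)
{
	size_t i;

	for (i = 0; i < EXAMPLE_LVB_LEN; i++)
		if (lvb[i])
			return 0;
	return 1;
}
```

A buffer whose first byte is zero but which carries data later on is reported as empty by the first variant and as non-empty by the second.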

ocfs2: fix inverted logic in dlm_is_node_dead · aba9aac7

Kurt Hackel authored Apr 27, 2006

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

aba9aac7

ocfs2: recheck lockres master before sending an unlock request. · 2580a580

Kurt Hackel authored Apr 27, 2006

Recovery may have happened and it may now be mastered locally.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

2580a580

ocfs2: add a small delay after a failed migration · 8d79d088

Kurt Hackel authored Apr 27, 2006

Otherwise we risk starving other threads.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

8d79d088

ocfs2: silence a compile warning in dlm_alloc_pagevec() · 685f1adb

Mark Fasheh authored Mar 23, 2006

Reported by Andrew Morton.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

685f1adb

[PATCH] ocfs2: Alloc at least a page for the DLM hash · c8f33b6e

Joel Becker authored Mar 16, 2006

The OCFS2 DLM allocates a number of pages for a hash to look up locks.
There was a bug where a PAGE_SIZE bigger than the hash size (e.g., 64K
pages) would result in zero pages allocated.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

c8f33b6e
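
The arithmetic behind c8f33b6e can be sketched as follows. The function names and the 16K hash size used in the usage note are assumptions for illustration, and the commit may enforce the minimum differently. The point is only that truncating integer division yields zero when the page is larger than the hash.

```c
/* Hypothetical naive count: truncating division gives 0 when
 * page_size is larger than hash_size. */
static unsigned long hash_pages_naive(unsigned long hash_size,
				      unsigned long page_size)
{
	return hash_size / page_size;
}

/* Hypothetical fixed count: round up, and never return fewer than one page. */
static unsigned long hash_pages_at_least_one(unsigned long hash_size,
					     unsigned long page_size)
{
	unsigned long pages = (hash_size + page_size - 1) / page_size;

	return pages ? pages : 1;
}
```

With an assumed 16384-byte hash and 65536-byte pages, the naive version returns 0 and the second version returns 1. With 4096-byte pages both return 4.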

ocfs2: allocate lockres hash pages in an array · 03d864c0

Daniel Phillips authored Mar 10, 2006

This allows us to have a hash table greater than a single page which greatly
improves dlm performance on some tests.
Signed-off-by: Daniel Phillips <phillips@google.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

03d864c0
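
One way to picture a hash table spread over an array of pages, as 03d864c0 describes, is to map a bucket number to a page index and a slot within that page. The sketch below is an assumption about the general technique, not the code from the commit. `struct bucket_pos` and `locate_bucket()` are hypothetical names, and it assumes `bucket_size` is non-zero and no larger than `page_size`.

```c
#include <stddef.h>

/* Hypothetical position of a bucket inside a paged hash table. */
struct bucket_pos {
	size_t page;	/* index into the array of pages */
	size_t slot;	/* bucket index within that page */
};

/* Map a flat bucket number to (page, slot), given how many
 * buckets of bucket_size bytes fit into one page. */
static struct bucket_pos locate_bucket(size_t bucket, size_t page_size,
				       size_t bucket_size)
{
	size_t per_page = page_size / bucket_size;
	struct bucket_pos pos = { bucket / per_page, bucket % per_page };

	return pos;
}
```

Because each page is allocated separately, the table can grow beyond one page without needing a single large contiguous allocation.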

ocfs2: inline dlm_lockres_get() · 95c4f581

Mark Fasheh authored Mar 10, 2006

It's called on every lookup so this might help performance a bit.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

95c4f581