git: hammer2 - Stabilization
Matthew Dillon
dillon at crater.dragonflybsd.org
Sat Nov 2 00:23:43 PDT 2013
commit 925e4ad1f897f0b8850fca83169300322133902e
Author: Matthew Dillon <dillon at apollo.backplane.com>
Date: Sat Nov 2 00:06:57 2013 -0700
hammer2 - Stabilization
* Fix heavy cpu use in flush due to a blown recursion which can run down
the same chain many times due to the aliasing of hammer2_chain_core
structures.
The basic problem is that there can be H2 operations running concurrently
with a flush that are not part of the flush. These operations have a
higher transaction id. When situated deep in the tree, they can cause
the flush to repeatedly traverse large portions of the tree that it had
already checked, because the TID recorded by the flush is lower than the
update_tid set by the concurrent operations.
* Fix a multitude of flush / concurrent-operations races. The worst of the
lot is related to the situation where a concurrent operation does a
delete-duplicate on a chain containing a block table (which can include
an inode chain) which the flush needs to update. This results in TWO
block tables needing updating relative to different synchronization
points. Essentially, one of the chains is strictly temporary for flush
purposes while the other is the 'real' chain.
For example, if the concurrent operation is adding or deleting elements
from a block table the flush may have to add/delete DIFFERENT elements
for its own view. This requires two different versions of the block table
(one being strictly temporary).
Improper updates of chain->bref.mirror_tid caused the flush to get
confused and assert on the block table not containing the expected data.
* Fix more issues with concurrent operations during a flush. If a concurrent
operation deletes a chain and the flush needs to fork a 'live' version
of the chain, the flush's version will have a lower transaction id and
must be properly ordered in hammer2_chain_core->ownerq. It was not
being ordered properly.
* Flushes are recursive and to improve concurrency the flush temporarily
unlocks the old parent when diving under a child. This can result in a
race where, due to hammer2_chain_core aliasing, the recursion can wrap
around back to the parent.
Detect the case after re-locking the parent on the way back up the tree
and do the right thing.
* Fix handling of the flush block table rollup. Consolidate the call to
modify the parent (so we can adjust the blockrefs after flushing the
children) to a single point.
* Improve flush performance. If a parent is deferred at a higher level
and then encountered again via a shallower path, we now leave it deferred
and do not try to execute it in the shallower path even though the stack
depth is ok, as it will likely become deferred at a lower level anyway.
Check a deleted-chain case early before we recurse. A deleted chain
which is flagged DUPLICATED does not have to recurse as the sub-path
is reachable via some other parent. This significantly improves
performance because there are often a ton of chains in-memory marked
DELETED.
This results in more efficient deferrals.
* Fix adjustments of modify_tid and delete_tid in delete-duplicate
operations, clean up handling of CHAIN_INITIAL, properly transfer
flags in delete-duplicate.
* Fix some gratuitous wakeups in the transaction API.
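The ownerq ordering item above can be illustrated with a minimal sketch. This is not the hammer2 code: struct toy_chain, its tid field, toy_ownerq_insert() and the ascending sort direction are all assumptions made for illustration, since the commit message does not state the actual field names or ordering. It only shows how a chain carrying a lower transaction id can be placed ahead of later entries instead of being appended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a chain; not the real hammer2_chain. */
struct toy_chain {
	uint64_t tid;			/* transaction id (assumed field) */
	TAILQ_ENTRY(toy_chain) entry;
};

TAILQ_HEAD(toy_ownerq, toy_chain);

/*
 * Insert 'chain' so the queue stays in ascending tid order (an assumed
 * ordering).  A chain forked by the flush has a lower tid than the
 * concurrent operation's chain, so it must land in front of it rather
 * than at the tail.
 */
static void
toy_ownerq_insert(struct toy_ownerq *q, struct toy_chain *chain)
{
	struct toy_chain *scan;

	assert(q != NULL && chain != NULL);

	TAILQ_FOREACH(scan, q, entry) {
		if (scan->tid > chain->tid) {
			TAILQ_INSERT_BEFORE(scan, chain, entry);
			return;
		}
	}
	/* No entry with a higher tid was found: append. */
	TAILQ_INSERT_TAIL(q, chain, entry);
}
```

Inserting chains with tids 10, 30 and then 20 leaves the queue ordered 10, 20, 30 under these assumptions.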
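The flush performance item above describes two early exits in the recursion. The following is a rough sketch of that control flow on a toy tree, not the actual hammer2_flush.c logic. The toy_* names, the flag values, the fixed-size deferral array and the depth limit of 3 are all hypothetical, and the block table rollup is omitted entirely.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAXDEPTH	3	/* assumed recursion limit */
#define TOY_MAXKIDS	4
#define TOY_MAXDEFER	16

/* Hypothetical flags; names and values are illustrative only. */
#define TOY_DELETED	0x1
#define TOY_DUPLICATED	0x2
#define TOY_DEFERRED	0x4
#define TOY_FLUSHED	0x8

struct toy_node {
	unsigned flags;
	int nkids;
	struct toy_node *kids[TOY_MAXKIDS];
};

struct toy_flush {
	struct toy_node *deferred[TOY_MAXDEFER];
	int ndeferred;
	int visits;		/* nodes actually flushed */
};

static void
toy_flush_node(struct toy_flush *info, struct toy_node *node, int depth)
{
	const unsigned deldup = TOY_DELETED | TOY_DUPLICATED;
	int i;

	/*
	 * Early check: a deleted node flagged DUPLICATED is reachable
	 * via some other parent, so do not recurse under it.
	 */
	if ((node->flags & deldup) == deldup)
		return;

	/* Already deferred on a deeper path: leave it deferred. */
	if (node->flags & (TOY_DEFERRED | TOY_FLUSHED))
		return;

	/* Too deep: queue the node and handle it later. */
	if (depth >= TOY_MAXDEPTH) {
		assert(info->ndeferred < TOY_MAXDEFER);
		node->flags |= TOY_DEFERRED;
		info->deferred[info->ndeferred++] = node;
		return;
	}

	info->visits++;
	for (i = 0; i < node->nkids; i++)
		toy_flush_node(info, node->kids[i], depth + 1);
	node->flags |= TOY_FLUSHED;
}

static void
toy_flush(struct toy_flush *info, struct toy_node *root)
{
	int i;

	toy_flush_node(info, root, 0);

	/* Run deferrals from a fresh depth; the list may grow as we go. */
	for (i = 0; i < info->ndeferred; i++) {
		struct toy_node *node = info->deferred[i];

		node->flags &= ~TOY_DEFERRED;
		toy_flush_node(info, node, 0);
	}
}
```

In this sketch a node that is deferred at depth 3 and then seen again at depth 1 through another parent is skipped on the shallow path and flushed exactly once from the deferral list.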
Summary of changes:
sys/vfs/hammer2/hammer2.h | 9 +-
sys/vfs/hammer2/hammer2_chain.c | 195 ++++++++++++-----
sys/vfs/hammer2/hammer2_flush.c | 437 +++++++++++++++++++++++++--------------
sys/vfs/hammer2/hammer2_vfsops.c | 22 +-
sys/vfs/hammer2/hammer2_vnops.c | 14 +-
5 files changed, 454 insertions(+), 223 deletions(-)
http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/925e4ad1f897f0b8850fca83169300322133902e
--
DragonFly BSD source repository