git: hammer2 - Rewrite internal chain algorithms

Matthew Dillon dillon at
Fri Sep 27 08:16:24 PDT 2013

commit 1897c66ea31ca9d7b3fb03ba08889054bf36ef04
Author: Matthew Dillon <dillon at>
Date:   Fri Sep 27 07:50:53 2013 -0700

    hammer2 - Rewrite internal chain algorithms
    * These are big changes designed to make it easier for the flusher to
      order media writes and as a requirement for upcoming 'copies' and other
      work.  H2 will be somewhat unstable for a short while.
    * Replace the single-layer RBTREE per topological level with a stack of
      RBTREEs per topological level.  This allows conflicts to be stacked
      vertically roughly by flush synchronization group instead of trying
      to arrange everything in a single 2D RBTREE.  Conflicts arise because
      the in-memory tree can represent multiple synchronization points with
      live and deleted-flagged chains.
      The RBTREE compare function is actually simplified by this change, but
      the code must contend with a list of RBTREEs (built as conflicts arise)
      instead of just one.
    * The block table for any given topological layer is now ordered, and this
      will now be reflected and required on-media as well.  That is, it is
      sorted.  It is still effectively set-associative in that the dynamic
      radix tree model is still being used, but elements must now be sorted.
      This improves I/O but more importantly it will allow H2 to use cursors
      to track multiple mounts used in 'copies' and later on in clustering.
      Running multiple media topologies concurrently was impossible when the
      block sets were unordered.
      This makes implementing a merged search (blocktable and in-memory chains)
    * Remove chain->index tracking.  We used to track the block table index
      that each in-memory chain intended to use which worked 'ok' when the
      block table (such as in an inode or an indirect block) was unordered,
      but requires excessive (and unnecessary) coding when the block table
      is ordered.
      Instead of tracking indices we track the total number of 'live' (not
      deleted) elements that will end up in the block table in hammer2_chain_core
      and use that to determine when indirect blocks need to be created or not.
      Block table entries are now only assigned by the flusher itself when it
      determines that an entry must be added or deleted.
    * The flusher now operates more deterministically by scanning the stacked
      RBTREEs backwards (earliest to most recent).  The stacks are not specifically
      broken down by flush group (instead stacking only occurs if there is a
      key range conflict), so the flusher still checks the modify_tid and
      delete_tid for each chain it scans, but there should no longer be any
      chance of media updates being misordered.
      Because block table indices are now assigned by the flusher, each RBTREE
      stack is scanned three times instead of twice.  One for the recursion down
      and two for the finalization on the way back up which updates the block
      table.  The block table scan was broken into two passes to handle deletions
      in the first pass and insertions/updates in the second pass which allows
      the block table to be updated on the fly without having to assign specific
      array indices to chains.
    * Replace the first_parent/next_parent linkages used to deal with multi-homed
      chains (chains with multiple parents which can arise due to renames,
      moves into or out of indirect blocks ,etc).  Use a queue instead.
      Remove the additional ref count used in the old linkage model.  We must
      still ensure that the final 'live' chain does not get removed when its
      refs drops to 0.

Summary of changes:
 sys/vfs/hammer2/hammer2.h         |  125 ++-
 sys/vfs/hammer2/hammer2_chain.c   | 1943 +++++++++++++++++++++----------------
 sys/vfs/hammer2/hammer2_flush.c   |  186 ++--
 sys/vfs/hammer2/hammer2_freemap.c |   10 +-
 sys/vfs/hammer2/hammer2_inode.c   |   69 +-
 sys/vfs/hammer2/hammer2_ioctl.c   |   39 +-
 sys/vfs/hammer2/hammer2_vfsops.c  |   61 +-
 sys/vfs/hammer2/hammer2_vnops.c   |   52 +-
 8 files changed, 1455 insertions(+), 1030 deletions(-)

DragonFly BSD source repository

More information about the Commits mailing list