git: hammer2 - Fix deadlock and improve performance

Matthew Dillon dillon at crater.dragonflybsd.org
Thu Jun 10 20:31:19 PDT 2021


commit d38955556cc42394a330f93ed9dfd1906a476776
Author: Matthew Dillon <dillon at apollo.backplane.com>
Date:   Thu Jun 10 20:18:16 2021 -0700

    hammer2 - Fix deadlock and improve performance
    
    * hammer2_xop_start*() is supposed to split the worker threads
      into two groups, one for strategy XOPs and the other for non-strategy
      XOPs.
    
      However, due to a bug in the code it was *NOT* doing this, possibly
      leading to VOP/STRATEGY deadlocks under heavy loads.  That is, an
      XOP worker thread could receive a strategy XOP while blocked on a VOP
      XOP and deadlock.
    
    * Possibly related to bug reports of systems locking up during
      the overnight periodic run (which does a lot of find / scans).  This
      may fix these issues.
    
    * Also refactor the thread-selection algorithm.  Filesystems with
      nclusters == 1 (which is all of them at the moment) do not have
      to serialize XOPs based on the inode (sending the XOP to another
      thread, most likely), and instead they can just push the XOP to a
      worker thread on the current cpu.
    
      This significantly reduces IPI signaling and reduces I/O latency,
      but at the cost of reduced streaming decompression performance
      since the decompression from a single user thread is not distributed
      across multiple CPUs.
    
      I might be able to improve this later by explicitly sending read-ahead
      strategy XOPs to other CPUs, but for now I just want to create a sane
      world.
    
    * In addition, change the number of worker threads per cpu per H2 mount
      to at least 4, half of which are dedicated to strategy XOPs and the
      other half to VOP XOPs.
    
      This allows strategy XOPs to read-ahead and stream the actual block
      I/O a whole lot better, and also allows multiple VOP XOPs issued on the
      same CPU (e.g. from several user threads that happened to be scheduled
      to the same cpu) to issue I/O and block without serializing.
    
      Finally, note that even though the XOP messaging is doing thread switches,
      it is happening between threads on the same CPU, which is actually pretty
      quick, typically no more than 2 us or so for the round-trip.
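
      The selection scheme described in the bullets above can be sketched
      roughly as follows.  This is an illustration only, not the code from
      this commit: pick_worker(), the xop_kind enum, the constants, the
      rotating 'seq' counter and the inode-number hash used for the
      nclusters > 1 case are all hypothetical names and assumptions.  It
      assumes each cpu owns a block of 4 workers, the first half for VOP
      XOPs and the second half for strategy XOPs.

```c
/*
 * Hypothetical sketch of per-cpu worker selection with separate
 * VOP and strategy groups.  All names and constants are assumptions
 * made for illustration; they are not taken from the hammer2 sources.
 */
#define THREADS_PER_CPU	4			/* assumed minimum per cpu */
#define GROUP_SIZE	(THREADS_PER_CPU / 2)	/* half VOP, half strategy */

enum xop_kind { XOP_VOP, XOP_STRATEGY };

/*
 * Return the index of the worker thread that should receive an XOP.
 *
 * ncpus     - number of cpus, each owning THREADS_PER_CPU workers
 * nclusters - 1 means no per-inode serialization is needed
 * cur_cpu   - cpu the caller is currently running on
 * inum      - inode number, only used when nclusters != 1
 * kind      - strategy XOPs and VOP XOPs never share a worker
 * seq       - rotating counter used to spread XOPs within a group
 */
static unsigned
pick_worker(unsigned ncpus, unsigned nclusters, unsigned cur_cpu,
	    unsigned long inum, enum xop_kind kind, unsigned seq)
{
	unsigned cpu;
	unsigned group;

	/*
	 * With a single cluster, stay on the current cpu to avoid IPIs.
	 * Otherwise serialize on the inode by hashing it to a fixed cpu
	 * (a simple modulo here, purely as a placeholder).
	 */
	if (nclusters == 1)
		cpu = cur_cpu;
	else
		cpu = (unsigned)(inum % ncpus);

	/* The second half of each cpu's block is reserved for strategy. */
	group = (kind == XOP_STRATEGY) ? GROUP_SIZE : 0;

	return (cpu * THREADS_PER_CPU + group + seq % GROUP_SIZE);
}
```

      Under these assumptions a VOP XOP issued on cpu 3 lands on worker 12
      or 13 and a strategy XOP on worker 14 or 15, so a worker blocked on
      a VOP XOP can never be handed a strategy XOP.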

Summary of changes:
 sys/vfs/hammer2/hammer2.h        | 11 ++++-
 sys/vfs/hammer2/hammer2_admin.c  | 91 +++++++++++++++++++---------------------
 sys/vfs/hammer2/hammer2_inode.c  |  5 +++
 sys/vfs/hammer2/hammer2_vfsops.c | 41 ++++++++++++------
 4 files changed, 84 insertions(+), 64 deletions(-)

http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/d38955556cc42394a330f93ed9dfd1906a476776


-- 
DragonFly BSD source repository

