git: DragonFly_RELEASE_6_0 hammer2 - Fix deadlock and improve performance
Matthew Dillon
dillon at crater.dragonflybsd.org
Mon Jun 14 14:14:14 PDT 2021
commit f88a956bf7a1932a3d0ba68bfe1093853b2305d6
Author: Matthew Dillon <dillon at apollo.backplane.com>
Date: Thu Jun 10 20:18:16 2021 -0700
hammer2 - Fix deadlock and improve performance
* hammer2_xop_start*() is supposed to split the worker threads
into two groups, one for strategy XOPs and the other for non-strategy
XOPs.
However, due to a bug in the code it was *NOT* doing this, possibly
leading to VOP/STRATEGY deadlocks under heavy loads. That is, an
XOP worker thread could receive a strategy XOP while blocked on a VOP
XOP and deadlock.
* Possibly related to bug reports of systems locking up during
the overnight periodic run (which does a lot of find / scans). This
may fix these issues.
* Also refactor the thread-selection algorithm. Filesystems with
nclusters == 1 (which is all of them at the moment) do not have
to serialize XOPs based on the inode (sending the XOP to another
thread, most likely), and instead they can just push the XOP to a
worker thread on the current cpu.
This significantly reduces IPI signaling and reduces I/O latency,
but at the cost of reduced streaming decompression performance
since the decompression from a single user thread is not distributed
across multiple CPUs.
I might be able to improve this later by explicitly sending read-ahead
strategy XOPs to other CPUs, but for now I just want to create a sane
world.
* In addition, change the number of worker threads per cpu per H2 mount
to at least 4, half of which are dedicated to strategy XOPs and the
other half to VOP XOPs.
This allows strategy XOPs to read-ahead and stream the actual block
I/O a whole lot better, and also allows multiple VOP XOPs issued on the
same CPU (e.g. from several user threads that happened to be scheduled
to the same cpu) to issue I/O and block without serializing.
Finally, note that even though the XOP messaging is doing thread switches,
they happen between threads on the same CPU, which is actually pretty
quick, typically no more than 2uS or so for the round-trip.
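The worker split described above can be illustrated with a minimal sketch.
This is not the hammer2 code. The names pick_worker() and WORKERS_PER_CPU,
the choice of the upper half of each cpu's slots for strategy XOPs, and the
round-robin "seq" counter are all assumptions made for illustration. The
only points taken from the commit message are that each cpu gets at least 4
workers per mount, that half serve strategy XOPs and half serve VOP XOPs, and
that an XOP is pushed to a worker on the current cpu.

```c
#include <assert.h>

/* Assumed value: the commit message says "at least 4" per cpu per mount. */
#define WORKERS_PER_CPU 4

/*
 * Hypothetical helper: return the worker slot for an XOP issued on `cpu`.
 * Slots [base, base + half) serve VOP XOPs, slots [base + half,
 * base + 2*half) serve strategy XOPs, so the two groups never share a
 * thread. `seq` is an assumed round-robin counter within the group.
 */
static int
pick_worker(int cpu, int is_strategy, unsigned int seq)
{
	int half = WORKERS_PER_CPU / 2;
	int base = cpu * WORKERS_PER_CPU;	/* stay on the issuing cpu */
	int group = is_strategy ? half : 0;	/* select the dedicated half */

	assert(cpu >= 0);
	return base + group + (int)(seq % (unsigned int)half);
}
```

With these assumptions, a VOP XOP on cpu 1 lands in slot 4 or 5 and a
strategy XOP on cpu 1 lands in slot 6 or 7, so a worker blocked on a VOP XOP
can never be the one handed a strategy XOP.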
Summary of changes:
sys/vfs/hammer2/hammer2.h | 11 ++++-
sys/vfs/hammer2/hammer2_admin.c | 91 +++++++++++++++++++---------------------
sys/vfs/hammer2/hammer2_inode.c | 5 +++
sys/vfs/hammer2/hammer2_vfsops.c | 41 ++++++++++++------
4 files changed, 84 insertions(+), 64 deletions(-)
http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/f88a956bf7a1932a3d0ba68bfe1093853b2305d6
--
DragonFly BSD source repository