git: hammer2 - Try to reduce no-activity stalls during complex flushes
Matthew Dillon
dillon at crater.dragonflybsd.org
Thu Oct 19 23:08:49 PDT 2023
commit caf661fcf8eadefbd5a83af42a3fcb41ac93c805
Author: Matthew Dillon <dillon at apollo.backplane.com>
Date: Thu Oct 19 23:00:44 2023 -0700
hammer2 - Try to reduce no-activity stalls during complex flushes
* Hammer2 keeps track of directory dependencies to maintain
meta-data consistency at flush boundaries. This can cause
issues when heavy simultaneous frontend activity blows out
dirty buffer limits and stalls in 'h2memw'.
These frontend stalls are not supposed to be holding vnodes,
but there do appear to be cases where the backend flusher is
not able to immediately acquire some vnode locks during the flush.
This causes the backend flush to skip that vnode and also
introduces some static delays (rather than becoming cpu-bound).
The backend flusher ultimately restarts the flush and tries again.
Situations can develop where the backend also stalls in a
sequence of 'h2syndel' tsleep delays, resulting in zero
cpu activity (frontend is stalled in 'h2memw'), and zero
disk activity (backend is also stalled) for a short period of
time.
* This problem does not lead to permanent deadlocks, however.
H2 is always able to recover.
* Rearrange an 'h2syndel' tsleep() call in the backend flusher.
Instead of calling tsleep() once per vnode that fails to lock, we
now finish flushing the remaining vnodes, then try to wake up
processes blocked in 'h2memw' on the frontend, and THEN sleep
for a few ticks before restarting.
This is an attempt to close the gap causing these periods of
no-activity.
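The reordering described above can be illustrated with a small userland
model. This is only a sketch and is not taken from hammer2_vfsops.c: the
struct, the function names, and the boolean "lockable" array are all
hypothetical, and sleeps and wakeups are merely counted rather than
performed. It also assumes the single sleep happens only when at least one
vnode was skipped and a restart is needed.

```c
/*
 * Userland model only. Every name here is hypothetical and none of it
 * is copied from hammer2_vfsops.c. Sleeps and wakeups are counted, not
 * executed.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sim_stats {
    int flushed;                    /* vnodes flushed during the pass */
    int skipped;                    /* vnodes whose lock attempt failed */
    int sleeps;                     /* simulated 'h2syndel' sleeps */
    int wakeups;                    /* simulated wakeups of 'h2memw' waiters */
    int flushed_before_first_sleep; /* progress made before any delay */
};

/*
 * Old ordering: delay immediately each time a vnode lock fails.
 * Wakeup behaviour of the old code is not modeled here.
 */
struct sim_stats
pass_sleep_per_failure(const bool *lockable, size_t n)
{
    struct sim_stats st = {0};
    bool slept = false;

    for (size_t i = 0; i < n; i++) {
        if (!lockable[i]) {
            st.skipped++;
            st.sleeps++;            /* stand-in for a per-vnode tsleep() */
            slept = true;
            continue;
        }
        st.flushed++;
        if (!slept)
            st.flushed_before_first_sleep++;
    }
    return st;
}

/*
 * New ordering: flush every vnode that can be locked, then wake the
 * frontend, then sleep once before the flush is restarted.
 */
struct sim_stats
pass_sleep_once(const bool *lockable, size_t n)
{
    struct sim_stats st = {0};

    for (size_t i = 0; i < n; i++) {
        if (!lockable[i]) {
            st.skipped++;           /* remember the failure, keep going */
            continue;
        }
        st.flushed++;
    }
    st.flushed_before_first_sleep = st.flushed;

    if (st.skipped > 0) {           /* assumption: only when a restart is needed */
        st.wakeups++;               /* stand-in for waking 'h2memw' waiters */
        st.sleeps++;                /* single stand-in for the tsleep() */
    }
    return st;
}
```

In this model both orderings flush the same vnodes in a pass. The difference
is that the second one does all of its flushing and issues the frontend
wakeup before any delay is taken, and takes one delay rather than one per
skipped vnode.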
Summary of changes:
sys/vfs/hammer2/hammer2_vfsops.c | 33 +++++++++++++++++++++++----------
1 file changed, 23 insertions(+), 10 deletions(-)
http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/caf661fcf8eadefbd5a83af42a3fcb41ac93c805
--
DragonFly BSD source repository