[DragonFlyBSD - Bug #3408] (New) HAMMER2 syncer pipeline wedges under rapid newfs+cpdup loop
afranke
bugtracker-admin@leaf.dragonflybsd.org
Wed May 6 06:57:21 PDT 2026
Issue #3408 has been reported by afranke.
----------------------------------------
Bug #3408: HAMMER2 syncer pipeline wedges under rapid newfs+cpdup loop
http://bugs.dragonflybsd.org/issues/3408
* Author: afranke
* Status: New
* Priority: Normal
* Target version: 6.6
* Start date: 2026-05-06
----------------------------------------
(authored together with Claude)
h2. Summary
A loop that rapidly @newfs_hammer2@'s a fresh PFS, mounts it,
cpdup's the running root into it, unmounts, and repeats — with no
idle time between iterations — wedges the HAMMER2 syncer pipeline
within ~3 iterations.
The wedge is *not a panic*. The system stays alive, sshd works,
and unrelated processes are unaffected. cpdup is stuck in @flstik@
(the generic kernel dirty-buffer wait at @sys/kern/vfs_bio.c:484@)
and the HAMMER2 syncer threads are in @h2twait@ / @h2syndel@.
Nothing makes forward progress; recovery requires a power-cycle.
h2. Reproducer
Tested on DragonFly master at commit @4f37521524@ with the
virtio-modern PCI series applied. The call chain is HAMMER2/VFS-only,
so this is likely reproducible on stock master.
<pre><code class="sh">
mkdir -p /mnt/scratch
truncate -s 4G /var/scratch.img
vnconfig vn0 /var/scratch.img
while :; do
    umount /mnt/scratch 2>/dev/null
    sync
    newfs_hammer2 -L ROOT /dev/vn0
    mount -t hammer2 /dev/vn0@ROOT /mnt/scratch
    cpdup -i0 -x / /mnt/scratch/
done
</code></pre>
The loop wedged in iteration 3 of our test (~2 minutes wall-clock,
~1.7 GB of cpdup'd content). A stress harness of the same shape,
with logging added, is attached as @hammer2-flush-wedge-stress.sh@.
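Roughly, the harness has this shape (a sketch rather than the
attached script verbatim; the log path and the 120-second watchdog
heuristic are illustrative additions):
<pre><code class="sh">
#!/bin/sh
# Sketch of the attached harness's shape. The log path and the
# 120 s watchdog heuristic are illustrative additions.
LOG=/var/log/h2-stress.log
i=0
while :; do
    i=$((i + 1))
    echo "=== iteration $i: $(date)" >> "$LOG"
    umount /mnt/scratch 2>/dev/null
    sync
    newfs_hammer2 -L ROOT /dev/vn0 >> "$LOG" 2>&1
    mount -t hammer2 /dev/vn0@ROOT /mnt/scratch
    # Watchdog: if cpdup is still running after 120 s, snapshot
    # thread states to the log (the wedge leaves cpdup in flstik).
    (
        sleep 120
        echo "possible wedge in iteration $i" >> "$LOG"
        ps -axww -o pid,state,wchan >> "$LOG"
    ) &
    wd=$!
    cpdup -i0 -x / /mnt/scratch/
    kill "$wd" 2>/dev/null   # cancel watchdog on a healthy iteration
done
</code></pre>
The watchdog runs in the background, so healthy iterations keep the
no-idle-time property of the plain loop.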
h2. State at wedge
<pre>
--- vmstat ---
 r b w   fre   flt   re   pi po   fr da0 sg0   int   sys   ctx us sy  id
 1 1 0 86.3M 0.98M 4.0G 1.8M  0 3.2G   0   0 30.4M 3.11M 21.8M  0  0 100
--- thread states (ps -axww -o pid,state,wchan) ---
B1  h2twait   syncer for the LIVE source FS (da0s1d)
B0  h2syndel  syncer11
B0  h2idle    ~32 worker threads in h2xop-ROOT.* and h2xop-LOCAL.*
D0  flstik    cpdup (kernel dirty-buffer wait, vfs_bio.c:484)
</pre>
100% CPU idle, zero disk IO at the captured instant. Cumulative
re/pi/po/fr counters (4 GB recycled, 3.2 GB freed since boot)
suggest heavy reclaim activity preceded the wedge.
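For anyone reproducing: the snapshot above comes from stock tools
over ssh, roughly as follows (the @egrep@ filter simply lists the
wchan names seen here):
<pre><code class="sh">
# Capture the same snapshot from an ssh session on the wedged box.
vmstat
# Filter for the wchan names observed in this report.
ps -axww -o pid,state,wchan,comm | egrep 'h2twait|h2syndel|h2idle|flstik'
</code></pre>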
@hammer2 dumpchain /mnt/scratch@ while wedged shows hundreds of
accumulated dirty chains, every one marked
@pflags 00006243 = HAMMER2_CHAIN_ALLOCATED | _MODIFIED | _INITIAL@
with @refs=0@. None of them flushed. @dumpchain@ itself completes
(introspection is not deadlocked at the mutex level), so this is
a livelock / flush-starvation rather than a hard mutex deadlock.
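To quantify the backlog while wedged (assuming @dumpchain@ prints
per-chain flag lines as excerpted above; adjust the patterns to the
actual output format):
<pre><code class="sh">
# Count chains still carrying _MODIFIED, and spot-check refs=0.
hammer2 dumpchain /mnt/scratch | grep -c MODIFIED
hammer2 dumpchain /mnt/scratch | grep refs=0 | head
</code></pre>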
h2. Probable shape of the cycle (speculative)
We did not instrument deeply enough to pin down the exact
cycle; what follows is inference from the captured states above.
cpdup reads from the source PFS (@/@) and writes to a fresh
destination PFS (@/dev/vn0@ROOT@); the two live on separate
volumes. At the wedge:
- The source PFS's syncer is in @h2twait@
  (@sys/vfs/hammer2/hammer2_admin.c:148@), a transaction wait,
  even though cpdup only reads from the source.
- A bucket syncer is in @h2syndel@ (@hammer2_vfsops.c:2755@),
  the "looping too hard, brief restart delay" path inside
  @hammer2_vfsops_sync@.
- All xop worker threads are idle (@h2idle@).
- cpdup sits in the kernel-wide dirty-buffer wait (@flstik@).
The pattern reads as a cross-PFS dependency in the flush path:
the destination's flush can't tick because the source's syncer
is wedged in transaction wait, and the source's syncer can't
clear because of some dependency on the destination.
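One way to test this reading without kernel instrumentation would
be to take the live source PFS out of the loop, e.g. by staging the
payload on tmpfs first (a sketch; the mount point and payload choice
are arbitrary):
<pre><code class="sh">
# Variant that removes the source HAMMER2 PFS from the picture.
# If the wedge needs the source syncer, this should survive.
mkdir -p /mnt/stage
mount -t tmpfs tmpfs /mnt/stage
cpdup -i0 /usr/share /mnt/stage/    # any payload of comparable size
while :; do
    umount /mnt/scratch 2>/dev/null
    sync
    newfs_hammer2 -L ROOT /dev/vn0
    mount -t hammer2 /dev/vn0@ROOT /mnt/scratch
    cpdup -i0 /mnt/stage /mnt/scratch/
done
</code></pre>
If this variant still wedges, the cycle is internal to the
destination rather than cross-PFS.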
h2. Workaround
Insert @sync; sleep 60; sync@ after @newfs_hammer2@ and before the
cpdup. Anecdotally, across a handful of subsequent iterations we
saw no wedges with the settle period, versus the ~3-iteration wedge
without it. This is not a deterministic mitigation; it suggests the
race window is a fresh PFS being heavily written before the syncer
has caught up with the prior PFS's teardown.
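Applied to the reproducer loop:
<pre><code class="sh">
while :; do
    umount /mnt/scratch 2>/dev/null
    sync
    newfs_hammer2 -L ROOT /dev/vn0
    sync; sleep 60; sync    # settle period that avoided the wedge
    mount -t hammer2 /dev/vn0@ROOT /mnt/scratch
    cpdup -i0 -x / /mnt/scratch/
done
</code></pre>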
---Files--------------------------------
hammer2-flush-wedge-stress.sh (3.32 KB)