[issue1556] many processes stuck in "hmrrcm", system unusable
Matthew Dillon
dillon at apollo.backplane.com
Wed Oct 7 10:59:15 PDT 2009
What we have here is a situation where corecode's xterm+shell startup
is accessing somewhere north of 900 files for various reasons. Big
programs with many shared libraries are getting run. If those files
get knocked out of the cache, the startup is going to be slow, and
that is exactly what is happening.
HAMMER v2 is better at doing directory lookups, but most of the time
seems to be spent searching the B-Tree for the first file data
block... it doesn't take a large percentage of misses across the
900 files to balloon the startup into multiple seconds. UFS happens
to have a direct blockmap from the inode, whereas HAMMER caches an
offset to the disk block containing the B-Tree entry most likely to
hold the file data reference, so HAMMER depends a lot more on its
B-Tree meta-data caches not getting blown out of the system.
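To illustrate the offset-cache idea, here is a toy, self-contained
sketch (not actual HAMMER code; the structures and names are made up).
The point is simply that a cached hint turns the common case into a
single block check, while a miss falls back to the full B-Tree descent
and refreshes the hint:

    /*
     * Toy sketch of the per-inode offset hint described above.  The
     * "B-Tree" is reduced to a sorted array of leaves; full_lookup()
     * stands in for the root-to-leaf descent.  All names hypothetical.
     */
    #include <stdio.h>
    #include <stdint.h>

    struct leaf {
        int64_t base;           /* first file offset covered by this leaf */
        int64_t len;            /* bytes covered */
    };

    static struct leaf leaves[] = {
        { 0, 65536 }, { 65536, 65536 }, { 131072, 65536 }, { 196608, 65536 },
    };
    #define NLEAVES ((int)(sizeof(leaves) / sizeof(leaves[0])))

    /* Stand-in for the full B-Tree descent. */
    static int
    full_lookup(int64_t off)
    {
        int lo = 0, hi = NLEAVES - 1;

        while (lo <= hi) {
            int mid = (lo + hi) / 2;
            if (off < leaves[mid].base)
                hi = mid - 1;
            else if (off >= leaves[mid].base + leaves[mid].len)
                lo = mid + 1;
            else
                return mid;
        }
        return -1;
    }

    /*
     * Per-inode hint: index of the leaf that satisfied the last lookup.
     * A hit costs one range check; a miss does the descent and then
     * refreshes the hint.
     */
    static int
    hinted_lookup(int64_t off, int *hint)
    {
        int i = *hint;

        if (i >= 0 && i < NLEAVES &&
            off >= leaves[i].base && off < leaves[i].base + leaves[i].len)
            return i;
        *hint = full_lookup(off);
        return *hint;
    }

    int
    main(void)
    {
        int hint = -1;

        printf("leaf %d (full descent)\n", hinted_lookup(70000, &hint));
        printf("leaf %d (hint hit)\n", hinted_lookup(80000, &hint));
        return 0;
    }

The hint only helps for as long as the underlying meta-data stays
cached; once those B-Tree pages are blown out, every one of those ~900
files pays for a full descent again.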
Some 400,000 files get accessed when using rdist or cvs to update
something like the NetBSD CVS repo (corecode's test). I can prevent
the vnodes used to read files from getting blown out by the vnodes
used to stat files, but vnodes are not thrown away unless their
related VM pages are thrown away, so a VM page priority adjustment
probably also needs to be made to retain the longer-cached meta-data
in the face of multi-gigabyte directory tree scans.
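To make the page priority point concrete, here is a toy sketch
(hypothetical, not DragonFly kernel code): pages that keep getting
referenced accumulate an activity count, a one-pass scan only touches
each page once, and the reclaimer picks the lowest-activity resident
pages first, so the long-cached meta-data survives the scan:

    /*
     * Toy sketch: activity-based eviction.  "Meta-data" pages get
     * referenced over and over; scan pages are touched exactly once.
     */
    #include <stdio.h>

    #define NPAGES 8

    struct page {
        int id;
        int act;            /* activity count, bumped on each reference */
        int resident;       /* 1 while cached */
    };

    static struct page pages[NPAGES];

    static void
    reference(struct page *p)
    {
        if (p->act < 64)    /* clamp the activity count */
            p->act += 4;
    }

    /* Reclaim one page: pick the resident page with the lowest activity. */
    static struct page *
    reclaim_one(void)
    {
        struct page *victim = NULL;

        for (int i = 0; i < NPAGES; i++) {
            if (!pages[i].resident)
                continue;
            if (victim == NULL || pages[i].act < victim->act)
                victim = &pages[i];
        }
        if (victim != NULL)
            victim->resident = 0;
        return victim;
    }

    int
    main(void)
    {
        for (int i = 0; i < NPAGES; i++)
            pages[i] = (struct page){ .id = i, .resident = 1 };

        /* Pages 0-1 are "meta-data": referenced many times over time. */
        for (int pass = 0; pass < 5; pass++) {
            reference(&pages[0]);
            reference(&pages[1]);
        }
        /* Pages 2-7 are a read-once scan: each touched exactly once. */
        for (int i = 2; i < NPAGES; i++)
            reference(&pages[i]);

        /* Memory pressure: the scan data goes first, not the meta-data. */
        for (int i = 0; i < 4; i++)
            printf("evicted page %d\n", reclaim_one()->id);
        return 0;
    }

The real VM system is obviously more involved than that, but the
priority adjustment amounts to biasing this kind of ordering in favor
of buffer-cache meta-data.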
Something corecode is doing from cron is physically reading (not just
stat()ing) a large number of files.
I will make some adjustments to the VM page priority for meta-data
that the buffer cache returns to the VM system, as well as some
adjustments to the vnode reclamation code, to reduce instances of
long-lived file vnodes getting blown out by read-once data.
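On the vnode side, one way to picture that kind of adjustment (again
just an illustrative sketch with made-up names, not the actual
reclamation code) is two recycle queues: vnodes that were only ever
touched once get recycled first, and vnodes that keep getting
re-referenced are recycled only as a last resort, so a one-pass scan
over gigabytes of files cannot push out the vnodes the xterm+shell
startup keeps reusing:

    /*
     * Toy sketch: recycle read-once vnodes before long-lived ones.
     * All structure and field names here are hypothetical.
     */
    #include <stdio.h>
    #include <sys/queue.h>

    struct vnode {
        int v_id;
        int v_reuse;                    /* re-references after first use */
        TAILQ_ENTRY(vnode) v_list;
    };

    TAILQ_HEAD(vlist, vnode);
    static struct vlist once_q  = TAILQ_HEAD_INITIALIZER(once_q);   /* read-once */
    static struct vlist reuse_q = TAILQ_HEAD_INITIALIZER(reuse_q);  /* long-lived */

    /* Place a vnode on the queue matching how it has been used so far. */
    static void
    enqueue(struct vnode *vp)
    {
        if (vp->v_reuse > 0)
            TAILQ_INSERT_TAIL(&reuse_q, vp, v_list);
        else
            TAILQ_INSERT_TAIL(&once_q, vp, v_list);
    }

    /* Reclaim: read-once vnodes go first, long-lived ones last. */
    static struct vnode *
    reclaim(void)
    {
        struct vnode *vp = TAILQ_FIRST(&once_q);
        struct vlist *q = &once_q;

        if (vp == NULL) {
            vp = TAILQ_FIRST(&reuse_q);
            q = &reuse_q;
        }
        if (vp != NULL)
            TAILQ_REMOVE(q, vp, v_list);
        return vp;
    }

    int
    main(void)
    {
        static struct vnode v[4] = {
            { .v_id = 0, .v_reuse = 9 },   /* shared library, hit constantly */
            { .v_id = 1, .v_reuse = 0 },   /* file from a one-pass rdist scan */
            { .v_id = 2, .v_reuse = 0 },
            { .v_id = 3, .v_reuse = 5 },
        };
        for (int i = 0; i < 4; i++)
            enqueue(&v[i]);

        printf("reclaimed vnode %d\n", reclaim()->v_id);   /* vnode 1 */
        printf("reclaimed vnode %d\n", reclaim()->v_id);   /* vnode 2 */
        return 0;
    }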
-Matt