HAMMER lockup
Matthew Dillon
dillon at apollo.backplane.com
Mon Jun 30 19:44:57 PDT 2008
Ok, please test with the latest kernel & HAMMER commits.
(Make backups of any critical data beforehand, just in case)
The main issue I tracked down was a call path from the pageout
daemon. The pageout daemon packages up pages into a UIO_NOCOPY
VOP_WRITE. The filesystem then typically accesses the pages
via the buffer cache and so calls getblk().
Sometimes the getblk() covers more pages than the pageout daemon
packaged up, requiring the additional pages to be allocated.
This occurs more often with HAMMER because it uses a 64K block
size when writing out large files.
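To put rough numbers on the above, here is a minimal sketch of the page arithmetic. The 64K block size is from this post; the 4K page size and all of the names are assumptions for illustration only, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed for illustration: 4 KiB VM pages. The 64 KiB block size is from the post. */
#define SKETCH_PAGE_SIZE   4096u
#define SKETCH_BLOCK_SIZE  (64u * 1024u)

/* Number of VM pages backing one buffer of block_size bytes. */
size_t pages_per_block(size_t block_size, size_t page_size)
{
    assert(page_size > 0 && block_size % page_size == 0);
    return block_size / page_size;
}

/*
 * Pages the buffer cache still has to allocate when the pageout daemon
 * supplied only `supplied` of the pages that the buffer covers.
 */
size_t extra_pages_needed(size_t block_size, size_t page_size, size_t supplied)
{
    size_t total = pages_per_block(block_size, page_size);

    return supplied >= total ? 0 : total - supplied;
}
```

Under these assumptions a 64K buffer spans 16 pages, so a pageout that packaged up 4 pages leaves 12 to be allocated, while a 16K buffer with the same 4 pages needs none.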
If the system has insufficient memory and these allocations fail,
the pageout daemon can deadlock.
The main fix I made is to allow the allocation of VM pages on
behalf of the buffer cache to dig into the interrupt reserve,
and to also attempt to free some VM pages from other clean buffers
in the buffer cache. When combined with the bwillwrite() work, which
tries to guarantee that no more than half the buffers in the
buffer cache are ever dirty, this *theoretically* should guarantee
that getblk() calls made by a filesystem will never have to block on
the VM system when allocating VM pages.
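The fallback order described above can be modeled with a toy allocator. This is only a sketch: the struct, the names, and the order of the two emergency steps are assumptions made for illustration, and it does not reflect the actual kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy state; none of these names come from the kernel. */
struct toy_vm {
    int free_pages;        /* pages currently on the free list */
    int interrupt_reserve; /* free pages normally held back for interrupts */
    int clean_buf_pages;   /* pages held by clean buffers, reclaimable */
};

enum toy_result { TOY_NORMAL, TOY_EMERGENCY, TOY_FAILED };

/* Allocate one page on behalf of the buffer cache in the toy model. */
enum toy_result toy_bio_page_alloc(struct toy_vm *vm)
{
    assert(vm != NULL);

    /* Normal path: succeeds only while we stay above the reserve. */
    if (vm->free_pages > vm->interrupt_reserve) {
        vm->free_pages--;
        return TOY_NORMAL;
    }
    /* Emergency step (assumed first): dig into the interrupt reserve. */
    if (vm->free_pages > 0) {
        vm->free_pages--;
        return TOY_EMERGENCY;
    }
    /* Emergency step (assumed second): take a page from a clean buffer. */
    if (vm->clean_buf_pages > 0) {
        vm->clean_buf_pages--;
        return TOY_EMERGENCY;
    }
    /* Nothing left to take: the "allocation failed" case. */
    return TOY_FAILED;
}
```

In this model, keeping no more than half the buffers dirty is what keeps clean_buf_pages from reaching zero, so the last branch should not be reached in practice.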
With these fixes the console may spew out messages like this:
"bio_page_alloc: WARNING emergency page allocation"
Which basically means 'I had to undertake the above emergency measures
to avoid a potential deadlock'. That's ok, and I will remove the
message before the release.
However, if the system spews out this:
"bio_page_alloc: WARNING emergency page allocation failed"
It means that my emergency measures failed to prevent the potential
deadlock. The code no longer just sleeps forever, though. It will
continue to retry, so the system may be able to recover from the
situation, but my goal is for the above 'blah blah blah failed' message
to never occur, no matter what the situation.
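The difference between sleeping forever and retrying can be sketched as a loop that re-attempts the allocation each time around. Everything here is hypothetical: the names are made up, and the max_attempts bound exists only so the example terminates; the post only says the code keeps retrying.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the rest of the system freeing memory over time. */
struct toy_system {
    int ticks_until_free;  /* attempts that will fail before a page frees up */
};

/* One allocation attempt; a failure stands in for a short sleep. */
bool toy_try_alloc(struct toy_system *sys)
{
    assert(sys != NULL);
    if (sys->ticks_until_free > 0) {
        sys->ticks_until_free--;
        return false;
    }
    return true;
}

/*
 * Retry until the allocation succeeds. Returns the number of failed
 * attempts before success, or -1 if max_attempts was exhausted.
 */
int toy_alloc_retry(struct toy_system *sys, int max_attempts)
{
    for (int failures = 0; failures < max_attempts; failures++) {
        if (toy_try_alloc(sys))
            return failures;
    }
    return -1;
}
```

Because each pass re-checks for a free page instead of waiting on a wakeup that may never arrive, the loop recovers as soon as memory is freed by anything else in the system.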
If I have managed to fix the buffer cache to not block on the VM system
it will break the spiral of death that ends in a deadlock.
-Matt
Matthew Dillon
<dillon at backplane.com>