panic: assertion: pmap->pm_stats.resident_count > 0 in pmap_release_free_page

Tue Dec 29 10:38:18 PST 2009

:Another panic from sys_vmspace_mcontrol by a slightly different workload
:on vkernel (actually I caught two panics, but unfortunately the first
:one ended up in an endless loop of some spinlock panics then locked up,
:so all I could do was to press the reset switch).  The kernel and the vmcore
:are at ~y0netan1/crash/{kern,vmcore}.23 .

    The crash dumps are definitely providing more information now.

    It sees a pte entry for which there is no corresponding pv.  The
    page is mapped to multiple pmaps (but not the particular one that
    we panicked on).  It's a vkernel virtual process stack page, so it
    was fork()ed a few times... hence more cpus can reproduce
    the race.

    We caught the problem earlier this time.  If it had been allowed
    to go through it would have cleared the pte entry and then later
    on panicked when it found a zero pte entry.  I think there are
    two possibilities.

    * The first is that the pmap system is allocating a page marked
      PG_ZERO which isn't actually an empty page, and we then encounter
      the junk pte later on.

    * The second is that there is another race where the temporary
      page table mapping gets blown up and causes a pte to be entered
      into the wrong pmap, then later we encounter the bad pte and
      can't find its corresponding pv.

    Here's a new patch to try.  I added a bunch more assertions to
    try to catch it and I also check to make sure PG_ZERO'd pages
    are zero (which is expensive but...).

	fetch http://apollo.backplane.com/DFlyMisc/pmap02.patch
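
    For illustration only, the PG_ZERO check amounts to scanning the
    whole page and asserting that every word is zero.  The sketch below
    is a userland approximation and is not the code in pmap02.patch;
    page_is_zero(), assert_page_zero() and the 4096-byte
    SKETCH_PAGE_SIZE are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size, illustration only */

/*
 * Hypothetical helper: return 1 if every word in the page is zero,
 * 0 as soon as a non-zero word (e.g. a stale pte) is found.
 */
static int
page_is_zero(const void *page)
{
	const unsigned long *p = page;
	size_t i;

	for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(*p); ++i) {
		if (p[i] != 0)
			return (0);
	}
	return (1);
}

/*
 * Hypothetical wrapper standing in for a kernel-style assertion on a
 * page that is claimed to be pre-zeroed.
 */
static void
assert_page_zero(const void *page)
{
	assert(page_is_zero(page));
}
```

    The scan touches every word of the page on each allocation, which
    is why this kind of check is expensive and only suitable as a
    debugging aid.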

						-Matt