panic: assertion: layer2->zone == zone in hammer_blockmap_free

YONETANI Tomokazu qhwt+dfly at les.ath.cx
Mon Aug 4 02:28:08 PDT 2008


On Sun, Aug 03, 2008 at 09:42:19AM -0700, Matthew Dillon wrote:
>     I think I know what happened, too.  There were some commits right around
>     that time that fixed a case where HAMMER's buffer cache was being
>     flushed by the kernel on a panic, causing things to really get out
>     of sync because the volume header's UNDO pointers wouldn't get updated
>     in that case.  We really want to just throw those buffers away.  I'll bet
>     what happened was that you had a crash related to the cross link tests
>     on a kernel without those fixes, and the kernel flushed out HAMMER's
>     unsynchronized meta-data and caused the blockmap to get out of sync.
> 
>     I think it would be prudent for me to do some more crash testing, to
>     make sure that bug got fixed.  The base bug you reported can be
>     considered closed though!

It's nice to hear that it's already been fixed in our 2.0 release.

>     I've found two more assertions while testing reblocking and pruning
>     at the same time.  One is related to a buffer alias occurring from
>     the reblocking, panicking in hammer_io_new().  The other is a B-Tree
>     sanity check related, I think, to the pruner (a window of
>     opportunity when the B-Tree is deleting a chain of nodes where an
>     insertion can occur, causing the chain deletion to assert on the node
>     being unexpectedly non-empty).  I should be able to get both fixed
>     fairly quickly and they shouldn't cause any media corruption.
> 
>     These are all related to heavy simultaneous pruning and reblocking
>     only, not normal use.

In case you're interested, I put the relevant part of my /var/log/messages
as ~y0netan1/crash/message.gz.  I think I did most operations through
sudo, so the command line arguments are recorded in it.  While looking
through it, I realized I overlooked another panic, but maybe this was
the result of the previous panics.  I've been doing the same operation
after newfs_hammer'ing the partition but haven't managed to reproduce
any of these panics.

Jul 29 18:32:02 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer pfs-update slave label=slave of obj
Jul 29 18:32:37 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer mirror-copy slave
Jul 29 18:32:48 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer mirror-copy obj slave
Jul 29 18:40:19 firebolt kernel: Warning: BTREE_REMOVE: Defering parent removal2 @ 8000000293323000, skipping
Jul 29 18:40:26 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer mirror-copy obj slave
Jul 29 18:41:22 firebolt sudo:     qhwt : TTY=ttyp4 ; PWD=/HAMMER/@@0x000000017a500990:00002 ; USER=root ; COMMAND=/bin/rm -rf /var/obj/current
Jul 29 18:41:26 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer mirror-copy obj slave
Jul 29 18:41:48 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer prune-everything /HAMMER/obj/
Jul 29 18:41:53 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer prune-everything /HAMMER/slave
Jul 29 18:42:07 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer mirror-copy obj slave
Jul 29 18:42:08 firebolt kernel: HAMMER(HAMMER): Critical error inode=4427368967 while syncing inode
Jul 29 18:42:08 firebolt kernel: HAMMER(HAMMER): Forcing read-only mode
Jul 29 18:43:48 firebolt sudo:     qhwt : TTY=ttyp4 ; PWD=/HAMMER ; USER=root ; COMMAND=/sbin/hammer pfs-destroy foo
Jul 29 18:44:19 firebolt sudo:     qhwt : TTY=ttyp4 ; PWD=/HAMMER ; USER=root ; COMMAND=/bin/rm foo
Jul 29 18:44:25 firebolt sudo:     qhwt : TTY=ttyp4 ; PWD=/HAMMER ; USER=root ; COMMAND=/bin/rm foo
Jul 29 18:44:43 firebolt su: BAD SU qhwt to root on /dev/ttyp4
Jul 29 18:45:01 firebolt su: qhwt to root on /dev/ttyp3
Jul 29 21:00:59 firebolt syslogd: kernel boot file is /kernel
Jul 29 21:00:59 firebolt kernel: HAMMER read-only -> read-write
Jul 29 21:00:59 firebolt kernel: panic: assertion: buffer->io.lock.refs == 0 in hammer_recover_flush_buffer_callback
Jul 29 21:00:59 firebolt kernel: mp_lock = 00000000; cpuid = 0
Jul 29 21:00:59 firebolt kernel: boot() called on cpu#0
Jul 29 21:00:59 firebolt kernel:
Jul 29 21:00:59 firebolt kernel: syncing disks... 778 777 777 777 777 777 777 777 781 777 777 777 777 777 777 777 777 777 777 777 777 777 777 777 777 777 777 777 777
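
Since the sudo entries above record the full command lines, the sequence
of operations can be pulled out of the log mechanically.  The following is
a minimal sketch, assuming every sudo entry follows the layout seen in the
excerpt above (timestamp, host, "sudo:", user, then TTY/PWD/USER/COMMAND
fields separated by " ; ").  The names SUDO_RE and sudo_commands are
hypothetical and exist only for this illustration.

```python
import re

# Assumed layout, taken from the sudo entries in the excerpt above:
#   <Mon DD HH:MM:SS> <host> sudo: <user> : TTY=... ; PWD=... ; USER=... ; COMMAND=...
SUDO_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]+)\s+(?P<host>\S+)\s+sudo:\s+"
    r"(?P<user>\S+)\s+:\s+TTY=(?P<tty>\S+)\s+;\s+PWD=(?P<pwd>\S+)\s+;\s+"
    r"USER=(?P<runas>\S+)\s+;\s+COMMAND=(?P<command>.*)$"
)


def sudo_commands(lines):
    """Yield (timestamp, working directory, command) for each sudo entry.

    Lines that are not sudo entries (kernel messages, su, syslogd) are skipped.
    """
    for line in lines:
        match = SUDO_RE.match(line.rstrip("\n"))
        if match:
            yield match.group("stamp"), match.group("pwd"), match.group("command")


if __name__ == "__main__":
    # Two lines copied from the excerpt: one sudo entry, one kernel message.
    sample = [
        "Jul 29 18:32:37 firebolt sudo:     qhwt : TTY=ttyp3 ; PWD=/HAMMER ; "
        "USER=root ; COMMAND=/sbin/hammer mirror-copy slave",
        "Jul 29 18:42:08 firebolt kernel: HAMMER(HAMMER): Forcing read-only mode",
    ]
    for stamp, pwd, command in sudo_commands(sample):
        print(stamp, pwd, command)
```

To run it over the whole file, pass an open file object instead of the
sample list, for example gzip.open("message.gz", "rt").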
 
>     Just in case you are doing that, this is a heads-up that I am aware
>     of them.

Thanks.




