stacked vn(4) borkitude

Chris Turner c.turner at 199technologies.org
Mon Jan 28 05:45:11 PST 2008


To make things more flexible, I've started using one largish partition 
and creating vn disks for various uses underneath it.

Last night I started to work on updating my vnconfig patch using this 
new scheme and got a corrupted filesystem as follows:

- vnconfig -c -s labels vn10 /path/to/home.img
- mount /dev/vn10s0a /home
- cd /home/niftyscriptness
- do some stuff which generates a disk image for vkernels
  dd, vnconfig, disklabel, newfs, mount, make installworld, etc.
  which mounts /dev/vn0s0a underneath /home
- strangeness occurs

Basically, it seems like the 1.10.1 VFS/vn code is getting confused when
a vn is stacked on top of another vn.
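
For reference, the sequence above can be written out as a dry-run sketch
that only prints the commands and runs nothing. The layer 1 commands are
the ones listed above. The layer 2 image path, mountpoint and size
(VK_IMG, VK_MNT, VK_MB) are placeholders, and the layer 2 commands are an
assumption based on the usual vkernel image recipe, not a copy of the
actual script. Creating the 'a' partition in the new label is left out.

```sh
#!/bin/sh
# Dry-run sketch of the stacked-vn sequence: prints commands, runs nothing.
# Anything marked "assumed" is illustrative and not from the actual script.

HOME_IMG=/path/to/home.img                 # layer 1 image (path as in the report)
VK_IMG=/home/niftyscriptness/vkernel.img   # assumed layer 2 image name
VK_MNT=/home/niftyscriptness/mnt           # assumed layer 2 mountpoint
VK_MB=2048                                 # assumed image size in MB

stacked_vn_steps() {
    # layer 1: vn10 backed by a file on the large partition
    echo "vnconfig -c -s labels vn10 $HOME_IMG"
    echo "mount /dev/vn10s0a /home"
    # layer 2: vn0 backed by a file that lives inside the layer 1 filesystem
    # (adding the 'a' partition to the label is omitted here)
    echo "dd if=/dev/zero of=$VK_IMG bs=1m count=$VK_MB"
    echo "vnconfig -c -s labels vn0 $VK_IMG"
    echo "disklabel -r -w vn0s0 auto"
    echo "newfs /dev/vn0s0a"
    echo "mount /dev/vn0s0a $VK_MNT"
    echo "make installworld DESTDIR=$VK_MNT"
    # teardown, innermost layer first
    echo "umount $VK_MNT"
    echo "vnconfig -u vn0"
    echo "umount /home"
    echo "vnconfig -u vn10"
}

stacked_vn_steps
```

The teardown at the end goes innermost first (vn0 before vn10), since the
layer 2 image file lives inside the layer 1 filesystem.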

The first time I ran this procedure, the first 'mount' failed with an
error (input/output error). Thinking I might have accidentally done
something wrong with my vn allocation, I started over, and then
started to get weird things in the working directory (layer 1 vn, holds 
the mountpoint for layer 2) - the files the vkernel image builder uses 
to keep track of things (.formatted, etc) were showing up in 'ls', but
ls -l would say 'no such file or directory'. Thinking a bug was upon me,
I rebooted, and when I tried to fsck the 'layer 1' /home vn, it reported 
many errors - 'fsck -y' essentially trashed the filesystem.

I started a second round of tests today after restoring /home.

The first time 'worked' - i.e. the initial mount of /dev/vn0s0a into 
/dev/vn10s0a's /home filesystem was ok, but the make installworld of
the vkernel system panicked the system mid-way (sorry for the hand-copied
trace - still need to get my debug infrastructure up to date):

panic
ffs_valloc
ufs_makeinode
ufs_create
ufs_vnoperate
vop_old_create
vop_compat_ncreate ? (can't read my writing :)
vop_default
vfs_vnoperate
vop_ncreate
vn_open
kern_open
sys_open
syscall2
Xint80_syscall

When I rebooted, the /home filesystem was ok, so I started the process
again, and got the same kind of corruption as before. On the first try,
things seemed ok, so I interrupted, unmounted, vnconfig -u'ed, etc. and
tried again.

On this try the first mount of the vn (vn0s0a) failed (input/output 
error), with a simultaneous console message:

dscheck(#vn/80): attempt to access nonexistent partition

and possibly (saw this at some point):

vn0: reading primary partition table error accessing offset 00000000 for 2

At this point, or shortly thereafter, doing an 'ls' within the layer 1 
/home filesystem came back blank, and 'cd .. ; ls -al' started yielding 
the 'no such file or directory' strangeness.

I rebooted, and the /home filesystem fsck'ed clean, but mounted empty -
df showed it as being 96% full, however (4G filesystem).

While typing this, I did realize that the script to create the 'layer 2' 
vn's was not leaving any label space in the disklabel - that being said,
I don't think that should cause corruption on the 'host' /home 
filesystem in any case.

The script was used many times before on a UP 'raw partition' /home -
I just switched to a 1.10.1 SMP vn(4) /home. The new machine seems 
otherwise stable.

One other note: /home was NFS exported, but was only mounted over NFS
during the initial crash.

pointers (or perhaps fixed pointers :) on the next steps welcome..

Thanks in advance,

- Chris




