Decision time.... should NATA become the default for this release?
Matthew Dillon
dillon at apollo.backplane.com
Sat Jun 2 22:10:17 PDT 2007
:When PCI_MAP_FIXUP is specified, the related dmesg is:
:http://leaf.dragonflybsd.org/mailarchive/kernel/2007-06/msg00013.htm
:
:The errors when booting the NATA kernel with PCI_MAP_FIXUP option are:
:Mounting root from ufs:/dev/ad0s3a
:pid 2 (sh), uid 0: exited on signal 11
:Jun 2 06:15:26 init: /bin/sh on /etc/rc terminated abnormally, going to single
:user mode
:Enter full pathname of shell or RETURN for /bin/sh:
:pid 3 (sh), uid 0: exited on signal 11
:Jun 2 06:15:38 init: single user shell terminated, restarting
:Enter full pathname of shell or RETURN for /bin/sh:
:
:Best Regards,
:sephe
And what happens with NATA without PCI_MAP_FIXUP? Do you get errors,
or just a lockup?
I've made a bunch of commits. They didn't fix Sascha's issue (which I
think is the same as yours), so I don't think they will fix yours,
but update anyway so we are all testing the same thing.
You may have to do the same thing Sascha will be doing, which is to
build a HEAD nrelease CD and boot the box with the CD. It should build
with a /kernel.NATA as well as a /kernel (generic). Interrupt the
CD boot sequence with menu option 6, and then 'boot /kernel.NATA'. Assuming
the CD is able to boot, you can then run tests on the hard drive
(only do read-only mounts, or dd from the device directly) to figure
out what is causing the corruption.
What I suggest to start with, assuming you can boot a NATA CD, is to
do this:
dd if=/dev/ad0 bs=32k count=1024 | md5
dd if=/dev/ad0 bs=32k count=1024 | md5
dd if=/dev/ad0 bs=32k count=1024 | md5
dd if=/dev/ad0 bs=32k count=1024 | md5
dd if=/dev/ad0 bs=32k count=1024 | md5
Just to see if basic reading works. Then try different block sizes
and so on, and work up from there.
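The repeated reads above can be wrapped in a small script so that a
mismatch is flagged automatically. This is only a rough sketch: the
function names read_check and sum_stdin are made up for illustration,
and the fallback to md5sum is an assumption for systems that do not
ship md5.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any existing tool.

# Checksum stdin with md5 if present, otherwise md5sum (assumed fallback).
sum_stdin() {
    if command -v md5 >/dev/null 2>&1; then
        md5
    else
        md5sum | cut -d' ' -f1
    fi
}

# read_check DEVICE [BLOCKSIZE] [COUNT] [PASSES]
# Reads the same region PASSES times and compares the checksums.
# Returns 0 if every pass agrees, 1 on the first mismatch, 2 if unreadable.
read_check() {
    dev=$1; bs=${2:-32k}; count=${3:-1024}; passes=${4:-5}
    if [ ! -r "$dev" ]; then
        echo "cannot read $dev"
        return 2
    fi
    first=""
    i=1
    while [ "$i" -le "$passes" ]; do
        sum=$(dd if="$dev" bs="$bs" count="$count" 2>/dev/null | sum_stdin)
        echo "pass $i bs=$bs: $sum"
        if [ -z "$first" ]; then
            first=$sum
        elif [ "$sum" != "$first" ]; then
            echo "MISMATCH on pass $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $passes passes agree"
}
```

For example, 'read_check /dev/ad0 32k 1024 5' repeats the test above,
and calling it again with 4k, 16k, 64k and so on covers the other
block sizes. It only reads from the device, so it is safe to run
against the suspect drive.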
If you can find corruption, dd two samples small enough to fit into a
memory file and get them off the box. On a working box, hexdump and
compare them to try to determine what kind of corruption is occurring.
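Once the two samples are on a working box, something like the
following could narrow down where they differ before reading hexdumps
by eye. It is a sketch only: diff_summary is a made-up name, and it
relies on nothing beyond the standard cmp utility.

```shell
#!/bin/sh
# Hypothetical helper, not part of any existing tool.

# diff_summary FILE_A FILE_B
# Prints how many bytes differ and the first few differing offsets.
# Returns 0 if the files are identical, 1 otherwise.
diff_summary() {
    a=$1; b=$2
    if cmp -s "$a" "$b"; then
        echo "identical"
        return 0
    fi
    # cmp -l lists each differing byte: 1-based decimal offset,
    # then the byte value in each file in octal.
    diffs=$(cmp -l "$a" "$b" 2>/dev/null) || true
    if [ -z "$diffs" ]; then
        echo "no differing bytes in the common part; lengths differ"
        return 1
    fi
    n=$(printf '%s\n' "$diffs" | wc -l | tr -d ' ')
    echo "$n differing bytes; first few (offset, octal a, octal b):"
    printf '%s\n' "$diffs" | head -n 16
    return 1
}
```

The offsets are the interesting part: differences that cluster on
sector or block boundaries point one way, scattered single bytes
another. Running hexdump -C on each sample around the reported
offsets then shows the actual data.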
At the moment I've run out of ideas. The chip registers are being
set properly, so that pretty much leaves command initiation and
interrupt timing issues.
-Matt
Matthew Dillon
<dillon at backplane.com>