Decision time.... should NATA become the default for this release?

Sepherosa Ziehau sepherosa at gmail.com
Sat Jun 2 22:50:11 PDT 2007


On 6/3/07, Matthew Dillon <dillon at apollo.backplane.com> wrote:
:When PCI_MAP_FIXUP is specified, the related dmesg is:
:http://leaf.dragonflybsd.org/mailarchive/kernel/2007-06/msg00013.htm
:
:The errors when booting the NATA kernel with PCI_MAP_FIXUP option are:
:Mounting root from ufs:/dev/ad0s3a
:pid 2 (sh), uid 0: exited on signal 11
:Jun  2 06:15:26 init: /bin/sh on /etc/rc terminated abnormally, going to single
:user mode
:Enter full pathname of shell or RETURN for /bin/sh:
:pid 3 (sh), uid 0: exited on signal 11
:Jun  2 06:15:38 init: single user shell terminated, restarting
:Enter full pathname of shell or RETURN for /bin/sh:
:
:Best Regards,
:sephe
    And NATA without PCI_MAP_FIXUP, what happens?  You get errors, or just
    a lockup?
    I've made a bunch of commits.  They didn't fix Sascha's issue (which I
    think is the same as yours), so I don't think they will fix yours,
    but update anyway so we are all testing the same thing.
    You may have to do the same thing Sascha will be doing, which is to
    build a HEAD nrelease CD and boot the box with the CD.  It should build
    with a /kernel.NATA as well as a /kernel (generic).  Interrupt the
    CD boot sequence with menu option 6, and then 'boot /kernel.NATA'.  Assuming
    the CD is able to boot, you can then run tests on the hard drive
    (only do read-only mounts, or dd from the device directly) to figure
    out what is causing the corruption.
    What I suggest to start with, assuming you can boot a NATA CD, is to
    do this:
        dd if=/dev/ad0 bs=32k count=1024 | md5
        dd if=/dev/ad0 bs=32k count=1024 | md5
        dd if=/dev/ad0 bs=32k count=1024 | md5
        dd if=/dev/ad0 bs=32k count=1024 | md5
        dd if=/dev/ad0 bs=32k count=1024 | md5
    Just to see if basic reading works.  Then try different block sizes,
    etc etc... move up from there.
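
The repeated-read test above could be scripted roughly as follows. This is a minimal sketch under stated assumptions, not something from the original exchange: the helper names `sum_cmd` and `read_check` are hypothetical, and the script falls back to `md5sum` on systems that lack `md5`. It reads the same region several times at a given block size and returns non-zero if any pass produces a different checksum.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original thread): read the same
# region of a device several times and check that the checksums agree.

# Use md5 where it exists (BSD), otherwise fall back to md5sum.
sum_cmd() {
    if command -v md5 >/dev/null 2>&1; then
        md5
    else
        md5sum | cut -d' ' -f1
    fi
}

# read_check DEVICE BLOCK_BYTES TOTAL_BYTES PASSES
# Prints one checksum per pass; returns 0 if all passes match, 1 otherwise.
read_check() {
    dev=$1; bs=$2; total=$3; passes=$4
    count=$((total / bs))      # keep the region size fixed across block sizes
    first=""; status=0; i=1
    while [ "$i" -le "$passes" ]; do
        sum=$(dd if="$dev" bs="$bs" count="$count" 2>/dev/null | sum_cmd)
        echo "bs=$bs pass=$i $sum"
        if [ -z "$first" ]; then
            first=$sum
        elif [ "$sum" != "$first" ]; then
            status=1           # at least one pass disagreed with the first
        fi
        i=$((i + 1))
    done
    return $status
}
```

For example, `read_check /dev/ad0 32768 33554432 5` repeats the 32k read above five times over the same 32 MB region. Changing the second argument tries other block sizes against that region.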
I can boot into single user mode.  The command above shows random data
corruption: of five runs, one produced a different md5 checksum.
    If you can find corruption, dd two somethings that will fit into a
    memory file and get them off the box.  On a working box hexdump and
    compare them to try to determine what kind of corruption is occurring.
I am going to do that.

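
One possible way to do the comparison suggested above, once two dumps of the same region are on a working box, is sketched below. It is an illustration only: the helper name `diff_summary` is hypothetical. It uses `cmp -l` to list differing bytes and summarizes how many differ and where the first and last differences fall (offsets are 1-based, as `cmp` reports them).

```shell
#!/bin/sh
# diff_summary FILE_A FILE_B
# Hypothetical helper (not from the original thread): summarize the
# byte-level differences between two dumps of the same region.
diff_summary() {
    # cmp -l prints one line per differing byte: offset (decimal, 1-based)
    # followed by the two byte values in octal.
    cmp -l "$1" "$2" 2>/dev/null | awk '
        NR == 1 { first = $1 }
        { last = $1 }
        END {
            if (NR == 0) print "identical"
            else printf "differing_bytes=%d first=%d last=%d\n", NR, first, last
        }'
}
```

The reported offsets can then be inspected with `hexdump -C -s OFFSET -n 64 FILE` on each dump. That shows whether the damage looks like flipped bits, shifted data, or whole blocks of stale content.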
Best Regards,
sephe
--
Live Free or Die



