ATA errors.

YONETANI Tomokazu qhwt at myrealbox.com
Fri Nov 28 09:59:56 PST 2003


On Sun, Nov 23, 2003 at 10:22:47PM -0800, Matthew Dillon wrote:
> 
> :Hello.
> :
> :...
> :Adam, have you tried FreeBSD-STABLE, with a kernel built from exactly
> :the same kernel config as DragonFly, on the same computer?
> :I reported the same ATA errors in the middle of this month on
> :kernel@, and later on bugs@. I have access to three DragonFly
> :machines:
> :
> :A. Dynabook: ATA66, 256Mbytes of main RAM, P-III 1.3GHz
> :B. Dell: ATA100, 256Mbytes of main RAM, P-IV 1.7GHz
> :C. Some SiS-MB based: ATA100/133, 512Mbytes of RAM, P-IV 2.4B
> :
> :All three machines are ACPI-capable with ACPI enabled in the kernel,
> :and none of them is SMP or HTT. All of them are running DragonFly
> :from source as of about a week ago.
> :Only A and B exhibit the symptom; on C I've never seen the error
> :messages. On A, I haven't seen the error message while doing
> :buildworld or buildkernel, only when running fsck on the largest
> :partition, even though this is the most actively used machine. And
> :I've never seen the error when running FreeBSD-STABLE on B.
> :
> :At the moment, B is the only machine I can easily switch between
> :DragonFly and FreeBSD-STABLE; I have installed FreeBSD-stable base system
> :into /dev/ad0s1a and DragonFly base system into /dev/ad0s1d. If I have
> :time, I'm going to try different kernel config options to see if I can
> :narrow down which options are tickling the kernel.
> 
>     I don't think it's the cable.  It sounds like it's the PIO/DMA mode
>     being selected.
> 
>     It would be interesting to see what 'atacontrol mode N' reports
>     on both DragonFly and FreeBSD for the box that you can easily
>     switch OS's.  e.g. 'atacontrol mode 0' and 'atacontrol mode 1'
>     0, 1, 2, 3... however many ata channels you have.

I found a place where the ATA driver silently falls back to BIOSPIO
mode, even when bootverbose is true. In ad_start(), if the call to
ata_dmaalloc() fails, adp->device->mode is set to ATA_PIO, but there's
no ata_prtdev() call there, which is why we never see any `fallback'
messages. I'm not sure why ata_dmaalloc() fails.
Anyway, attached is a patch that adds the ata_prtdev() call and a knob,
hw.ata.panic_on_dma_failure (default: 0), which makes the kernel panic
at that point instead of falling back.
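
For readers who want to follow the control flow without a kernel tree,
here is a minimal userland model of the patched branch. It is only a
sketch: model_dev, model_dmaalloc(), model_ad_start() and the
start_result values are hypothetical names made up for illustration,
the enum values are placeholders rather than the driver's real
constants, and the panic is modeled as a return code instead of a real
panic(). Only the shape of the if/else follows the patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Placeholder mode values; only the ordering ATA_PIO < ATA_DMA matters here. */
enum ata_mode { ATA_PIO = 0, ATA_DMA = 1 };

/* Hypothetical outcomes of one modeled ad_start() call. */
enum start_result { START_OK, START_FELL_BACK, START_WOULD_PANIC };

/* Hypothetical stand-in for the per-device state used by the driver. */
struct model_dev {
    enum ata_mode mode;
    int dma_alloc_fails;    /* test hook: force the fake allocator to fail */
};

static int dummy_table;     /* stands in for a DMA table */

/* Hypothetical stand-in for ata_dmaalloc(): returns NULL on failure. */
static void *model_dmaalloc(const struct model_dev *dev)
{
    return dev->dma_alloc_fails ? NULL : (void *)&dummy_table;
}

/* Models the patched branch: panic if the knob is set, else log and fall back. */
static enum start_result model_ad_start(struct model_dev *dev,
                                        int panic_on_dma_failure)
{
    if (dev->mode >= ATA_DMA) {
        if (model_dmaalloc(dev) == NULL) {
            if (panic_on_dma_failure)
                return START_WOULD_PANIC;   /* the kernel would panic() here */
            /* the patch adds this message; before it the fallback was silent */
            fprintf(stderr, "ata_dmaalloc failed, fallback to PIO mode\n");
            dev->mode = ATA_PIO;
            return START_FELL_BACK;
        }
    }
    return START_OK;
}
```

In this model the fallback is sticky: once mode has been set to
ATA_PIO, later calls skip the DMA branch entirely, so the message is
printed at most once per device. I assume the real driver behaves the
same way because the branch is guarded by the same mode check.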

To use this:

- apply the patch, then compile and install the kernel.
- reboot to the boot prompt and set these variables:
  set boot_single=YES
  set boot_verbose=YES
  set hw.ata.panic_on_dma_failure=1
  boot
- mount /usr (or another relatively large partition) read-write;
  optionally run /etc/rc.d/swap1 start if the partition is so large
  that fsck'ing it requires swap.
- do some disk-intensive work for a few minutes:
  $ cp -pr /usr/ports /usr/ports.test
- umount /usr and fsck it.

It may take some time before the failure shows up.
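
If there is no ports tree to copy, something like the following could
stand in for the disk-intensive step. It is only a sketch:
disk_exercise() and BLOCK_SIZE are names made up for this example, and
it is not equivalent to the cp above, since it produces one large
sequential file rather than many small ones. The read-back may also be
served from the buffer cache rather than the disk, so it is mainly the
writes that generate traffic.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 65536    /* bytes per write; an arbitrary choice */

/*
 * Write `blocks` blocks of BLOCK_SIZE bytes to `path`, then read them
 * back and compare.  Returns 0 on success, -1 on I/O error or mismatch.
 */
int disk_exercise(const char *path, size_t blocks)
{
    unsigned char *wbuf = malloc(BLOCK_SIZE);
    unsigned char *rbuf = malloc(BLOCK_SIZE);
    FILE *fp = NULL;
    int rc = -1;
    int closed;
    size_t i;

    if (wbuf == NULL || rbuf == NULL)
        goto out;

    /* write phase: each block is filled with a byte derived from its index */
    fp = fopen(path, "wb");
    if (fp == NULL)
        goto out;
    for (i = 0; i < blocks; i++) {
        memset(wbuf, (int)(i & 0xff), BLOCK_SIZE);
        if (fwrite(wbuf, 1, BLOCK_SIZE, fp) != BLOCK_SIZE)
            goto out;
    }
    closed = fclose(fp);
    fp = NULL;
    if (closed != 0)
        goto out;

    /* read phase: regenerate the expected pattern and compare */
    fp = fopen(path, "rb");
    if (fp == NULL)
        goto out;
    for (i = 0; i < blocks; i++) {
        if (fread(rbuf, 1, BLOCK_SIZE, fp) != BLOCK_SIZE)
            goto out;
        memset(wbuf, (int)(i & 0xff), BLOCK_SIZE);
        if (memcmp(wbuf, rbuf, BLOCK_SIZE) != 0)
            goto out;
    }
    rc = 0;

out:
    if (fp != NULL)
        fclose(fp);
    free(wbuf);
    free(rbuf);
    return rc;
}
```

Calling disk_exercise("/usr/exercise.tmp", 16384) would write and
verify about 1 GB; the path and block count here are just examples, and
the file has to be removed by hand afterwards.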

By the way, on my Dynabook, fsck got stuck during pass 2 in the
[pfault] state. When this happens, there's nothing I can do but reset.
Index: ata-disk.c
===================================================================
RCS file: /home/source/dragonfly/cvs/src/sys/dev/disk/ata/ata-disk.c,v
retrieving revision 1.7
diff -u -r1.7 ata-disk.c
--- ata-disk.c	7 Aug 2003 21:16:51 -0000	1.7
+++ ata-disk.c	28 Nov 2003 13:10:28 -0000
@@ -95,9 +95,11 @@
 static int ata_dma = 1;
 static int ata_wc = 1;
 static int ata_tags = 0; 
+static int ata_panic_on_dma_failure = 0; 
 TUNABLE_INT("hw.ata.ata_dma", &ata_dma);
 TUNABLE_INT("hw.ata.wc", &ata_wc);
 TUNABLE_INT("hw.ata.tags", &ata_tags);
+TUNABLE_INT("hw.ata.panic_on_dma_failure", &ata_panic_on_dma_failure);
 static MALLOC_DEFINE(M_AD, "AD driver", "ATA disk driver");
 
 /* sysctl vars */
@@ -412,8 +414,12 @@
     if (bp->b_flags & B_READ) 
 	request->flags |= ADR_F_READ;
     if (adp->device->mode >= ATA_DMA) {
-	if (!(request->dmatab = ata_dmaalloc(atadev->channel, atadev->unit)))
+	if (!(request->dmatab = ata_dmaalloc(atadev->channel, atadev->unit))) {
+	    if (ata_panic_on_dma_failure)
+		panic("ad_start: ata_dmaalloc failed");
+	    ata_prtdev(atadev, "ata_dmaalloc failed, fallback to PIO mode\n");
 	    adp->device->mode = ATA_PIO;
+	}
     }
 
     /* insert in tag array */



