failing disk, or not?
Bill Hacker
wbh at conducive.org
Mon Mar 28 20:25:04 PST 2005
George Georgalis wrote:
I'm seeing some disk errors in dfly that I cannot reproduce when
checking the partition from another OS:
ad4: UDMA ICRC error writing fsbn 249842603 of 110387664-110387679 (ad4 bn 249842603; cn 15551 tn 250 sn 38) retrying
ad4: UDMA ICRC error writing fsbn 249842603 of 110387664-110387679 (ad4 bn 249842603; cn 15551 tn 250 sn 38) retrying
ad4: UDMA ICRC error writing fsbn 249842603 of 110387664-110387679 (ad4 bn 249842603; cn 15551 tn 250 sn 38) retrying
ad4: UDMA ICRC error writing fsbn 488278315 of 229605520-229605535 (ad4 bn 488278315; cn 30393 tn 234 sn 28) retrying
ad4: UDMA ICRC error writing fsbn 488278315 of 229605520-229605535 (ad4 bn 488278315; cn 30393 tn 234 sn 28) retrying
ad4: UDMA ICRC error writing fsbn 488278443 of 229605584-229605599 (ad4 bn 488278443; cn 30393 tn 236 sn 30) retrying
This happened while running dvdbackup, and I reproduced it by running
a dd read from the partition. However, after several attempts I cannot
reproduce it with a Linux badblocks check (read or non-destructive write)
or a Linux dd read from the partition. I know failures can be intermittent,
but getting no errors at all from Linux so far seems odd if the disk is
really failing.
Might DFLY be attempting I/O beyond the permitted
end of the assigned area? Or to an area that Linux
is not trying to access?
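One rough way to test the first question is to compare the reported block numbers with the disk size implied by the geometry in the dmesg further down. The sketch below is only arithmetic on numbers quoted in this thread. It assumes that the "ad4 bn" values are absolute 512-byte sector numbers and that [387621/16/63] reflects the drive's real capacity; neither is confirmed here. The helper name beyond_end is made up for illustration.

```python
# Geometry as printed in the dmesg line for ad4: [387621/16/63]
CYLINDERS, HEADS, SECTORS_PER_TRACK = 387621, 16, 63
total_sectors = CYLINDERS * HEADS * SECTORS_PER_TRACK

# "ad4 bn" values copied from the error lines
reported_blocks = [249842603, 488278315, 488278443]


def beyond_end(block, total=total_sectors):
    """True if block lies at or past the last sector (assumes absolute LBA)."""
    return block >= total


print("total sectors from geometry:", total_sectors)
for bn in reported_blocks:
    status = "beyond end" if beyond_end(bn) else "inside disk"
    print(bn, status)
```

Under those assumptions the first block lies inside the disk and the other two lie past its end. That would fit a mapping problem better than a failing medium, but it does not prove either.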
# df -h
Filesystem Size Used Avail Capacity Mounted on
/dev/ad4s3a 248M 122M 106M 54% /
/dev/ad4s3d 248M 1.3M 227M 1% /var
/dev/ad4s3e 124G 94G 20G 83% /usr
procfs 4.0K 4.0K 0B 100% /proc
A bit of history: I had a system lockup a week or two ago (I could
switch virtual terminals, but no keyboard input was accepted). I didn't
file a bug because I was experimenting haphazardly (in user space) and
couldn't explain well enough what I was doing at the time; now I don't
even remember. An fsck was required, and with a 95GB /usr that took
quite a while. (Comments welcome on why softupdates didn't help here.)
Best case, SU just leaves data in an earlier state rather than
half-committed. More transaction-oriented than journaling.
fsck -y doesn't care about the content of data, only about its
proper file indexing, so *maybe* some time is saved during
a 'preen', but there are no savings at all with fsck -y.
> also the /usr partition was near or over 100% capacity, but I
never got disk-full errors, i.e. it didn't *completely* run out of space.
It normally has around a 10% reserve, and will usually stand 102% before
it even throws an error message.
At this point can I be sure my disk is failing or could there be some
driver instability? The full dmesg is below.
I don't see it in dmesg, but ad4 is a 200GB Seagate drive on an NVIDIA
SATA controller. Disk Product Number ST3200822AS, Part Number 9W2854-301
Thanks,
// George
Cutting ...
agp0: <NVIDIA Generic AGP Controller> mem 0xe0000000-0xe3ffffff at device 0.0 on pci0
agp0: Unable to find NVIDIA Memory Controller 1.
Unable? That's odd ?
device_probe_and_attach: agp0 attach returned 19
isab0: <PCI to ISA bridge (vendor=10de device=00e0)> at device 1.0 on pci0
isa0: <ISA bus> on isab0
pci0: <unknown card> (vendor=0x10de, dev=0x00e4) at 1.1 irq 10
NVIDIA - nForce3 250 SMBus Controller ?
*SNIP*
atapci0: <Generic PCI ATA controller> port 0xf000-0xf00f at device 8.0 on pci0
ata0: at 0x1f0 irq 14 on atapci0
installed MI handler for int 14
ata1: at 0x170 irq 15 on atapci0
installed MI handler for int 15
atapci1: <Generic PCI ATA controller> port 0xec00-0xec7f,0xeb00-0xeb0f,0xb70-0xb73,0x970-0x977,0xbf0-0xbf3,0x9f0-0x9f7 irq 11 at device 10.0 on pci0
ata2: at 0x9f0 on atapci1
installed MI handler for int 11
ata3: at 0x970 on atapci1
*snip*
ad0: 58644MB <Maxtor 6Y060L0> [119150/16/63] at ata0-master BIOSDMA
ad4: DMA limited to UDMA33, non-ATA66 cable or device
ad4: 190782MB <ST3200822AS> [387621/16/63] at ata2-master BIOSDMA
I'm puzzled:
- ata0-master claims /dev/ad0
- ata1-master claims /dev/acd0
- ata2-master claims /dev/ad4
- ata3 seems empty...
So how do we skip /dev/ad1, /dev/ad2, and /dev/ad3 to arrive at /dev/ad4?
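One possible explanation, offered as an assumption rather than something the dmesg confirms: FreeBSD-derived ATA drivers can number disks statically by channel and position (the ATA_STATIC_ID kernel option in FreeBSD), so the unit number depends on where the disk is attached, not on how many disks exist. The function name ad_unit below is hypothetical.

```python
def ad_unit(channel, slave=False):
    """Static unit number: two slots per ATA channel, master first.

    Assumes static numbering; a kernel built without it would
    number disks sequentially instead.
    """
    return 2 * channel + (1 if slave else 0)


# Attachments listed in the dmesg above
print("ata0-master -> ad%d" % ad_unit(0))
print("ata2-master -> ad%d" % ad_unit(2))
```

With that scheme ad1 through ad3 are simply unused slots (ata0-slave, ata1-master, ata1-slave). The CD-ROM on ata1-master takes an acd number from a separate sequence.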
Mounting root from ufs:/dev/ad4s3a
ad4: UDMA ICRC error writing fsbn 249842603 of 110387664-110387679 (ad4 bn 249842603; cn 15551 tn 250 sn 38) retrying
ad4: UDMA ICRC error writing fsbn 249842603 of 110387664-110387679 (ad4 bn 249842603; cn 15551 tn 250 sn 38) retrying
ad4: UDMA ICRC error writing fsbn 249842603 of 110387664-110387679 (ad4 bn 249842603; cn 15551 tn 250 sn 38) retrying
ad4: UDMA ICRC error writing fsbn 488278315 of 229605520-229605535 (ad4 bn 488278315; cn 30393 tn 234 sn 28) retrying
ad4: UDMA ICRC error writing fsbn 488278315 of 229605520-229605535 (ad4 bn 488278315; cn 30393 tn 234 sn 28) retrying
ad4: UDMA ICRC error writing fsbn 488278443 of 229605584-229605599 (ad4 bn 488278443; cn 30393 tn 236 sn 30) retrying
You are on slice2, presumably well up in the cylinder count.
Might the areas above be a geometry mapping conflict?
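The cn/tn/sn triples in the error lines can be checked against a candidate geometry. A minimal sketch, assuming bn = (cn * heads + tn) * sectors_per_track + sn, with sn used as a zero-based offset (an assumption about how this driver prints it; the function name chs_to_block is hypothetical):

```python
def chs_to_block(cn, tn, sn, heads, sectors_per_track):
    """Linear block number for a cylinder/track/sector triple.

    sn is treated as a zero-based offset here, which is an assumption;
    conventional CHS addressing counts sectors from 1.
    """
    return (cn * heads + tn) * sectors_per_track + sn


# (bn, cn, tn, sn) copied from the error lines
errors = [
    (249842603, 15551, 250, 38),
    (488278315, 30393, 234, 28),
    (488278443, 30393, 236, 30),
]

for heads in (16, 255):
    matches = all(chs_to_block(cn, tn, sn, heads, 63) == bn
                  for bn, cn, tn, sn in errors)
    print("heads=%d sectors=63 reproduces bn: %s" % (heads, matches))
```

The quoted triples reproduce their bn values with 255 heads and 63 sectors per track, but not with the 16/63 geometry from the ad4 probe line. So the error message and the probe line appear to use different geometries, which may bear on the question above.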
Bill