Errors on SSD
dillon at apollo.backplane.com
Sat Sep 15 08:23:03 PDT 2012
:Model Family: SandForce Driven SSDs
:Device Model: OCZ-VERTEX2
:Serial Number: OCZ-CU25VMZ6117F3NFM
SandForce controller... those have had many bugs over the years,
and so has OCZ's firmware. There might be a firmware update, which
you should look into via OCZ's web site (google for it maybe).
(Of course, updating firmware on an SSD has its own problems, usually
requiring one to throw together a DR-DOS bootable USB stick to run the
firmware updating software).
:Firmware Version: 1.29
:User Capacity: 60,022,480,896 bytes
: ERR=...., SC=...., LL=...., LM=...., LH=...., DEV=...., STS=....
:SMART overall-health self-assessment test result: PASSED
:Warning: This result is based on an Attribute check.
Test result is mostly meaningless.
:ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
: 1 Raw_Read_Error_Rate 0x000f 110 102 050 Pre-fail Always - 0/32677182
No read errors according to the device.
: 5 Retired_Block_Count 0x0033 100 100 003 Pre-fail Always - 0
: 9 Power_On_Hours_and_Msec 0x0032 100 100 000 Old_age Always - 681h+41m+10.690s
681 hours of power-on time. Neither here nor there.
: 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 387
The device was power-cycled 387 times so far. Neither here nor there.
:171 Program_Fail_Count 0x0032 000 000 000 Old_age Always - 0
:172 Erase_Fail_Count 0x0032 000 000 000 Old_age Always - 0
No program or erase failures. That's good.
:174 Unexpect_Power_Loss_Ct 0x0030 000 000 000 Old_age Offline - 156
This just means the plug was pulled on the device without
giving it the shutdown command, 156 times. Usually not an issue.
:177 Wear_Range_Delta 0x0000 000 000 000 Old_age Offline - 0
:181 Program_Fail_Count 0x0032 000 000 000 Old_age Always - 0
:182 Erase_Fail_Count 0x0032 000 000 000 Old_age Always - 0
:187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
:194 Temperature_Celsius 0x0022 030 129 000 Old_age Always - 30 (Min/Max 30/30)
:195 ECC_Uncorr_Error_Count 0x001c 110 102 000 Old_age Offline - 0/32677182
Not sure what this means but it looks ok.
:196 Reallocated_Event_Count 0x0033 100 100 000 Pre-fail Always - 0
No reallocations needed. Neither here nor there but 0 is good.
:231 SSD_Life_Left 0x0013 100 100 010 Pre-fail Always - 0
The SSD has basically no wear on it. The normalized value is 100
(100% life left), well above the failure threshold of 10. Very good.
:233 SandForce_Internal 0x0000 000 000 000 Old_age Offline - 192
:234 SandForce_Internal 0x0032 000 000 000 Old_age Always - 384
:241 Lifetime_Writes_GiB 0x0032 000 000 000 Old_age Always - 384
:242 Lifetime_Reads_GiB 0x0032 000 000 000 Old_age Always - 576
Insofar as the SSD is concerned everything is working properly.
At least that means there's a good chance that the errors you
are seeing are NOT media errors, but firmware issues.
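The same read-through of the attribute table can be roughed out as a script. This is only a sketch: the rows are pasted from the quoted output, the ERROR_COUNTERS watch list and the function names are my own invention, and the rule "a normalized VALUE at or below a nonzero THRESH counts as failed" is the usual smartctl convention rather than anything SandForce-specific.

```python
import re

# Rows copied from the quoted smartctl output, with the leading ':' removed.
SMART_ROWS = """\
  5 Retired_Block_Count 0x0033 100 100 003 Pre-fail Always - 0
171 Program_Fail_Count 0x0032 000 000 000 Old_age Always - 0
172 Erase_Fail_Count 0x0032 000 000 000 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
196 Reallocated_Event_Count 0x0033 100 100 000 Pre-fail Always - 0
231 SSD_Life_Left 0x0013 100 100 010 Pre-fail Always - 0
"""

# Hypothetical watch list: counters whose raw value should stay at zero.
ERROR_COUNTERS = {
    "Retired_Block_Count", "Program_Fail_Count", "Erase_Fail_Count",
    "Reported_Uncorrect", "Reallocated_Event_Count",
}

def parse_row(line):
    """Split one attribute row into its ten columns, or return None."""
    parts = line.split(None, 9)
    if len(parts) < 10 or not parts[0].isdigit():
        return None
    return {
        "id": int(parts[0]), "name": parts[1],
        "value": int(parts[3]), "thresh": int(parts[5]),
        "raw": parts[9].strip(),
    }

def check(lines):
    """Return a list of complaints; an empty list means nothing was flagged."""
    findings = []
    for line in lines:
        attr = parse_row(line)
        if attr is None:
            continue
        # A normalized value at or below a nonzero threshold counts as failed.
        if attr["thresh"] > 0 and attr["value"] <= attr["thresh"]:
            findings.append(
                f'{attr["name"]}: value {attr["value"]} <= thresh {attr["thresh"]}')
        # Error counters are judged by the leading integer of the raw value.
        m = re.match(r"\d+", attr["raw"])
        if attr["name"] in ERROR_COUNTERS and m and int(m.group()) != 0:
            findings.append(f'{attr["name"]}: raw count {m.group()}')
    return findings

findings = check(SMART_ROWS.splitlines())
print(findings or "nothing flagged")
```

For the rows above nothing is flagged, which matches the conclusion that the device itself reports no media problems.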
On the plus side we know it's an OCZ SSD which means that you
need to research possible firmware upgrades for the device.
OCZ has had many, MANY bad firmwares over the years. Far
more than other SSD vendors.
I'm not sure regarding the AHCI errors you are getting.
Could it be related to the kernel? Well, not likely.
We've actually fixed a few bugs in the AHCI driver in
later kernels, but I doubt any of those could cause new
errors to occur.
AHCI reports your SSD as:
ahci0.2: Found DISK "OCZ-VERTEX2 1.29" serial="OCZ-CU25VMZ6117F3NFM"
So, version 1.29 firmware. I went to OCZ's site and the current
firmware release is 1.37.
So I think it is definitely worth going through the hell of
updating the firmware.
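If you want to script the version check, something like the following would do. It is a sketch under assumptions: the regular expression is guessed from the single AHCI line quoted above (model, a space, then a dotted numeric firmware version), the helper names are made up, and 1.37 is simply the release I saw on OCZ's site when writing this.

```python
import re

# The probe line exactly as the AHCI driver printed it.
AHCI_LINE = 'ahci0.2: Found DISK "OCZ-VERTEX2 1.29" serial="OCZ-CU25VMZ6117F3NFM"'
LATEST = "1.37"  # current release per the vendor site at the time of the post

def installed_firmware(line):
    """Pull the firmware version out of a 'Found DISK' line, or return None."""
    # Assumed layout: "<model> <version>" inside the first pair of quotes.
    m = re.search(r'Found DISK "(.+?) ([\d.]+)"', line)
    return m.group(2) if m else None

def version_tuple(version):
    """Turn '1.29' into (1, 29) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))

current = installed_firmware(AHCI_LINE)
needs_update = version_tuple(current) < version_tuple(LATEST)
print(f"installed {current}, latest {LATEST}, update needed: {needs_update}")
```

Comparing tuples of integers rather than raw strings avoids surprises such as "1.9" sorting after "1.10".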
<dillon at backplane.com>