kernel work week of 3-Feb-2010 HEADS UP
Matthew Dillon
dillon at apollo.backplane.com
Fri Feb 5 11:09:29 PST 2010
Ok, on the wear leveling I've done some further research but there
seems to be a lot of confusion. For its 40G MLC drives Intel seems
to be saying that the write endurance is 40TB. Everyone seems to agree
that the write endurance scales with capacity, so e.g. the 80G drive
would have double the endurance.
This means the MLC cells either have a 1000 write cycle endurance
and Intel is using static wear leveling, or the cells have a 10,000
write cycle endurance and Intel is not using static wear leveling.
I can't tell which it is.
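As a rough check on where the 1000 figure comes from, here is a small
sketch of the arithmetic. It assumes decimal units and ideal wear
leveling, and the function and variable names are made up for
illustration. The last value is only one way to read the 10,000-cycle
case: the fraction of the ideal total that the rated endurance would
then represent.

```python
# Decimal units assumed (1 GB = 10**9 bytes); the vendor figures may differ.
GB = 10**9
TB = 10**12


def full_drive_writes(endurance_bytes, capacity_bytes):
    """How many times the whole drive could be rewritten within the rated endurance."""
    return endurance_bytes / capacity_bytes


# 40G drive rated at 40TB, and the 80G drive assuming endurance doubles with capacity.
writes_40g = full_drive_writes(40 * TB, 40 * GB)
writes_80g = full_drive_writes(80 * TB, 80 * GB)

# If the cells were actually good for 10,000 cycles, the rated endurance would
# only amount to this fraction of the ideal, perfectly leveled total.
fraction_of_ideal = writes_40g / 10_000

print(writes_40g, writes_80g, fraction_of_ideal)
```

Both drives come out at the same number of full-drive passes, which is
what "endurance scales with capacity" amounts to.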
In other articles and such I hear the phrase '10GB/day for 5 years',
but a 40TB write endurance would be more like 10GB/day for 10 years.
I don't know which is the case.
If we go with the 10GB/day concept then the continuous write rate
is limited to around 100KB/sec (8.8G/day).
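The conversions behind these figures can be sketched as below. The
unit choices are assumptions: decimal GB/TB for the drive figures, and
both 1024-byte and 1000-byte KB for the write rate, since the text does
not say which is meant. All names are illustrative.

```python
GB = 10**9
TB = 10**12
SECONDS_PER_DAY = 86400
DAYS_PER_YEAR = 365


def lifetime_bytes(bytes_per_day, years):
    """Total bytes written at a steady daily rate over the given number of years."""
    return bytes_per_day * DAYS_PER_YEAR * years


def bytes_per_day(bytes_per_sec):
    """Daily volume produced by a continuous write rate."""
    return bytes_per_sec * SECONDS_PER_DAY


five_years = lifetime_bytes(10 * GB, 5)    # total for '10GB/day for 5 years'
ten_years = lifetime_bytes(10 * GB, 10)    # total for 10GB/day over 10 years

daily_binary_kb = bytes_per_day(100 * 1024)   # 100KB/sec with KB = 1024 bytes
daily_decimal_kb = bytes_per_day(100 * 1000)  # 100KB/sec with KB = 1000 bytes

# Continuous rate that would exactly use up a 10GB/day budget.
rate_for_10gb_day = 10 * GB / SECONDS_PER_DAY

print(five_years / TB, ten_years / TB)
print(daily_binary_kb / GB, daily_decimal_kb / GB)
print(rate_for_10gb_day / 1024)
```

With 1024-byte kilobytes the daily volume matches the 8.8G/day figure,
and the ten-year total lands a little under the 40TB rating.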
--
For the swapcache this implies that I should set the initial burst
after reboot to something like 2-3 GB and the accumulation rate
to 100K/s or so. This presumes the machine doesn't reboot very often.
It gives the system a nice burst of cache loading after boot, then
regulates the write rate once the burst is exhausted (if it ever is,
since the accumulation rate is constantly adding back into it).
To be clear here, the burst value builds back up at the accumulation
rate whenever the system does not find anything to write to the SSD.
So it is possible to have multiple bursts over time but still stay
within the average bandwidth limits set in the sysctl.
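The burst/accumulation behavior described here resembles a token
bucket. Below is a minimal sketch of that idea, not the actual
swapcache code. The class and method names are hypothetical, and
capping the credit at the initial burst size is an assumption that the
text does not confirm.

```python
class WriteBudget:
    """Token-bucket sketch: write credit in bytes, refilled at a fixed rate."""

    def __init__(self, burst_bytes, rate_bytes_per_sec):
        self.cap = burst_bytes          # assumed ceiling on accumulated credit
        self.rate = rate_bytes_per_sec  # accumulation rate
        self.credit = burst_bytes       # full burst available right after boot

    def tick(self, seconds):
        """Add back credit for elapsed time, never exceeding the cap."""
        self.credit = min(self.cap, self.credit + self.rate * seconds)

    def try_write(self, nbytes):
        """Spend credit for a write to the SSD; refuse if the budget is short."""
        if nbytes > self.credit:
            return False
        self.credit -= nbytes
        return True


# Toy numbers so the behavior is easy to follow: 1000-byte burst, 10 bytes/sec.
budget = WriteBudget(burst_bytes=1000, rate_bytes_per_sec=10)
spent_burst = budget.try_write(1000)   # initial burst is spent in one go
refused = budget.try_write(1)          # nothing left until credit accumulates
budget.tick(5)                         # five idle seconds add back 50 bytes
credit_after_idle = budget.credit
budget.tick(1000)                      # a long idle period refills to the cap
credit_after_long_idle = budget.credit
```

Because credit only ever comes from the initial burst plus the
accumulation rate, the long-run average write rate stays bounded by
that rate no matter how the bursts are spent.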
At the moment I am setting the defaults to 1G burst and 1MB/s.
So far in my testing it is clear that we want a pretty hefty burst
at the beginning. Once we reach steady state (say the swap space
reaches its max limit of 3/4 full) the write rate effectively
becomes an eviction/replace rate. Note: I haven't written any
eviction code yet, beyond what happens naturally when vnodes get
recycled. So as long as the access footprint doesn't change radically
the write rate should be able to keep up. Potentially, if the access
footprint has periods of stability, the burst value can rebuild over
time, so the next radical change in the access footprint is able to
burst a fresh set of data into the SSD.
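To get a feel for how long a period of stability is needed before a
full burst is available again, here is a rough sketch. It assumes
binary units (1K = 1024 bytes), a burst that starts from empty, and no
writes to the SSD while it rebuilds. The names are illustrative only.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def rebuild_seconds(burst_bytes, rate_bytes_per_sec):
    """Idle time needed to accumulate a full burst from zero."""
    return burst_bytes / rate_bytes_per_sec


# Current defaults: 1G burst at 1MB/s.
default_rebuild = rebuild_seconds(1 * GB, 1 * MB)

# The more conservative settings discussed earlier: 2G burst at 100K/s.
conservative_rebuild = rebuild_seconds(2 * GB, 100 * KB)

print(default_rebuild / 60, "minutes")
print(conservative_rebuild / 3600, "hours")
```

Under these assumptions the defaults rebuild a full burst in well under
an hour, while the 100K/s setting takes several hours.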
-Matt