kernel work week of 3-Feb-2010 HEADS UP

Matthew Dillon dillon at apollo.backplane.com
Fri Feb 5 11:09:29 PST 2010


    Ok, on the wear leveling I've done some further research but there
    seems to be a lot of confusion.  For Intel's 40G MLC drives they seem
    to be saying that 40TB is the write endurance.  Everyone seems to agree
    that the write endurance scales with drive capacity, so e.g. the 80G drive
    would have double the endurance.

    This means the MLC cells either have a 1000 write cycle endurance
    and Intel is using static wear leveling, or the cells have a 10,000
    write cycle endurance and Intel is not using static wear leveling.
    I can't tell which it is.
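
    As a back-of-the-envelope check on those two cases, here is a small
    C sketch.  The function name is made up, and modelling "no static
    wear leveling" as "only some percentage of the cells ever get
    rewritten" is an assumption for illustration.  The 10% used below is
    simply the value that makes the 10,000-cycle case come out; it is
    not a published figure.

```c
#include <stdint.h>

/*
 * Per-cell write cycles implied by a rated endurance, assuming writes
 * are spread evenly over worn_pct percent of the drive (hypothetical
 * model; 100 means every cell shares the wear equally).
 * Returns 0 for a zero capacity or percentage.
 * Note: endurance_bytes * 100 must fit in 64 bits.
 */
uint64_t
implied_cycles(uint64_t endurance_bytes, uint64_t capacity_bytes,
    unsigned worn_pct)
{
	if (capacity_bytes == 0 || worn_pct == 0)
		return (0);
	return ((endurance_bytes * 100) / (capacity_bytes * worn_pct));
}
```

    With decimal units, 40TB over a 40GB drive at 100% gives 1000
    cycles, and the same endurance concentrated on 10% of the cells
    gives 10,000.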

    In other articles and such I hear the phrase '10GB/day for 5 years',
    but a 40TB write endurance would be 10GB/day for 10 years.  I don't
    know which is the case.

    If we go with the 10GB/day concept then the continuous write rate
    is limited to around 100KB/sec (about 8.8GB/day).
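
    The conversions behind these numbers can be sketched in C as
    follows.  The helper names are hypothetical, and the sketch assumes
    100KB means 100*1024 bytes while GB and TB are decimal.

```c
#include <stdint.h>

#define SECS_PER_DAY	86400ULL

/* Bytes written per day at a sustained rate in bytes per second. */
uint64_t
bytes_per_day(uint64_t bytes_per_sec)
{
	return (bytes_per_sec * SECS_PER_DAY);
}

/*
 * Whole days a rated endurance lasts at a given daily write volume.
 * Returns 0 if the daily volume is 0.
 */
uint64_t
endurance_days(uint64_t endurance_bytes, uint64_t daily_bytes)
{
	if (daily_bytes == 0)
		return (0);
	return (endurance_bytes / daily_bytes);
}
```

    Under those assumptions 100KB/sec is 8,847,360,000 bytes a day, and
    40TB at 10GB/day lasts 4000 days, a little under 11 years.  Five
    years at 10GB/day would only account for about 18TB.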

    --

    For the swapcache this implies that I should set the initial burst
    after reboot to something like 2-3GB and the accumulation rate
    to 100KB/s or so.  This presumes the machine doesn't reboot very
    often.  It gives the system a nice burst to load the cache after
    boot, but then regulates the write rate once the burst is exhausted
    (if it ever is, since the accumulation rate is constantly being
    added back into it).

    To be clear here, the burst value builds back up over time, at the
    accumulation rate, whenever the system does not find anything to
    write to the SSD.  So it is possible to have multiple bursts over
    time but still stay within the average bandwidth limits set in
    the sysctl.
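
    The burst/accumulation scheme is essentially a token bucket.  A
    minimal C sketch of the idea follows.  This is not the swapcache
    code: the struct, the function names, and the choice to cap the
    accumulated burst at its initial value are all assumptions made for
    illustration.

```c
#include <stdint.h>

/* Hypothetical write limiter: a burst allowance refilled at a fixed rate. */
struct wr_limit {
	int64_t	curburst;	/* bytes that may be written right now */
	int64_t	maxburst;	/* assumed cap on the accumulated burst */
	int64_t	accrate;	/* bytes credited back per second */
};

/* Start with the full burst available, as after a reboot. */
void
wr_limit_init(struct wr_limit *wl, int64_t maxburst, int64_t accrate)
{
	wl->curburst = maxburst;
	wl->maxburst = maxburst;
	wl->accrate = accrate;
}

/* Credit 'secs' seconds of accumulation, clamped to maxburst. */
void
wr_limit_tick(struct wr_limit *wl, int64_t secs)
{
	wl->curburst += wl->accrate * secs;
	if (wl->curburst > wl->maxburst)
		wl->curburst = wl->maxburst;
}

/* Charge up to 'want' bytes; returns how many may actually be written. */
int64_t
wr_limit_take(struct wr_limit *wl, int64_t want)
{
	int64_t n = (want < wl->curburst) ? want : wl->curburst;

	if (n < 0)
		n = 0;
	wl->curburst -= n;
	return (n);
}
```

    An idle system keeps calling wr_limit_tick() without taking
    anything, so the allowance refills toward the cap and a later burst
    is possible, while the long-term average stays at accrate.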

    At the moment I am setting the defaults to 1G burst and 1MB/s.

    So far in my testing it is clear that we want a pretty hefty burst
    at the beginning.  Once we reach steady state (say the swap space
    reaches its maximum limit of 3/4 full) then the write rate effectively
    becomes an eviction/replace rate.  Note: I haven't written any
    eviction code yet, beyond what happens naturally when vnodes get
    recycled.  So as long as the access footprint doesn't change radically
    it should be able to keep up.  If the access footprint has periods
    of stability the burst value can rebuild over time, so the next
    radical change in the access footprint is able to burst a fresh set
    of data into the SSD.

						-Matt





