Futures - HAMMER comparison testing?

Matthew Dillon dillon at apollo.backplane.com
Fri Jan 18 00:37:19 PST 2008


:But - at the end of the day - how much [extra?] on-disk space will be 
:needed to ensure mount 'as-of' is 'good enough' for some realistic span 
:(a week?, a month?)? 'Forever' may be too much to ask.

    The amount of disk needed is precisely the same as the amount of
    historical data (different from current data) that must be retained,
    plus record overhead.

    So it comes down to how much space you are willing to eat up to store
    the history, and what kind of granularity you will want for the history.
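
    As a rough back-of-the-envelope sketch of that relationship (not
    HAMMER code): history space is the retained historical data plus a
    per-record overhead.  The function name, the 64-byte overhead, and
    the daily churn figures below are all made-up placeholders, not
    measured HAMMER values.

```python
def history_bytes(changed_bytes_per_day, records_per_day, days, record_overhead):
    """Estimate the space needed to retain `days` of history.

    Every parameter is a hypothetical input chosen by the reader; nothing
    here is taken from HAMMER's actual on-disk format.
    """
    per_day = changed_bytes_per_day + records_per_day * record_overhead
    return days * per_day


# Made-up example: 1 GiB of changed data and 10,000 new records per day,
# kept for 30 days, with an assumed 64 bytes of overhead per record.
estimate = history_bytes(2**30, 10_000, 30, 64)
print(estimate)
```

    Halving the retention span, or keeping fewer intermediate versions
    (coarser granularity), shrinks the estimate proportionally.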

:How close are we to being able to start predicting that storage-space 
:efficiency relative to ${some_other_fs}?
:
:Bill

    Ultimately it will be extremely efficient simply by the fact that
    there will be a balancer going through it and repacking it.

    For the moment (and through the alpha release) it will be fairly
    inefficient because it is using fixed 16K data records, even for small
    files.  The on-disk format doesn't care... records can reference 
    variable-length data from around 1MB down to 64 bytes.  But supporting
    variable-length data requires implementing some overwrite cases that
    I don't want to do right now.  This only applies to regular files
    of course.  Directories store directory entries as records, not as data,
    so directories are packed really nicely. 
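
    To illustrate why fixed 16K data records are wasteful for small
    files, here is a small arithmetic sketch.  It assumes, purely for
    illustration, that variable-length data would be rounded up to a
    64-byte unit; the actual allocation rule is not stated above, and
    the function names are hypothetical.

```python
FIXED_RECORD = 16 * 1024   # fixed data record size used through the alpha
ASSUMED_GRANULE = 64       # assumption: variable-length data rounds up to 64 bytes


def round_up(size, unit):
    """Round size up to the next multiple of unit."""
    return -(-size // unit) * unit


def fixed_footprint(file_size):
    """Data space used when every record holds a fixed 16K of data."""
    return round_up(file_size, FIXED_RECORD)


def variable_footprint(file_size):
    """Data space used under the assumed 64-byte rounding."""
    return round_up(file_size, ASSUMED_GRANULE)


for size in (100, 4000, 20000):
    print(size, fixed_footprint(size), variable_footprint(size))
```

    Under these assumptions a 100-byte file occupies 16384 bytes of data
    space with fixed records versus 128 bytes with variable-length data.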

    e.g. if you have one record representing, say, 1MB of data, and you
    write 64 bytes right smack in the middle of that, the write code will
    have to take that one record, mark it as deleted, then create three
    records to replace it (one pointing to the unchanged left portion of
    the original data, one pointing to the 64 bytes of overwritten data,
    and one pointing to the unchanged right portion of the original data).
    The recovery and deletion code will also have to deal with that sort
    of overlaid data situation.  I'm not going to be writing that
    feature for a bit.  There are some quick hacks I can do too, for
    small files, but it's not on my list prior to the alpha release.
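
    The three-way split described above can be sketched roughly as
    follows.  This is a minimal illustration only, not HAMMER's write
    code: the Record type, its field names, and the overwrite() helper
    are hypothetical, and the recovery and deletion side is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    """Hypothetical record mapping a file byte range onto stored data."""
    file_off: int          # starting offset within the file
    length: int            # number of bytes covered
    data_off: int          # where the bytes live in the data store (made up)
    deleted: bool = False  # historical records are marked, not removed


def overwrite(rec: Record, off: int, length: int, new_data_off: int) -> List[Record]:
    """Return the records that replace `rec` after writing [off, off+length)."""
    end = off + length
    rec_end = rec.file_off + rec.length
    if length <= 0 or off < rec.file_off or end > rec_end:
        raise ValueError("overwrite must fall inside the record")

    # The original record is kept but marked deleted, so history survives.
    out = [Record(rec.file_off, rec.length, rec.data_off, deleted=True)]
    if off > rec.file_off:
        # Unchanged left portion still points at the original data.
        out.append(Record(rec.file_off, off - rec.file_off, rec.data_off))
    # The newly written bytes live somewhere new.
    out.append(Record(off, length, new_data_off))
    if end < rec_end:
        # Unchanged right portion points into the original data past the hole.
        out.append(Record(end, rec_end - end, rec.data_off + (end - rec.file_off)))
    return out


MB = 1024 * 1024
old = Record(file_off=0, length=MB, data_off=0)
result = overwrite(old, MB // 2, 64, new_data_off=MB)
for r in result:
    print(r)
```

    In this sketch the left and right records reference the original
    data in place rather than copying it; only the 64 overwritten bytes
    need new storage.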

    Remember that HAMMER is designed for large filesystems which don't fill
    up instantly.  Consequently it will operate under the assumption that
    it can take its time to recover free space.  If one doesn't want to use
    the history feature one can turn it off, of course, or use a very
    granular retention policy.

    My local backup system is currently using a 730GB UFS partition and it
    is able to back up apollo, crater, and leaf with daily cpdups (using
    the hardlink snapshot trick) going back about 3 months.  In fact, I
    can only fill up that 730GB about halfway because fsck runs out of
    memory and fails once you get over around 50 million inodes (mostly
    dependent on the number of directories you have)... on UFS that is.
    I found that out the hard way.  It takes almost a day for fsck to
    recover the filesystem even half full.  I'll be happy when I can throw
    that old stuff away.

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>




