Futures - HAMMER comparison testing?
Matthew Dillon
dillon at apollo.backplane.com
Fri Jan 18 00:37:19 PST 2008
:But - at the end of the day - how much [extra?] on-disk space will be
:needed to ensure mount 'as-of' is 'good enough' for some realistic span
:(a week?, a month?)? 'Forever' may be too much to ask.
The amount of disk needed is precisely the same as the amount of
historical data (different from current data) that must be retained,
plus record overhead.
So it comes down to how much space you are willing to eat up to store
the history, and what kind of granularity you will want for the history.
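
A rough back-of-the-envelope sketch of that relationship, in Python. The function name, its parameters, and any numbers fed to it are illustrative assumptions, not measured HAMMER figures; in particular the per-record overhead is left as an input because the post does not state it.

```python
def history_bytes(changed_bytes_per_day, days_retained,
                  records_per_day, overhead_per_record):
    """Estimate the space needed to retain history.

    Historical data costs exactly what it changed, plus a fixed
    overhead for each record that describes it. All four inputs are
    assumptions supplied by the caller.
    """
    data = changed_bytes_per_day * days_retained
    overhead = records_per_day * overhead_per_record * days_retained
    return data + overhead
```

Coarser granularity (fewer retained versions) shows up here as a smaller effective changed_bytes_per_day and records_per_day, since intermediate versions between snapshots need not be kept.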
:How close are we to being able to start predicting that storage-space
:efficiency relative to ${some_other_fs}?
:
:Bill
Ultimately it will be extremely efficient simply by the fact that
there will be a balancer going through it and repacking it.
For the moment (and through the alpha release) it will be fairly
inefficient because it is using fixed 16K data records, even for small
files. The on-disk format doesn't care... records can reference
variable-length data from around 1MB down to 64 bytes. But supporting
variable-length data requires implementing some overwrite cases that
I don't want to do right now. This only applies to regular files
of course. Directories store directory entries as records, not as data,
so directories are packed really nicely.
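
To illustrate why fixed 16K data records are wasteful for small files, here is a minimal sketch. It assumes that each file's data is simply rounded up to whole 16K records; the names RECORD_SIZE, allocated_bytes and wasted_bytes are hypothetical and are not taken from the HAMMER source.

```python
RECORD_SIZE = 16 * 1024  # fixed data record size described above


def allocated_bytes(file_size):
    """Bytes of data space consumed, assuming whole 16K records."""
    records = (file_size + RECORD_SIZE - 1) // RECORD_SIZE  # round up
    return records * RECORD_SIZE


def wasted_bytes(file_size):
    """Slack left in the last record of the file."""
    return allocated_bytes(file_size) - file_size
```

Under that assumption a 100-byte file still consumes one full 16384-byte record, leaving 16284 bytes of slack, which is the inefficiency variable-length data would remove.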
e.g. if you have one record representing, say, 1MB of data, and you
write 64 bytes right smack in the middle of that, the write code will
have to take that one record, mark it as deleted, then create three
records to replace it (one pointing to the unchanged left portion of
the original data, one pointing to the 64 bytes of overwritten data,
and one pointing to the unchanged right portion of the original data).
The recovery and deletion code will also have to deal with that sort
of overlaid data situation. I'm not going to be writing that
feature for a bit. There are some quick hacks I can do too, for
small files, but it's not on my list prior to the alpha release.
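
The three-way split described above can be sketched as follows. This is an illustration in Python, not HAMMER's kernel code: the Record fields and the overwrite() helper are hypothetical, and it only handles the simple case where the write falls entirely inside one record.

```python
from dataclasses import dataclass


@dataclass
class Record:
    file_offset: int       # where in the file this record starts
    length: int            # number of bytes it covers
    data_offset: int       # where those bytes live in the data store
    deleted: bool = False  # old records are marked, not erased


def overwrite(rec, write_off, write_len, new_data_offset):
    """Replace one record with up to three after a write inside it."""
    rec_end = rec.file_offset + rec.length
    write_end = write_off + write_len
    if write_len <= 0 or write_off < rec.file_offset or write_end > rec_end:
        raise ValueError("write must lie entirely within the record")

    rec.deleted = True  # the original stays around as history
    pieces = []

    left_len = write_off - rec.file_offset
    if left_len > 0:
        # Unchanged left portion: points at the original data.
        pieces.append(Record(rec.file_offset, left_len, rec.data_offset))

    # The overwritten bytes point at newly written data.
    pieces.append(Record(write_off, write_len, new_data_offset))

    right_len = rec_end - write_end
    if right_len > 0:
        # Unchanged right portion: original data, further in.
        skip = write_end - rec.file_offset
        pieces.append(Record(write_end, right_len, rec.data_offset + skip))

    return pieces
```

Note that the left and right pieces both reference the same original data, which is the overlaid situation the recovery and deletion code would have to understand before that data could be freed.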
Remember that HAMMER is designed for large filesystems which don't fill
up instantly. Consequently it will operate under the assumption that
it can take its time to recover free space. If one doesn't want to use
the history feature one can turn it off, of course, or use a very
granular retention policy.
My local backup system is currently using a 730GB UFS partition and it
is able to back up apollo, crater, and leaf with daily cpdups (using
the hardlink snapshot trick) going back about 3 months. In fact, I
can only fill up that 730GB about half way because fsck runs out of
memory and fails once you get over around 50 million inodes (mostly
dependent on the number of directories you have)... on UFS that is.
I found that out the hard way. It takes almost a day for fsck to
recover the filesystem even half full. I'll be happy when I can throw
that old stuff away.
-Matt
Matthew Dillon
<dillon at backplane.com>