HAMMER in real life
Matthew Dillon
dillon at apollo.backplane.com
Sun Nov 15 10:22:11 PST 2009
:Matt used to use hardlinks for some sort of historical arrangement; after
:a certain point, the total number of hardlinks was too much to handle. He
:might have mentioned this somewhere in the archives. I don't know if this
:would bite you the same way with gmirror.
:
Here's a quick summary:
* First, a filesystem like UFS (and I think UFS2 but I'm not sure) is
limited to 65536 hardlinks per inode.  This limit is quickly reached
when something like a CVS archive (which itself uses hardlinks in the
CVS/ subdirectories) is backed up using the hardlink model (see the
sketch after this list).  Once the limit is hit, the backup has to
fall back to copying instead of linking, which results in a lot of
data duplication and wasted storage.
* Since directories cannot be hardlinked, directories are always
duplicated for each backup. For UFS this is a disaster because
fsck's memory use is partially based on the number of directories.
* UFS's fsck can't handle large numbers of inodes.  Once you get past
a few tens of millions of inodes fsck explodes, not to mention that it
can take 9+ hours to run even when it doesn't explode.  This happened
to me several times back in the days when I used UFS to hold archival
data and backups.  Everything worked dandy until I actually had to
fsck.
Even though background fsck exists, it has never been stable enough to
be practical in a production environment, and even if it were, it
would eat disk bandwidth, potentially for days, after a crash.  I
don't know if that has changed recently or not.
The only workaround is to not store tens of millions of inodes on a
UFS filesystem.
* I believe FreeBSD was talking about adopting some of the LFS work,
or otherwise implementing log space for UFS.  I don't know what the
state of that is, but I will say that it's tough to get something like
this to work right without a lot of actual plug-pulling tests.
Either OpenBSD or NetBSD, I believe, has a log-structured extension to
UFS which works.  Not sure which, sorry.
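To make the hardlink backup model above concrete, here is a minimal
Python sketch.  The backup_tree() helper and the directory layout are
hypothetical (this is not anything UFS or HAMMER ships); it just shows
how each pass hardlinks unchanged files against the previous backup,
so a file that never changes gains one link per run until link(2)
starts failing with EMLINK and the script is forced into a full copy,
while directories, which can't be hardlinked, get recreated every run.

    # Hardlink-model backup sketch (hypothetical helper, illustrative only).
    import errno, os, shutil

    def backup_tree(src, prev, dest):
        for root, dirs, files in os.walk(src):
            rel = os.path.relpath(root, src)
            destdir = os.path.join(dest, rel)
            if not os.path.isdir(destdir):
                os.makedirs(destdir)            # directories are always duplicated
            for name in files:
                s = os.path.join(root, name)
                p = os.path.join(prev, rel, name)
                d = os.path.join(destdir, name)
                if os.path.exists(p) and os.path.getmtime(p) >= os.path.getmtime(s):
                    try:
                        os.link(p, d)           # unchanged: share the inode
                        continue
                    except OSError as e:
                        if e.errno != errno.EMLINK:
                            raise
                        # per-inode link limit reached: fall through and copy
                shutil.copy2(s, d)              # changed, new, or link limit hit

Run something like this daily for a couple of months against a mostly
static archive and the link counts on long-lived files climb by one
per pass while the whole directory tree is duplicated each time, which
is exactly the combination that hurts UFS above.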
With something like ZFS one would use ZFS's snapshots (though they
aren't as fine-grained as HAMMER snapshots).  ZFS's snapshots work
fairly well but have higher maintenance overheads than HAMMER
snapshots when one is trying to delete a snapshot.  HAMMER can delete
several snapshots in a single pass, so the aggregate maintenance
overhead is lower.
With Linux... well, I don't know which filesystem you'd use.  ext4
maybe, if they've fixed the bugs.  I've used reiser in the past (but
obviously that isn't desirable now).
--
For HAMMER, both Justin and I have been able to fill up multi-terabyte
filesystems running bulk pkgsrc builds with the default setup.  It's
fairly easy to fix by adjusting the pruning and reblocking run times
upward in the HAMMER config (aka hammer viconfig <filesystem>).
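For reference, the config that hammer viconfig edits is just a few
"<directive> <period> <limit>" lines.  The exact directives and
default values vary by release, so treat the following as an
illustrative approximation rather than the shipped defaults:

    snapshots 1d 60d     # take a snapshot daily, keep 60 days of them
    prune     1d 5m      # prune daily, stop each run after ~5 minutes
    reblock   1d 5m      # reblock daily, stop each run after ~5 minutes

Raising those per-run time limits (the 5m fields) is what lets pruning
and reblocking keep up with the churn of a bulk build.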
Bulk builds are a bit of a special case.  Due to the way they work, a
bulk build rm -rf's /usr/pkg for EACH package it builds, then
reconstructs it by reinstalling the previously-built dependencies
before building the next package.  This eats disk space like crazy on
a normal HAMMER mount, since the history for all of those deleted
files is retained until the next prune.  It's more manageable with a
'nohistory' HAMMER mount, but my preference, in general, is to use a
normal mount.
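As a rough back-of-envelope (all of these numbers are hypothetical,
not measurements from an actual bulk run), the history that piles up
between prune passes scales with the build rate times the size of the
dependency tree being reinstalled, not with the final size of
/usr/pkg:

    # Hypothetical numbers, for illustration only.
    builds_per_day = 400       # packages built per day in the bulk run
    avg_tree_mb    = 300.0     # average size of /usr/pkg rebuilt per package, MB
    prune_every_d  = 1.0       # days between prune runs

    history_gb = builds_per_day * prune_every_d * avg_tree_mb / 1024.0
    print("history retained between prunes: ~%.0f GB" % history_gb)

That works out to on the order of a hundred gigabytes of history per
day with those made-up numbers, which is why a normal mount fills up
so fast during a bulk build.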
HAMMER does not implement redundancy like ZFS, so if you need
redundancy you'd have to use a RAID card.  For backup systems I
typically don't bother with per-filesystem redundancy since I have
several copies on different machines already.  Not only do the
(HAMMER) production machines have somewhere around 60 days' worth of
snapshots on them, but my on-site backup box has 100 days of daily
snapshots and my off-site backup box has almost 2 years of weekly
snapshots.
So if the backups fit on one or two drives, additional redundancy
isn't really beneficial.  More than that and you'd definitely want
RAID.
-Matt
Matthew Dillon
<dillon at backplane.com>