HAMMER in real life
niklasro.appspot.com
niklasro at gmail.com
Thu Nov 26 22:49:09 PST 2009
FYI, dear experts: while I respect the Trie data model over the B-tree, here is what df shows on my local box and on a remote server:
[live@localhost ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
none 1.7G 26M 1.6G 2% /
[live@localhost ~]$ ssh remote@remoteniklascomputer
Warning: Permanently added 'remoteniklascomputer,remoteniklascomputer'
(RSA) to the list of known hosts.
techrev@techrevelation.com's password:
]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/vzfs 10G 5.4G 4.7G 54% /
[techrev@ip-68-178-227-19 ~]$
On Sun, Nov 15, 2009 at 6:19 PM, Matthew Dillon
<dillon at apollo.backplane.com> wrote:
>
> :Matt used to use hardlinks for some sort of historical arrangement; after
> :a certain point, the total number of hardlinks was too much to handle. He
> :might have mentioned this somewhere in the archives. I don't know if this
> :would bite you the same way with gmirror.
> :
>
> Here's a quick summary:
>
> * First, a filesystem like UFS (and I think UFS2 but I'm not sure)
> is limited to 65536 hardlinks per inode. This limit is quickly
> reached when something like a CVS archive (which itself uses hardlinks
> in the CVS/ subdirectories) is backed up using the hardlink model.
> This results in a lot of data duplication and wasted storage.
>
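> (As an aside, a minimal sketch of the hardlink backup model described
> above, with hypothetical paths: rsync's --link-dest hardlinks files that
> have not changed against the previous day's tree, so every unchanged
> file gains one more link per backup generation, on top of any hardlinks
> the source tree, e.g. a CVS archive, already carries.)
>
>     rsync -a --link-dest=/backup/2009-11-25 /data/ /backup/2009-11-26/
>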
> * Since directories cannot be hardlinked, directories are always
> duplicated for each backup. For UFS this is a disaster because
> fsck's memory use is partially based on the number of directories.
>
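> (One way to get a feel for how many directories a hardlink-style backup
> tree has accumulated, and therefore how much memory fsck will want, is a
> plain find; /backup is a hypothetical mount point.)
>
>     find /backup -xdev -type d | wc -l
>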
> * UFS's fsck can't handle large numbers of inodes. Once you get
> past a few tens of millions of inodes fsck explodes, not to mention
> can take 9+ hours to run even if it does not explode. This happened
> to me several times during the days when I used UFS to hold archival
> data and for backups. Everything worked dandy until I actually had
> to fsck.
>
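> (df's inode view is a quick way to see whether a filesystem is heading
> into that range; /backup again is hypothetical.)
>
>     df -i /backup     # iused/ifree columns show how many inodes are allocated
>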
> Even though things like background fsck exist, it's never been stable
> enough to be practical in a production environment, and even if it were
> it eats disk bandwidth potentially for days after a crash. I don't
> know if that has changed recently or not.
>
> The only workaround is to not store tens of millions of inodes on a
> UFS filesystem.
>
> * I believe that FreeBSD was talking about adopting some of the LFS work,
> or otherwise implementing log space for UFS. I don't know what the
> state of this is but I will say that it's tough to get something like
> this to work right without a lot of actual plug-pulling tests.
>
> Either OpenBSD or NetBSD, I believe, has a log-structured extension to
> UFS which works. Not sure which, sorry.
>
> With something like ZFS one would use ZFS's snapshots (though they aren't
> as fine-grained as HAMMER snapshots). ZFS's snapshots work fairly well
> but have higher maintenance overheads than HAMMER snapshots when one is
> trying to delete a snapshot. HAMMER can delete several snapshots in a
> single pass so the aggregate maintenance overhead is lower.
>
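> (A rough sketch of that difference, with made-up snapshot names. ZFS
> deletes snapshots one destroy at a time, while with HAMMER you remove
> the snapshot softlinks and a single prune pass covers all of them:)
>
>     # ZFS: one destroy per snapshot
>     zfs destroy pool/fs@2009-10-01
>     zfs destroy pool/fs@2009-10-08
>
>     # HAMMER: drop the softlinks, then prune once
>     rm /mysnapshots/snap-20091001 /mysnapshots/snap-20091008
>     hammer prune /mysnapshots
>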
> With Linux... well, I don't know which filesystem you'd use. ext4 maybe,
> if they've fixed the bugs. I've used reiser in the past (but obviously
> that isn't desirable now).
>
> --
>
> For HAMMER, both Justin and I have been able to fill up multi-terabyte
> filesystems running bulk pkgsrc builds with default setups. It's fairly
> easy to fix by adjusting the run times for pruning and reblocking upward
> in the HAMMER config (aka hammer viconfig <filesystem>).
>
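> (For reference, the file edited by 'hammer viconfig' is just a handful
> of lines; roughly, each names a cleanup operation, how often it should
> run, and for the prune/reblock lines how long a run may take. The values
> below are an example of what bumping the run times might look like, not
> a recommendation:)
>
>     snapshots 1d 60d
>     prune     1d 15m
>     reblock   1d 15m
>     recopy    30d 10m
>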
> Bulk builds are a bit of a special case. Due to the way they work, a
> bulk build rm -rf's /usr/pkg for EACH package it builds, then
> reconstructs it by installing the necessary dependencies previously
> created before building the next package. This eats disk space like
> crazy on a normal HAMMER mount. It's more manageable if one does a
> 'nohistory' HAMMER mount, but my preference, in general, is to use a
> normal mount.
>
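> (If one does decide to give up history for a bulk-build area, the usual
> knobs are the nohistory mount option or the nohistory file flag on the
> directory itself; the device and paths here are made up.)
>
>     # per-mount
>     mount_hammer -o nohistory /dev/ad4s1d /build
>
>     # or per-directory; my understanding is new files underneath inherit it
>     chflags nohistory /usr/pkg
>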
> HAMMER does not implement redundancy like ZFS, so if redundancy is
> needed you'd need to use a RAID card. For backup systems I typically
> don't bother with per-filesystem redundancy since I have several copies
> on different machines already. Not only do the (HAMMER) production
> machines have somewhere around 60 days' worth of snapshots on them,
> but my on-site backup box has 100 days of daily snapshots and my
> off-site backup box has almost 2 years of weekly snapshots.
>
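> (One way to keep those extra copies on other machines, assuming PFS
> mirroring is set up, is hammer's mirroring commands; the hostnames and
> paths here are invented for illustration.)
>
>     # one-shot copy of a master PFS to a slave on the backup box
>     hammer mirror-copy /home backupbox:/backup/home-slave
>
>     # or keep the slave continuously synced
>     hammer mirror-stream /home backupbox:/backup/home-slave
>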
> So if the backups fit on one or two drives, additional redundancy isn't
> really beneficial. More than that and you'd definitely want RAID.
>
> -Matt
> Matthew Dillon
> <dillon at backplane.com>
>
>