HAMMER in real life

niklasro.appspot.com niklasro at gmail.com
Thu Nov 26 22:49:09 PST 2009


FYI, dear experts: while I respect the trie data model over the B-tree, here is the disk situation on my local machine and on a remote host:
[live@localhost ~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
none                  1.7G   26M  1.6G   2% /
[live@localhost ~]$ ssh remote@remoteniklascomputer
Warning: Permanently added 'remoteniklascomputer,remoteniklascomputer'
(RSA) to the list of known hosts.
techrev@techrevelation.com's password:
]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/vzfs              10G  5.4G  4.7G  54% /
[techrev@ip-68-178-227-19 ~]$


On Sun, Nov 15, 2009 at 6:19 PM, Matthew Dillon
<dillon at apollo.backplane.com> wrote:
>
> :Matt used to use hardlinks for some sort of historical arrangement; after
> :a certain point, the total number of hardlinks was too much to handle.  He
> :might have mentioned this somewhere in the archives.  I don't know if this
> :would bite you the same way with gmirror.
> :
>
>    Here's a quick summary:
>
>    * First, a filesystem like UFS (and I think UFS2 but I'm not sure)
>      is limited to 65536 hardlinks per inode.  This limit is quickly
>      reached when something like a CVS archive (which itself uses hardlinks
>      in the CVS/ subdirectories) is backed up using the hardlink model.
>      This results in a lot of data duplication and wasted storage.
>
>    * Since directories cannot be hardlinked, directories are always
>      duplicated for each backup.  For UFS this is a disaster because
>      fsck's memory use is partially based on the number of directories.
>
>    * UFS's fsck can't handle large numbers of inodes.  Once you get
>      past a few tens of millions of inodes fsck explodes, not to mention
>      can take 9+ hours to run even if it does not explode.  This happened
>      to me several times during the days where I used UFS to hold archival
>      data and for backups.  Everything worked dandy until I actually had
>      to fsck.
>
>      Even though things like background fsck exist, it's never been stable
>      enough to be practical in a production environment, and even if it were
>      it eats disk bandwidth potentially for days after a crash.  I don't
>      know if that has changed recently or not.
>
>      The only workaround is to not store tens of millions of inodes on a
>      UFS filesystem.
>
>    * I believe that FreeBSD was talking about adopting some of the LFS work,
>      or otherwise implementing log space for UFS.  I don't know what the
>      state of this is but I will say that it's tough to get something like
>      this to work right without a lot of actual plug-pulling tests.
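
One common way to implement the hardlink backup model from the first bullet
is rsync's --link-dest: each new backup tree hardlinks unchanged files
against the previous tree, so a long-lived file keeps accumulating link
count on the same inode.  A rough sketch with placeholder paths, plus df -i
as a quick way to see how many inodes a filesystem is carrying (the number
that bites fsck in the third bullet):

  rsync -a --link-dest=/backups/2009-11-25 /home/ /backups/2009-11-26/
  df -i /backups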
>
>    Either OpenBSD or NetBSD, I believe, has a log-structured extension to
>    UFS which works.  Not sure which, sorry.
>
>    With something like ZFS one would use ZFS's snapshots (though they aren't
>    as fine-grained as HAMMER snapshots).  ZFS's snapshots work fairly well
>    but have higher maintenance overheads than HAMMER snapshots when one is
>    trying to delete a snapshot.  HAMMER can delete several snapshots in a
>    single pass, so the aggregate maintenance overhead is lower.
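
For comparison, deleting old snapshots looks roughly like this on each side
(pool, dataset, and snapshot-directory names are placeholders, and exact
hammer(8) usage varies with the HAMMER version):

  # ZFS: one destroy per snapshot
  zfs destroy tank/backup@2009-11-01
  zfs destroy tank/backup@2009-11-02

  # HAMMER: a single prune pass covers every snapshot softlink in the directory
  hammer prune /backup/snapshots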
>
>    With Linux... well, I don't know which filesystem you'd use.  ext4 maybe,
>    if they've fixed the bugs.  I've used reiser in the past (but obviously
>    that isn't desirable now).
>
>    --
>
>    For HAMMER, both Justin and I have been able to fill up multi-terabyte
>    filesystems running bulk pkgsrc builds with default setups.  It's fairly
>    easy to fix by adjusting up the HAMMER config (aka hammer viconfig
>    <filesystem>) run times for pruning and reblocking.
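
For reference, the config edited with 'hammer viconfig' is a small text
file: each line names a maintenance task, how often it runs, and either a
retention period (for snapshots) or a run-time limit.  The directives and
values below are only illustrative and vary by HAMMER version:

  snapshots 1d 60d
  prune     1d 5m
  reblock   1d 5m
  recopy    30d 10m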
>
>    Bulk builds are a bit of a special case.  Due to the way they work a
>    bulk build rm -rf's /usr/pkg for EACH package it builds, then
>    reconstructs it by installing the necessary dependencies previously
>    created before building the next package.  This eats disk space like
>    crazy on a normal HAMMER mount.  It's more manageable if one does a
>    'nohistory' HAMMER mount, but my preference, in general, is to use a
>    normal mount.
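
If one does want the 'nohistory' treatment for a build area, it can be set
at mount time or per directory tree; the device name and path below are
placeholders:

  mount_hammer -o nohistory /dev/serno/XXXXXXXX.s1d /usr/pkg
  # or, per-tree on an existing mount:
  chflags -R nohistory /usr/pkg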
>
>    HAMMER does not implement redundancy like ZFS, so if redundancy is
>    needed you'd need to use a RAID card.  For backup systems I typically
>    don't bother with per-filesystem redundancy since I have several copies
>    on different machines already.  Not only do the (HAMMER) production
>    machines have somewhere around 60 days' worth of snapshots on them,
>    but my on-site backup box has 100 days of daily snapshots and my
>    off-site backup box has almost 2 years of weekly snapshots.
>
>    So if the backups fit on one or two drives additional redundancy isn't
>    really beneficial.  More than that and you'd definitely want RAID.
>
>                                        -Matt
>                                        Matthew Dillon
>                                        <dillon at backplane.com>
>
>




