hammer prune explanation

Matthew Dillon dillon at apollo.backplane.com
Sat May 10 14:03:11 PDT 2008

:Yeah, I was thinking about wildcarding as well.
:But is it possible to implement it within cmd_prune.c, or do I have to
:modify the ioctl kernel code? If done in cmd_prune.c, I somehow have to
:iterate over all deleted files and call the prune command for it.
:I thought, it's easier to introduce a check in the kernel, whether the
:file that should be pruned matches a given pattern. Doesn't sound very
:hard to do, if it is easy to get the pathname for a given inode.
:Are you thinking about something like the archive flag?

    I think it is probably best to implement that level of sophistication
    in the utility rather than in the kernel.  The pruning ioctl code
    has no concept of files or directories... literally it has no concept.
    All it understands, really, are object id's (aka inode numbers) and

    The hammer utility on the other hand can actually scan the filesystem

    Locating wholly deleted files and directories is not hard to do.
    As-of queries can be used to access earlier versions of a directory.

    We might want to add some kernel support to make it more efficient,
    for example to make it possible for the hammer utility to have
    visibility into all deleted directory entries.  It could use that
    visibility to do as-of accesses and through that mechanic would thus
    have visibility into all deleted files and directories.

    Inode numbers are never reused, so the inode number (and hence object
    id) of a deleted file will be just as unique as the inode number 
    for one that is still visible.
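
    As a rough illustration of doing this in the utility rather than the
    kernel, the sketch below compares an earlier view of a directory with
    its current view and reports entries whose inode numbers are no longer
    present, optionally filtered by a wildcard.  It is only a sketch, not
    the hammer utility's code: the function name, the two directory
    arguments and the pattern parameter are assumptions made for the
    example, and it takes for granted that the earlier view has already
    been obtained (for instance through an as-of access).

```python
import fnmatch
import os


def deleted_entries(old_dir, cur_dir, pattern="*"):
    """Return (name, inode) pairs seen in old_dir but gone from cur_dir.

    old_dir is assumed to be an earlier view of the directory that cur_dir
    shows now.  Matching is done on inode numbers, which works because
    inode numbers are never reused.
    """
    # Inode numbers that are still visible in the current view.
    with os.scandir(cur_dir) as it:
        live = {entry.inode() for entry in it}

    gone = []
    with os.scandir(old_dir) as it:
        for entry in it:
            # Skip names that do not match the requested wildcard.
            if not fnmatch.fnmatch(entry.name, pattern):
                continue
            # An inode present earlier but absent now is no longer
            # visible in this directory.
            if entry.inode() not in live:
                gone.append((entry.name, entry.inode()))
    return sorted(gone)
```

    A renamed file keeps its inode number, so it is not reported.  A file
    moved to another directory would be reported, so a real tool would
    have to check more than one directory before treating an object as
    wholly deleted.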

: >     Right now any serious HAMMER user needs to set up at least a daily
: >     cron job to prune and reblock the filesystem.  I added a '-t timeout'
: >     feature to the HAMMER utility to allow the operations to be
: >     set up in a cron job and keep the filesystem up to snuff over a long
: >     period of time.  So, e.g. you would have a nightly cron job that
: >     did this:
: >
: > 	# spend up to 5 minutes pruning the filesystem and another
: > 	# 5 minutes reblocking it, then stop.
: > 	hammer -t 300 prune /myfilesystem; hammer -t 300 reblock /myfilesystem
:Does this degrade filesystem seriously?
:   Michael

    For the time it is running it will be maxing out the filesystem, e.g.
    similar to doing a 'find / ...'.  The idea is to limit the run time
    (hence the -t) so your nightly cron job does a small chunk of the
    filesystem every night, resulting in a clean, well-ordered filesystem
    over a long period of time.  So, for example, spend 10 minutes a day
    doing housekeeping.  Filesystems are rarely operating at 100% 24x7 and
    there are other ways to spread out the overhead if it became necessary
    to do so.  Usually picking a chunk of time during off-hours is sufficient.

    The reblocking code is very efficient when it doesn't have much to do,
    meaning that it will very quickly skip over blocks that have already
    been reblocked.

    The pruning code is not quite as efficient: it must scan the B-Tree
    within the object range specified (typically the whole tree), but it
    will still be able to scan things very quickly until it hits B-Tree
    nodes that require pruning.

    This means that it is effectively incremental given a long enough
    time period, and could be made incremental for real by adding an option
    to the hammer prune utility to adjust the starting object id to pick
    up where it left off last time.
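
    The resume-where-it-left-off idea can be modelled in a few lines.
    This is a userland sketch under stated assumptions, not the pruning
    ioctl: the names prune_pass, prune_one and clock are invented for the
    example, object ids are plain integers, and the starting-id option it
    mimics does not exist in the utility yet.

```python
import time


def prune_pass(object_ids, start_id, budget_seconds, prune_one,
               clock=time.monotonic):
    """Process ids >= start_id in ascending order within a time budget.

    Returns the id to resume from on the next run, or None if the pass
    reached the end of the id range.
    """
    deadline = clock() + budget_seconds
    for obj_id in sorted(i for i in object_ids if i >= start_id):
        if clock() >= deadline:
            # Out of time: this id has not been processed, so the next
            # run should start here.
            return obj_id
        prune_one(obj_id)
    return None
```

    A nightly job would store the returned id and pass it back as
    start_id the following night, starting again from the lowest id once
    None comes back.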

					Matthew Dillon 
					<dillon at backplane.com>
