hammer prune explanation
Matthew Dillon
dillon at apollo.backplane.com
Sat May 10 14:03:11 PDT 2008
:Yeah, I was thinking about wildcarding as well.
:
:But is it possible to implement it within cmd_prune.c, or do I have to
:modify the ioctl kernel code? If done in cmd_prune.c, I somehow have to
:iterate over all deleted files and call the prune command for it.
:
:I thought, it's easier to introduce a check in the kernel, whether the
:file that should be pruned matches a given pattern. Doesn't sound very
:hard to do, if it is easy to get the pathname for a given inode.
:
:Are you thinking about something like the archive flag?
I think it is probably best to implement that level of sophistication
in the utility rather than in the kernel. The pruning ioctl code
has no concept of files or directories... literally it has no concept.
All it understands, really, are object ids (aka inode numbers) and
records.

The hammer utility, on the other hand, can actually scan the filesystem
hierarchy.

Locating wholly deleted files and directories is not hard to do.
As-of queries can be used to access earlier versions of a directory.
We might want to add some kernel support to make it more efficient,
for example to make it possible for the hammer utility to have
visibility into all deleted directory entries. It could use that
visibility to do as-of accesses and through that mechanism would thus
have visibility into all deleted files and directories.

Inode numbers are never reused, so the inode number (and hence object
id) of a deleted file will be just as unique as the inode number
for one that is still visible.
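To illustrate keeping the wildcard logic in userland, here is a minimal sketch of the filtering step only. It assumes the utility has already produced a list of deleted path names, one per line; how that list is obtained (as-of scans, extra kernel support) is outside the sketch, and the function name filter_by_pattern is hypothetical.

```sh
#!/bin/sh
# Sketch only: userland wildcard filtering of candidate path names.
# Assumption: deleted path names arrive on stdin, one per line.
# filter_by_pattern is a made-up helper, not part of the hammer utility.

# Usage: filter_by_pattern '<glob>'
# Prints every input path whose final component matches the glob.
filter_by_pattern() {
    pattern=$1
    while IFS= read -r path; do
        name=${path##*/}          # strip the directory part
        case $name in
            $pattern) printf '%s\n' "$path" ;;   # unquoted: treated as a glob
        esac
    done
}
```

The paths that survive the filter would then be resolved to object ids by the utility and handed to the pruning ioctl, which never needs to see a path name.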
: > Right now any serious HAMMER user needs to set up at least a daily
: > cron job to prune and reblock the filesystem. I added a '-t timeout'
: > feature to the HAMMER utility to allow the operations to be
: > set up in a cron job and keep the filesystem up to snuff over a long
: > period of time. So, e.g. you would have a nightly cron job that
: > did this:
: >
: > # spend up to 5 minutes pruning the filesystem and another
: > # 5 minutes reblocking it, then stop.
: > hammer -t 300 prune /myfilesystem; hammer -t 300 reblock /myfilesystem
:
:Does this degrade filesystem seriously?
:
:Regards,
:
: Michael
For the time it is running it will be maxing out the filesystem, e.g.
similar to doing a 'find / ...'. The idea is to limit the run time
(hence the -t) so your nightly cron job does a small chunk of the
filesystem every night, resulting in a clean, well-ordered filesystem
over a long period of time. So, for example, spend 10 minutes a day
doing housekeeping. Filesystems are rarely operating at 100% 24x7 and
there are other ways to spread out the overhead if it becomes necessary
to do so. Usually picking a chunk of time during off-hours is sufficient.
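As a rough sketch, the nightly job could be wrapped in a small script like the one below. It uses only the invocation quoted earlier (hammer -t <seconds> prune|reblock <filesystem>). The function name, the 300 second default and the HAMMER override variable are illustrative assumptions, the last one there only so the script can be dry-run.

```sh
#!/bin/sh
# Sketch of a time-limited housekeeping job.
# Assumptions: hammer_housekeeping and the HAMMER variable are made up for
# this example; only "-t", "prune" and "reblock" come from the message.

HAMMER=${HAMMER:-hammer}

# Usage: hammer_housekeeping <filesystem> [seconds-per-phase]
hammer_housekeeping() {
    fs=$1
    limit=${2:-300}
    # Prune first, then reblock; each phase stops after $limit seconds.
    # As in the quoted one-liner, reblock runs even if prune fails.
    $HAMMER -t "$limit" prune "$fs"
    $HAMMER -t "$limit" reblock "$fs"
}
```

Calling hammer_housekeeping /myfilesystem 300 from an off-hours cron entry gives the 10 minutes a day of housekeeping described above.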
The reblocking code is very efficient when it doesn't have much to do,
meaning that it will very quickly skip over blocks that have already
been reblocked.

The pruning code is not quite as efficient: it must scan the B-Tree
within the object range specified (typically the whole tree), but it
will still be able to scan things very quickly until it hits B-Tree
nodes that require pruning.

This means that it is effectively incremental given a long enough
time period, and could be made incremental for real by adding an option
to the hammer prune utility to adjust the starting object id to pick
up where it left off last time.
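The resume idea could look something like the sketch below. It is bookkeeping only: the starting-object-id option does not exist yet, so nothing here calls hammer, and the state file location and both function names are hypothetical.

```sh
#!/bin/sh
# Sketch only: remember where the last prune run stopped.
# Assumptions: STATE_FILE, load_start_objid and save_objid are made up for
# this example; the hammer utility has no starting-object-id option yet.

STATE_FILE=${STATE_FILE:-/var/db/hammer_prune.objid}

# Print the object id to start from: the saved value, or 0 if none is saved.
load_start_objid() {
    if [ -r "$STATE_FILE" ]; then
        cat "$STATE_FILE"
    else
        echo 0
    fi
}

# Record the object id at which this run stopped.
save_objid() {
    # Write to a temporary file, then rename, so an interrupted run
    # cannot leave a half-written value behind.
    printf '%s\n' "$1" > "$STATE_FILE.tmp" && mv "$STATE_FILE.tmp" "$STATE_FILE"
}
```

A cron job would read the saved id, pass it to the (future) prune option, and store the id the run reached when its time limit expired, wrapping back to 0 once the end of the tree is hit.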
-Matt
Matthew Dillon
<dillon at backplane.com>