PFS in HAMMER1

Tomohiro Kusumi kusumi.tomohiro at gmail.com
Wed Jun 15 17:26:52 PDT 2016


There was a discussion about @@ in PFS paths on the IRC channel a few
days ago. This post is not directly related to it, but it explains what
that @@ really means. The notation is tricky, so most users have
probably had a hard time understanding it.

Note that a non-master (slave) PFS behaves a bit differently; this post
only covers master PFS.


---
First, create a filesystem and mount it.

[root@]~# newfs_hammer -L TEST /dev/da1 > /dev/null
[root@]~# mount_hammer /dev/da1 /HAMMER
[root@]~# cd /HAMMER


Create a PFS within that fs. Note that the location does matter when
more than one HAMMER fs is mounted, as in this case. To create a PFS in
this fs, the command needs to be run somewhere under /HAMMER.

[root@]/HAMMER# hammer pfs-master test1 > /dev/null
[root@]/HAMMER# ls -l
total 0
lrwxr-xr-x  1 root  wheel  10 Jun 15 21:15 test1 -> @@-1:00001


In this example, "test1" is just a regular symlink created by
/sbin/hammer. Not having the symlink does no harm to HAMMER. You can
rm test1 if you want, as long as you remember that "@@-1:00001", which
is PFS#1, exists. /sbin/hammer just creates the symlink for you so that
ls can show "@@-1:00001" via readlink(2).

[root@]/HAMMER# cd test1
[root@]/HAMMER/test1# touch aaa
[root@]/HAMMER/test1# cd ..
[root@]/HAMMER# rm test1
remove test1? y
[root@]/HAMMER# ls -l
total 0
[root@]/HAMMER# ls @@-1:00001
aaa
[root@]/HAMMER# ln -s @@-1:00001 test1
[root@]/HAMMER# ls test1
aaa


"test1" is actually stored as a symlink to "@@PFS00001", not
"@@-1:00001". HAMMER translates "@@PFS00001" into "@@-1:00001" in
readlink(2), so ls prints the target in "@@-1:00001" format instead of
"@@PFS00001". In the example below, test2 is created as a link to
"@@PFS00001", but ls shows "@@-1:00001" just as it does for test1.
"test1" created by /sbin/hammer and "test2" created by ln point to the
same PFS.

[root@]/HAMMER# ls -l
total 0
lrwxr-xr-x  1 root  wheel  10 Jun 15 21:16 test1 -> @@-1:00001
[root@]/HAMMER# ln -s @@PFS00001 test2
[root@]/HAMMER# ls -l
total 0
lrwxr-xr-x  1 root  wheel  10 Jun 15 21:16 test1 -> @@-1:00001
lrwxr-xr-x  1 root  wheel  10 Jun 15 21:18 test2 -> @@-1:00001
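
The rewrite that ls shows above can be modeled in a few lines of
Python. This is only a sketch of the observed behavior for a master
PFS, not the kernel code; the function name canonical_master_name is
made up for illustration, and the five-digit pattern is assumed from
the examples in this post.

```python
import re

def canonical_master_name(target):
    """Model of the rewrite seen via readlink(2) for a master PFS:
    '@@PFS00001' is reported as '@@-1:00001'. Hypothetical helper."""
    m = re.fullmatch(r"@@PFS(\d{5})", target)
    if m is None:
        return target  # anything else is returned unchanged
    return "@@-1:" + m.group(1)

print(canonical_master_name("@@PFS00001"))
print(canonical_master_name("@@-1:00001"))
```

Both calls print the same "@@-1:00001" string, which matches what ls
reports for test1 and test2.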


"@@-1:00001" is the canonical name for this PFS. It doesn't physically
exist as a directory entry; it represents the PFS, which is a logically
separate pseudo-filesystem space within HAMMER's B-Tree. Because
"@@-1:00001" doesn't exist as a directory entry, it never appears in ls
output. This is why /sbin/hammer creates a symlink: to make the PFS
visible.

HAMMER has only one large B-Tree per filesystem (not per PFS), so all
PFS live within that single B-Tree. PFS are separated by the
localization parameter, which is one of the B-Tree keys used for
lookups.

Each substring in "@@-1:00001" has a meaning:
1. "@@" means it's a PFS or a snapshot.
2. "-1" means it's a master.
3. ":" is just a separator.
4. "00001" means it's PFS#1, where PFS#0 is the default PFS created by
   newfs_hammer. There is no "00000" because that's what's mounted on
   /HAMMER. The PFS# is used as the localization parameter.
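
To make the breakdown concrete, here is a small Python sketch that
splits a master PFS name into those parts. The helper name
parse_master_pfs is hypothetical, and treating the five digits as a
decimal number is my assumption from the examples; slave PFS names are
deliberately not handled.

```python
import re

# "@@" prefix, "-1" for master, ":" separator, five-digit PFS number.
_MASTER_RE = re.compile(r"@@(-1):(\d{5})")

def parse_master_pfs(name):
    """Return the PFS number for a name like '@@-1:00001', else None.
    Hypothetical helper; the digits are assumed to be decimal."""
    m = _MASTER_RE.fullmatch(name)
    if m is None:
        return None
    return int(m.group(2))

print(parse_master_pfs("@@-1:00001"))  # 1
print(parse_master_pfs("test1"))       # None
```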

The localization parameter has the highest priority when inserting or
looking up B-Tree elements, so filesystem elements that belong to the
same PFS# tend to be localized (clustered) within the B-Tree, as shown
below.

Accessing "@@-1:00001" asks HAMMER to dive into PFS#1's root inode,
under which all elements of PFS#1 are clustered. Accessing
"@@-1:00002" does the same for PFS#2.

             HAMMER root-inode
             /HAMMER
             ////\\
            /  /   \
           /  ...   \
          /  ....    \
         PFS1        PFS2
        ////\\      ////\\
       //////\\    //////\\
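
The clustering effect can be imitated with plain tuple sorting in
Python. This is a toy model only: the three-field key (localization,
object id, name) and all the values are made up for illustration, and
the real B-Tree key has more to it than this.

```python
# Toy elements keyed by (localization, obj_id, name). Python compares
# tuples field by field, so the first field dominates the ordering,
# just as the localization parameter is described as doing.
elements = [
    (2, 0x101, "bbb"),
    (1, 0x100, "aaa"),
    (0, 0x001, "test1"),
    (1, 0x102, "ccc"),
    (2, 0x100, "ddd"),
]

ordered = sorted(elements)
for loc, obj_id, name in ordered:
    print(f"PFS#{loc} obj={obj_id:#x} {name}")
```

After sorting, all elements with the same first field sit next to each
other regardless of their object ids, which is the grouping the diagram
above is meant to show.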

The snapshot format "@@0x00..." works similarly to PFS, except that it
filters the B-Tree by transaction id (0x00...) instead of diving into
a clustered subtree.

             HAMMER root-inode
             /HAMMER
             ////\\
            //////\\
           // /////\\
          // //// //\\ <--filtered by TID
         /// //// ///\\
        ///// /// ////\\
       ///   //// /////\\
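
A rough way to picture the TID filter in Python follows. Note the
assumptions: I model each record with a creation TID and a deletion
TID (0 meaning "not deleted"), and the exact boundary rules are my
guess for the sake of the sketch, not something taken from the HAMMER
source. The record names and TIDs are invented.

```python
# Toy records: (name, create_tid, delete_tid); delete_tid 0 = still live.
records = [
    ("aaa", 0x10, 0),
    ("bbb", 0x20, 0x30),
    ("ccc", 0x40, 0),
]

def visible_as_of(records, tid):
    """Names a lookup would see when filtered by transaction id `tid`."""
    return [name for name, created, deleted in records
            if created <= tid and (deleted == 0 or tid < deleted)]

print(visible_as_of(records, 0x25))  # ['aaa', 'bbb']
print(visible_as_of(records, 0x50))  # ['aaa', 'ccc']
```

The whole tree is still walked from the same root; only the set of
records that pass the filter changes with the TID.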


Since "@@-1:00001" is not a directory entry of any directory, a weird
thing can happen, as shown below. cd to a/b/c/d/e/f/g/test1 fails, as
expected, because there is no such directory entry, but cd to
a/b/c/d/e/f/g/@@-1:00001 doesn't fail. This means you can use whatever
parent directories you want to access a PFS, because HAMMER's namei
doesn't care about the parents of "@@-1:00001", given that no directory
has such an entry. I think this is a design issue of HAMMER1.

[root@]/HAMMER# mkdir -p a/b/c/d/e/f/g
[root@]/HAMMER# cd a/b/c/d/e/f/g/test1
cd: no such file or directory: a/b/c/d/e/f/g/test1
[root@]/HAMMER# cd a/b/c/d/e/f/g/@@-1:00001
[root@]/HAMMER/a/b/c/d/e/f/g/@@-1:00001# pwd
/HAMMER/a/b/c/d/e/f/g/@@-1:00001
[root@]/HAMMER/a/b/c/d/e/f/g/@@-1:00001# ls
aaa
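
The behavior above can be mimicked with a toy path resolver in Python.
This only models what the transcript shows; it is not how HAMMER's
namei is written, and the dictionaries and the lookup function are
invented for the example.

```python
# Ordinary directory entries, keyed by the directory they live in.
dirs = {
    "/": {"a": "/a"},
    "/a": {"b": "/a/b"},
    "/a/b": {},
}
# PFS roots are not entries of any directory.
pfs_roots = {"@@-1:00001": "PFS#1 root"}

def lookup(path):
    """Resolve `path` component by component; None means ENOENT."""
    node = "/"
    for comp in filter(None, path.split("/")):
        if comp in pfs_roots:
            # The parent is never consulted: jump straight to the PFS.
            node = pfs_roots[comp]
            continue
        entries = dirs.get(node)
        if entries is None or comp not in entries:
            return None
        node = entries[comp]
    return node

print(lookup("/a/b/test1"))       # None
print(lookup("/a/b/@@-1:00001"))  # PFS#1 root
print(lookup("/@@-1:00001"))      # PFS#1 root
```

Because the "@@" component is resolved without looking at the current
directory, any existing parent path leads to the same PFS root.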


PFS created by DragonFly's installer use both symlinks and nullmounts.
Things under /pfs are symlinks to PFS created by /sbin/hammer. (Note
that PFS are no longer used by the installer, as mentioned in
http://lists.dragonflybsd.org/pipermail/users/2015-December/228472.html)

[root@]~# ls -l /pfs
total 0
lrwxr-xr-x  1 root  wheel  10 Aug 26  2015 home -> @@-1:00003
lrwxr-xr-x  1 root  wheel  10 Aug 26  2015 tmp -> @@-1:00002
lrwxr-xr-x  1 root  wheel  10 Aug 26  2015 usr.obj -> @@-1:00004
lrwxr-xr-x  1 root  wheel  10 Aug 26  2015 var -> @@-1:00001
lrwxr-xr-x  1 root  wheel  10 Aug 26  2015 var.crash -> @@-1:00005
lrwxr-xr-x  1 root  wheel  10 Aug 26  2015 var.tmp -> @@-1:00006
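
The same listing can be collected from a script. Here is a minimal
Python sketch using os.readlink; the function name list_pfs_links is
made up, and it simply assumes that any symlink whose target starts
with "@@" is of interest (which would also match snapshot links).

```python
import os

def list_pfs_links(directory):
    """Map symlink name -> target for links whose target starts with
    '@@'. Hypothetical helper; other entries are skipped."""
    found = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.islink(path):
            target = os.readlink(path)
            if target.startswith("@@"):
                found[name] = target
    return found

# Only meaningful on a system laid out like the one above.
if os.path.isdir("/pfs"):
    for name, target in list_pfs_links("/pfs").items():
        print(f"{name} -> {target}")
```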

These /pfs/xxx symlinks are nullmounted on /xxx,

[root@]~# cat /etc/fstab | grep "/pfs"
/pfs/var                /var            null    rw              0       0
/pfs/tmp                /tmp            null    rw              0       0
/pfs/home               /home           null    rw              0       0
/pfs/usr.obj    /usr/obj                null    rw              0       0
/pfs/var.crash  /var/crash              null    rw              0       0
/pfs/var.tmp    /var/tmp                null    rw              0       0

so it results in something like below.

[root@]~# mount | grep "/pfs"
/pfs/@@-1:00001 on /var (null, local)
/pfs/@@-1:00002 on /tmp (null, local)
/pfs/@@-1:00003 on /home (null, local)
/pfs/@@-1:00004 on /usr/obj (null, local)
/pfs/@@-1:00005 on /var/crash (null, local)
/pfs/@@-1:00006 on /var/tmp (null, local)

Though the symlinks are made by /sbin/hammer, symlinks and nullmounts
have nothing to do with the HAMMER fs itself. A PFS is always
accessible via the "@@-1:00001" format because that is its real name in
terms of a filesystem path, while the others are just aliases or
loopback mounts which may or may not exist depending on the
configuration.


