Some questions on HAMMER internals

Tomohiro Kusumi kusumi.tomohiro at gmail.com
Sat Apr 25 09:19:12 PDT 2015


2015-03-16 2:57 GMT+09:00 Vasily Postnicov <shamaz.mazum at gmail.com>:

> Hello.
>
> I have read that data corresponding to any file in HAMMER fs is stored in
> 8-megabyte blocks (I believe, they are called big-blocks). Also, all
> inodes, directory entries, etc. are indexed in a global per-FS B+tree. If
> one, for example, changes a file, a new element in the tree is created,
> starting with new "create transaction id" and an old element's delete
> transaction id is updated, so, in this way, the history is maintained.
>
> Suppose then that I read from a file with path /hammer_mountpoint/a or
> from an older version of the same file, /hammer_mountpoint/@@0x<tid>/a. How
> can the corresponding data in the big-blocks be found? If a search in the
> B-tree is performed, then what key is used?
>
>
how: you search the per-fs (not per-pfs) btree using the keys in
question, including your tid.
what: see hammer_btree_cmp() in sys/vfs/hammer/hammer_btree.c
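
as a rough illustration of what "search with keys including your tid" means,
here is a minimal sketch. it is not the kernel code: the struct, the field
order, and the helper names (sketch_key, sketch_key_cmp, sketch_visible_asof)
are assumptions made for this example, so check hammer_btree_cmp() for the
real comparison. the as-of test just encodes the create/delete tid scheme
described in the question.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the key part of a B-tree element.
 * The real structure and field order live in the HAMMER sources.
 */
struct sketch_key {
	uint32_t localization;	/* pfs id in the upper 16 bits */
	int64_t  obj_id;	/* object (inode) number */
	uint16_t rec_type;	/* record type */
	int64_t  key;		/* record-specific key */
	uint64_t create_tid;	/* tid that created this element */
	uint64_t delete_tid;	/* assumed 0 while the element is live */
};

/* Three-way compare: -1, 0 or 1. */
#define SKETCH_CMP(a, b) (((a) > (b)) - ((a) < (b)))

/* Compare fields in a fixed (assumed) order; the first difference decides. */
static int
sketch_key_cmp(const struct sketch_key *a, const struct sketch_key *b)
{
	int r;

	if ((r = SKETCH_CMP(a->localization, b->localization)) != 0)
		return r;
	if ((r = SKETCH_CMP(a->obj_id, b->obj_id)) != 0)
		return r;
	if ((r = SKETCH_CMP(a->rec_type, b->rec_type)) != 0)
		return r;
	if ((r = SKETCH_CMP(a->key, b->key)) != 0)
		return r;
	return SKETCH_CMP(a->create_tid, b->create_tid);
}

/*
 * An element is visible "as of" tid if it was created at or before tid
 * and had not yet been deleted at tid.
 */
static int
sketch_visible_asof(const struct sketch_key *e, uint64_t tid)
{
	return e->create_tid <= tid &&
	    (e->delete_tid == 0 || tid < e->delete_tid);
}
```

with this sketch, a lookup through @@0x<tid> would walk elements that match
on everything except the tids and keep the one for which
sketch_visible_asof(elm, tid) is true.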

whatever you do to the fs (e.g. read(2), write(2), etc.) will eventually
result in a query to the ondisk btree (or the inmemory rbtree if possible),
and whatever is stored ondisk lives in storage space obtained from the
big-block allocator.



> Also, each element in the tree has a 4-byte "localization" field. The
> first two bytes are a PFS id. What are the last two? What are the "rt",
> "ot", "key" and "dataof" fields shown by the "hammer show" command? Is it
> correct that PFSes have their own obj and inode space, so if I mirror one
> PFS to another, the B-tree will have elements with the same obj fields,
> but with different localization?
>


the upper 16 bits (localization >> 16) are the pfs id, whether the
'localization' comes from an ondisk node or from the inmemory inode member.
the lower 16 bits of the localization are a type field: either inode or not
inode.

the upper 16 bits of the localization are set at pfs initialization, so if
two pfs (e.g. master and slave) have different pfs ids, then they have
different localization values whether you mirror-copy or do something
else.
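
the 16/16 split can be sketched in a few lines of C. the helper names below
are hypothetical and are not the macros the kernel uses; only the bit layout
(pfs id in the upper half, type field in the lower half) comes from the
description above.

```c
#include <stdint.h>

/* Upper 16 bits: pfs id. */
static uint16_t
sketch_pfs_id(uint32_t localization)
{
	return (uint16_t)(localization >> 16);
}

/* Lower 16 bits: type field (inode or not inode). */
static uint16_t
sketch_type_bits(uint32_t localization)
{
	return (uint16_t)(localization & 0xFFFFu);
}

/* Combine a pfs id and type bits back into a 32-bit localization. */
static uint32_t
sketch_make_localization(uint16_t pfs_id, uint16_t type_bits)
{
	return ((uint32_t)pfs_id << 16) | type_bits;
}
```

for the lo=00030000 printed in the kernel message quoted below,
sketch_pfs_id() gives 3 and sketch_type_bits() gives 0. two elements with the
same type bits but pfs ids 1 and 2 end up with different localization values,
which is why a mirrored copy never collides with its source in the btree.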



>
> Can anybody again explain what "fake transaction id" means? I read man 1
> undo, but still cannot get it.
>
> And the last question (it's the reason why I try to understand HAMMER a
> bit more): I cannot access a file in a snapshot generated by "hammer snapq"
> command. "undo -ai" shows many fake transaction ids and kernel prints a
> message:
>
> HAMMER: WARNING: Missing inode for dirent "midori"
>         obj_id = 0000000272ed2679, asof =0000000280c49ec0, lo=00030000
>
> It can happen for an arbitrary tid, but how can it be for a snapshot tid
> (in my case, 0x0000000280c49ec0)? Current versions of all files seem to
> be OK. Should I send a bug report?
>
>       With regards, Vasily.
>