Resolving data_offset and data_len

Thu Jun 4 10:37:00 PDT 2009

Hello,

> cursor.leaf->base.key is the logical *end* (in bytes) of the extent
> cursor.leaf->data_len is the length of the extent in bytes
> cursor.leaf->data_offset is the HAMMER internal offset (see below)

I am not sure I understand extents in HAMMER. Let me try:

Suppose we had a file of 24K. This file would be broken up into two
parts of 16K and 8K respectively. In HAMMER, these parts are called
``data records''. Contiguous data records are called ``extents''
(contiguous relative to what? to the underlying block device?).

Each data record is stored as a leaf node in the B-tree. To obtain all
leaf nodes belonging to a file, a ``cursor'' is initialized with the
following values:

  cursor.key_beg.obj_id     <-- the i-number
  cursor.key_beg.key        <-- is that a byte offset? relative to what?
  cursor.asof               <-- what's that for? history/version?
  cursor.key_beg.rec_type = HAMMER_RECTYPE_DATA <-- we want a file

hammer_ip_first() and hammer_ip_next() are used to iterate through all
leaf nodes.

Is that right? (reference:
http://www.dragonflybsd.org/presentations/nycbsdcon08/img9.png)

1. Does cursor.leaf->base.key indicate the end of the current data
record or the *entire file*?

2. For a given file of 12 bytes, cursor.leaf->data_len gives me 16. What
am I missing?

> It is a HAMMER_ZONE_LARGE_DATA offset.  Disk offsets in HAMMER are all
> 64bit, but do not directly refer to byte offsets on the disk.  Instead
> the offsets have multiple layers (or at least one) of indirections.  See
> hammer_disk.h:120 for reference.  Basically the right thing to do is use
> hammer_io_direct_read().

Thanks, hammer_io_direct_read() is what I was looking for. Now I can
calculate the right offset to locate the file on disk. But I'm still
struggling with the file length.

Daniel