hammer mirroring question

Fri Nov 7 01:35:34 PST 2008

Matthew Dillon <dillon at apollo.backplane.com> wrote:
> 
> :Hi,
> :
> :I'm playing with the hammer mirroring feature and noticed
> :that streams generated by
> :hammer mirror-read filesystem <begin-tid>
> :don't always start with <begin-tid>. 
> :
> :E.g:
> :
> :root at blob:/home/hofmann >hammer mirror-read /hammer/ 0x000000010382d1a6 | hammer mirror-dump 
> :...
> :Record obj=0000000100000054 key=0000000000000000 rt=01 ot=02
> :       tids 00000001061da24b:0000000000000000 data=128
> :Record obj=0000000100000055 key=0000000000000000 rt=01 ot=01
> :       tids 00000001061da25d:0000000000000000 data=128
> :..
> :
> :But 0x000000010382d1a6 is a valid existing tid and:
> :
> :root at blob:/home/hofmann >hammer mirror-read /hammer/ 0x000000010382d1a6 | hammer mirror-dump | grep 000000010382d1a6
> :Mirror-read: Mirror from 000000010382d1a6 to 0000000f958af3c0
> :       tids 000000010382d1a6:0000000000000000 data=128
> :
> :Is this intended?
> :
> :  Johannes
> 
>    The mirroring dump should include all records with a creation or
>    deletion TID >= the specified TID.
> 
>    BUT, it may ALSO include records with lower TIDs.  The reason is that
>    the code needs to supply the B-Tree infrastructure leading up to the
>    desired records as well as provide the desired records.  It is a side
>    effect of the search.  Providing the infrastructure helps the mirroring
>    target do the merge (including any needed deletions) optimally.
> 
>    The search is still optimal, or close to it.  You should not get too
>    many extra records (from a bulk transfer point of view).
> 
>    The various mirroring record types:  'Skip', 'Pass', and 'Record',
>    are used to discern the difference between infrastructure and bulk
>    data records. 
> 
>    * Skip records indicate that part of the B-Tree infrastructure is being
>      skipped and only contain the key range being skipped. 
> 
>    * Pass records are records which the originator believes the target
>      should already have.  The record header is included but not any data
>      references.
> 
>    * Record records are records (triple play there :-)) that the originator
>      believes the target might not have.  These records contain everything:
>      key, record header, and any associated bulk data.
> 
>    The mirroring target uses these records to optimally scan the target
>    B-Tree in the target HAMMER filesystem and to properly perform the
>    merge.
> 
>    Because transaction ids are not really in any sort of sorted order,
>    except for create_tid as a sub-sort, we can end up dumping records,
>    particularly 'Pass' records, with unrelated transaction ids in order
>    to include a 'Record' record with a related transaction id, so the
>    mirroring target knows how to properly merge the stream into the
>    target.  i.e. the mirroring target needs to know whether it must delete
>    physical records on the target or not when performing a merge, and it
>    can't know that unless it is given all the records in the B-Tree leaf,
>    even if some are outside the requested transaction id range.
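
For anyone who only wants to look at the records that fall inside the requested range, the extra infrastructure records can be filtered out of the mirror-dump text afterwards. Below is a minimal sketch in Python, not part of hammer itself. It assumes that every record is printed as a header line followed by a "tids" line, as in the output quoted above, that the two values on the "tids" line are create_tid:delete_tid, and that 'Pass' records are printed in the same layout as 'Record' records. The function name records_at_or_after and the 'Pass' line in SAMPLE are hypothetical, made up for illustration.

```python
import re

# Header line as shown in the quoted output, e.g.
#   Record obj=0000000100000054 key=0000000000000000 rt=01 ot=02
# Assumption: other record types use the same layout with a different first word.
HEADER = re.compile(r"^(?P<kind>\w+) obj=(?P<obj>[0-9a-f]{16}) key=(?P<key>[0-9a-f]{16})")

# Follow-up line, e.g.
#          tids 00000001061da24b:0000000000000000 data=128
# Assumption: the pair is create_tid:delete_tid.
TIDS = re.compile(r"^\s*tids (?P<create>[0-9a-f]{16}):(?P<delete>[0-9a-f]{16})")


def records_at_or_after(dump_text, begin_tid):
    """Return (kind, obj, create_tid, delete_tid) for every record whose
    creation or deletion TID is >= begin_tid."""
    selected = []
    header = None
    for line in dump_text.splitlines():
        head = HEADER.match(line)
        if head:
            header = head          # remember it until the tids line arrives
            continue
        tids = TIDS.match(line)
        if tids and header is not None:
            create_tid = int(tids.group("create"), 16)
            delete_tid = int(tids.group("delete"), 16)
            if create_tid >= begin_tid or delete_tid >= begin_tid:
                selected.append(
                    (header.group("kind"), header.group("obj"), create_tid, delete_tid)
                )
            header = None
    return selected


# First record is copied from the output above; the 'Pass' record is a
# hypothetical example of a record with a TID below the requested one.
SAMPLE = """\
Record obj=0000000100000054 key=0000000000000000 rt=01 ot=02
       tids 00000001061da24b:0000000000000000 data=128
Pass obj=0000000100000055 key=0000000000000000 rt=01 ot=01
       tids 0000000100000001:0000000000000000 data=128
"""

if __name__ == "__main__":
    for kind, obj, create_tid, delete_tid in records_at_or_after(SAMPLE, 0x000000010382D1A6):
        print(f"{kind} obj={obj} tids {create_tid:016x}:{delete_tid:016x}")
```

This is only useful for inspecting a stream. The mirroring target still needs the full stream, including the records this filter drops, to do the merge.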

Thanks a lot for the explanation!

  Johannes