hammer mirroring question
Johannes Hofmann
johannes.hofmann at gmx.de
Fri Nov 7 01:35:34 PST 2008
Matthew Dillon <dillon at apollo.backplane.com> wrote:
>
> :Hi,
> :
> :I'm playing with the hammer mirroring feature and noticed
> :that streams generated by
> :hammer mirror-read filesystem <begin-tid>
> :don't always start with <begin-tid>.
> :
> :E.g:
> :
> :root at blob:/home/hofmann >hammer mirror-read /hammer/ 0x000000010382d1a6 | hammer mirror-dump
> :...
> :Record obj=0000000100000054 key=0000000000000000 rt=01 ot=02
> : tids 00000001061da24b:0000000000000000 data=128
> :Record obj=0000000100000055 key=0000000000000000 rt=01 ot=01
> : tids 00000001061da25d:0000000000000000 data=128
> :..
> :
> :But 0x000000010382d1a6 is a valid existing tid and:
> :
> :root at blob:/home/hofmann >hammer mirror-read /hammer/ 0x000000010382d1a6 | hammer mirror-dump | grep 000000010382d1a6
> :Mirror-read: Mirror from 000000010382d1a6 to 0000000f958af3c0
> : tids 000000010382d1a6:0000000000000000 data=128
> :
> :Is this intended?
> :
> : Johannes
>
> The mirroring dump should include all records with a creation or
> deletion TID >= the specified TID.
>
> BUT, it may ALSO include records with lower TIDs. The reason is that
> the code needs to supply the B-Tree infrastructure leading up to the
> desired records as well as the desired records themselves. It is a
> side effect of the search. Providing the infrastructure helps the
> mirroring target do the merge (including any needed deletions) optimally.
>
> The search is still optimal, or close to it. You should not get too
> many extra records (from a bulk transfer point of view).
>
> The mirroring record types ('Skip', 'Pass', and 'Record') are used
> to distinguish infrastructure records from bulk data records.
>
> * Skip records indicate that part of the B-Tree infrastructure is being
> skipped and only contain the key range being skipped.
>
> * Pass records are records which the originator believes the target
> should already have. The record header is included but not any data
> references.
>
> * Record records are records (triple play there :-)) that the originator
> believes the target might not have. These records contain everything:
> key, record header, and any associated bulk data.
>
> The mirroring target uses these records to optimally scan the target
> B-Tree in the target HAMMER filesystem and to properly perform the
> merge.
>
> Because transaction ids are not really in any sort of sorted order,
> except for create_tid as a sub-sort, we can end up dumping records,
> particularly 'Pass' records, with unrelated transaction ids in order
> to include a 'Record' record with a related transaction id, so the
> mirroring target knows how to properly merge the stream into the
> target. That is, the mirroring target needs to know whether it must
> delete physical records on the target when performing a merge, and it
> can't know that unless it is given all the records in the B-Tree leaf,
> even if some are outside the requested transaction id range.
Thanks a lot for the explanation!
Johannes
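
For readers following the thread, the behaviour described above can be sketched as a toy model. This is not HAMMER's code: the `Rec` type, the `in_range` and `mirror_stream` functions, and the idea of representing the B-Tree as a plain list of leaves are all assumptions made for illustration. The sketch only shows why a stream requested from a given begin TID can still carry entries with lower TIDs: a leaf with no matching record collapses into a 'Skip', while a leaf with at least one match is emitted whole, with the non-matching records downgraded to 'Pass'.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Rec:
    """Hypothetical stand-in for a B-Tree leaf element (not HAMMER's layout)."""
    obj: int
    create_tid: int
    delete_tid: int  # 0 means "not deleted", as in the mirror-dump output


def in_range(rec: Rec, begin_tid: int) -> bool:
    # Selection rule from the thread: creation or deletion TID >= begin TID.
    return rec.create_tid >= begin_tid or rec.delete_tid >= begin_tid


def mirror_stream(leaves: List[List[Rec]], begin_tid: int) -> List[Tuple]:
    """Toy model of a mirroring stream over a list of B-Tree leaves."""
    out = []
    for leaf in leaves:
        if not leaf:
            continue
        if not any(in_range(r, begin_tid) for r in leaf):
            # Nothing of interest here: send only the key range being skipped.
            out.append(("Skip", leaf[0].obj, leaf[-1].obj))
            continue
        for r in leaf:
            # The whole leaf is sent so the target can decide on deletions;
            # records outside the TID range go out as header-only 'Pass'.
            kind = "Record" if in_range(r, begin_tid) else "Pass"
            out.append((kind, r.obj, r.create_tid))
    return out


leaves = [
    [Rec(1, 5, 0), Rec(2, 6, 0)],
    [Rec(3, 4, 0), Rec(4, 20, 0), Rec(5, 3, 12)],
]
stream = mirror_stream(leaves, begin_tid=10)
for entry in stream:
    print(entry)
```

In this made-up example the first leaf becomes a single 'Skip', and object 3 (create_tid 4, below the requested 10) still appears in the stream as a 'Pass' because it shares a leaf with objects 4 and 5. The output also follows B-Tree key order rather than TID order, which is why the first record in a dump need not carry the begin TID.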