hammer mirroring question

Matthew Dillon dillon at apollo.backplane.com
Wed Nov 5 14:11:21 PST 2008


:Hi,
:
:I'm playing with the hammer mirroring feature and noticed
:that streams generated by
:hammer mirror-read filesystem <begin-tid>
:don't always start with <begin-tid>. 
:
:E.g:
:
:root@blob:/home/hofmann >hammer mirror-read /hammer/ 0x000000010382d1a6 | hammer mirror-dump
:...
:Record obj=0000000100000054 key=0000000000000000 rt=01 ot=02
:       tids 00000001061da24b:0000000000000000 data=128
:Record obj=0000000100000055 key=0000000000000000 rt=01 ot=01
:       tids 00000001061da25d:0000000000000000 data=128
:..
:
:But 0x000000010382d1a6 is a valid existing tid and:
:
:root@blob:/home/hofmann >hammer mirror-read /hammer/ 0x000000010382d1a6 | hammer mirror-dump | grep 000000010382d1a6
:Mirror-read: Mirror from 000000010382d1a6 to 0000000f958af3c0
:       tids 000000010382d1a6:0000000000000000 data=128
:
:Is this intended?
:
:  Johannes

    The mirroring dump should include all records with a creation or
    deletion TID >= the specified TID.
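
    As a rough illustration of that rule, here is a Python sketch.  It is
    not the actual HAMMER code: the function and parameter names are made
    up, and reading the dump's 'tids X:Y' field as create_tid:delete_tid
    (with 0 meaning "not deleted") is an assumption.

```python
def tid_selected(create_tid: int, delete_tid: int, begin_tid: int) -> bool:
    """Return True if a record was created or deleted at or after begin_tid.

    Hypothetical helper for illustration only. A delete_tid of 0 is assumed
    to mean "not deleted", so it never satisfies the test on its own unless
    begin_tid is also 0.
    """
    return create_tid >= begin_tid or delete_tid >= begin_tid


BEGIN_TID = 0x000000010382D1A6  # the <begin-tid> from the question

# A record from the quoted dump: created after BEGIN_TID, never deleted.
wanted = tid_selected(0x00000001061DA24B, 0, BEGIN_TID)

# An older record that was deleted after BEGIN_TID also qualifies.
deleted_later = tid_selected(0x0000000100000000, 0x000000010382D1A7, BEGIN_TID)

# An older, untouched record does not qualify by this rule alone.
untouched = tid_selected(0x0000000100000000, 0, BEGIN_TID)
```

    Records for which this test is false can still show up in the stream.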

    BUT, it may ALSO include records with lower TIDs.  The reason is that
    the code needs to supply the B-Tree infrastructure leading up to the
    desired records as well as provide the desired records.  It is a side
    effect of the search.  Providing the infrastructure helps the mirroring
    target do the merge (including any needed deletions) optimally.

    The search is still optimal, or close to it.  You should not get too
    many extra records (from a bulk transfer point of view).

    The various mirroring record types:  'Skip', 'Pass', and 'Record',
    are used to discern the difference between infrastructure and bulk
    data records. 

    * Skip records indicate that part of the B-Tree infrastructure is being
      skipped and only contain the key range being skipped. 

    * Pass records are records which the originator believes the target
      should already have.  The record header is included but not any data
      references.

    * Record records are records (triple play there :-)) that the originator
      believes the target might not have.  These records contain everything:
      key, record header, and any associated bulk data.
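
    The three record types above can be modeled roughly as follows.  This
    is a Python sketch under stated assumptions, not the on-wire format:
    the class and field names are hypothetical, a "leaf" is simplified to
    a list of tuples, and the originator's belief about what the target
    has is assumed to rest only on the TID test.

```python
from dataclasses import dataclass


@dataclass
class Skip:
    """Part of the B-Tree is skipped; only the key range is sent."""
    key_beg: int
    key_end: int


@dataclass
class Pass:
    """Target should already have this record; header only, no data."""
    key: int
    create_tid: int
    delete_tid: int


@dataclass
class Record:
    """Target might not have this record; header plus bulk data."""
    key: int
    create_tid: int
    delete_tid: int
    data: bytes


def encode_leaf(leaf, begin_tid):
    """Turn one leaf into stream entries (hypothetical simplification).

    leaf is a key-sorted list of (key, create_tid, delete_tid, data).
    """
    if not leaf:
        return []

    def selected(create_tid, delete_tid):
        return create_tid >= begin_tid or delete_tid >= begin_tid

    # Nothing in this leaf is in range: skip the whole key range.
    if not any(selected(c, d) for _, c, d, _ in leaf):
        return [Skip(leaf[0][0], leaf[-1][0])]

    # Otherwise send every record in the leaf: full data for the ones in
    # range, header-only Pass entries for the rest.
    out = []
    for key, create_tid, delete_tid, data in leaf:
        if selected(create_tid, delete_tid):
            out.append(Record(key, create_tid, delete_tid, data))
        else:
            out.append(Pass(key, create_tid, delete_tid))
    return out
```

    For example, encode_leaf([(1, 0x10, 0, b"a"), (2, 0x30, 0, b"b")], 0x20)
    yields a Pass for key 1 and a Record for key 2, which is how a lower
    TID ends up in the stream.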

    The mirroring target uses these records to optimally scan the target
    B-Tree in the target HAMMER filesystem and to properly perform the
    merge.

    Because transaction ids are not really in any sort of sorted order,
    except for create_tid as a sub-sort, we can end up dumping records,
    particularly 'Pass' records, with unrelated transaction ids in order
    to include a 'Record' record with a related transaction id, so the
    mirroring target knows how to properly merge the stream into the
    target.  i.e. the mirroring target needs to know whether it must delete
    physical records on the target or not when performing a merge, and it
    can't know that unless it is given all the records in the B-Tree leaf,
    even if some are outside the requested transaction id range.
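
    To make the deletion point concrete, here is a toy merge for a single
    leaf in Python.  It is only a sketch: the tuple-based stream entries,
    the string keys, and the merge_leaf name are all hypothetical, and the
    real target works on B-Tree elements rather than a dictionary.

```python
def merge_leaf(target, stream):
    """Merge one leaf's worth of stream entries into the target's copy.

    target: dict mapping key -> data for this leaf's key range.
    stream: list of ("pass", key) or ("record", key, data) entries that
            together name every record the source has in this leaf.
    Returns a new dict; the input is left unmodified.
    """
    merged = {}
    for entry in stream:
        kind, key = entry[0], entry[1]
        if kind == "record":
            # New or changed record: take the data from the stream.
            merged[key] = entry[2]
        elif kind == "pass" and key in target:
            # Target already has it: keep the existing copy.
            merged[key] = target[key]
    # Any key in target that the stream did not name is simply not
    # carried over, i.e. it is physically deleted on the target.
    return merged


# The source still has A (unchanged) and C (changed); B no longer exists
# on the source.
target = {"A": b"1", "B": b"2", "C": b"3"}
stream = [("pass", "A"), ("record", "C", b"3x")]
result = merge_leaf(target, stream)
```

    Here B is dropped only because the Pass entry for A shows that the
    stream covers the whole leaf.  Had the stream carried just the Record
    for C, the target could not tell whether A and B still exist on the
    source.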

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>




