Initial filesystem design synopsis.

Matthew Dillon dillon at apollo.backplane.com
Mon Feb 26 08:59:40 PST 2007


:What would the relationship between system memory size and segment size 
:be?  From your description, it sounds like you accumulate transactions 
:until a segment is "full", and then push the segment to disk.  So a 
:machine with limited memory resources would be forced to push segment 
:more often.
:
:        Max

    No relationship at all.  The segment can be much larger than system
    memory.  The limitation is that I want to use 32 bit addressing for
    intra-segment references and guarantee a limited number of data records
    (e.g. 4G / 64 bytes per record == 64MB worth of records per segment,
    maximum, which DOES fit in memory for recovery purposes).
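
    As a worked version of that division, here is a minimal sketch.  It
    assumes "4G" means the span reachable by a 32 bit byte offset and that
    every record occupies exactly 64 bytes; the constant names are made up
    for illustration.  The result is the maximum number of record slots.

```python
# Assumption: intra-segment references are 32 bit byte offsets.
ADDRESS_BITS = 32
# Assumption: every record is a fixed 64 bytes.
RECORD_SIZE = 64

addressable_bytes = 2 ** ADDRESS_BITS            # 4G
max_records = addressable_bytes // RECORD_SIZE   # 2**26 record slots

print(f"{max_records} record slots ({max_records // 2**20}M)")
```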

    Segment synchronization works by having a forward-running index
    and a backwards-running index in the segment header.  Data blocks are
    'allocated' forwards and records are allocated backwards, and the segment
    becomes full when the two indexes meet.
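
    The two-index scheme can be sketched as a toy in-memory model.  The
    class, method and field names are hypothetical, and the sketch ignores
    the space the header itself would take; it only shows data growing up,
    records growing down, and "full" being the point where they meet.

```python
class Segment:
    """Toy model of one segment; every name here is hypothetical."""

    def __init__(self, size, record_size=64):
        self.size = size
        self.record_size = record_size
        self.data_index = 0        # next free byte for data, grows forwards
        self.record_index = size   # lowest byte used by records, grows backwards

    def free_bytes(self):
        # The gap between the two indexes is all the space that is left.
        return self.record_index - self.data_index

    def alloc_data(self, nbytes):
        """Return the offset of a new data block, or None if it does not fit."""
        if nbytes > self.free_bytes():
            return None
        offset = self.data_index
        self.data_index += nbytes
        return offset

    def alloc_record(self):
        """Return the offset of a new record, or None if it does not fit."""
        if self.record_size > self.free_bytes():
            return None
        self.record_index -= self.record_size
        return self.record_index
```

    With a 256 byte segment, one record lands at offset 192 and data can
    then fill offsets 0 through 191, after which both allocators refuse.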

    This also means that data and records can be flushed to disk in any
    order, just as long as the update to the indexes in the segment
    *HEADER*  is ordered (occurs after all related data and records have
    completed their I/O).
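
    A userland sketch of that ordering rule, under stated assumptions: the
    header layout (two 32 bit indexes at offset 0), the function names, and
    the use of fsync() as the "I/O has completed" barrier are all mine, not
    part of the design.  It relies on os.pwrite/os.pread, so it is
    Unix-only.

```python
import os
import struct
import tempfile

# Hypothetical header layout: (data_index, record_index) at offset 0.
HEADER = struct.Struct("<II")


def commit(fd, writes, data_index, record_index):
    """Write payload in any order, then publish it via the header."""
    for offset, payload in writes:     # order among these does not matter
        os.pwrite(fd, payload, offset)
    os.fsync(fd)                       # barrier: data and records are on disk
    os.pwrite(fd, HEADER.pack(data_index, record_index), 0)
    os.fsync(fd)                       # the header update goes out last


def read_header(fd):
    """Return (data_index, record_index) as stored in the header."""
    return HEADER.unpack(os.pread(fd, HEADER.size, 0))
```

    If a crash happens before the header write, the old indexes still
    describe a consistent segment and the new payload is simply not
    visible.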

    That's the basic idea.  It's very important to reduce ordering
    constraints for disk I/O.

    Space reservation within a segment is also important in a cluster FS.
    Even though individual replication targets have their own local copy
    of the filesystem, in a multi-master environment the replication
    targets must be able to master operations and there will be cases where
    multiple replication targets will master operations within the same 
    segment.  The ordering of parallel transactions is irrelevant (there
    being no dependency) and so they can be stored in different orders on
    different targets and then replicated across.  BUT the space must still
    be reserved so parallel transactions occurring in the same segment do
    not run the segment out of space in weird unrecoverable ways.
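
    The reservation idea can be sketched as simple accounting.  This
    models only the bookkeeping for one segment; how reservations would
    travel between replication targets is not described above, and the
    class and method names are hypothetical.

```python
class ReservingSegment:
    """Toy space accounting for one segment; all names are hypothetical."""

    def __init__(self, size):
        self.size = size
        self.used = 0       # bytes consumed by committed transactions
        self.reserved = 0   # bytes promised to in-flight transactions

    def reserve(self, nbytes):
        """Promise space to a transaction before it starts, or refuse."""
        if self.used + self.reserved + nbytes > self.size:
            return False
        self.reserved += nbytes
        return True

    def commit(self, nbytes):
        """Turn a previous reservation into used space."""
        if nbytes > self.reserved:
            raise ValueError("commit exceeds outstanding reservations")
        self.reserved -= nbytes
        self.used += nbytes
```

    Two transactions that reserved 60 and 40 bytes of a 100 byte segment
    can commit in either order and a third is refused up front, rather
    than the segment overflowing halfway through a transaction.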

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>




