Initial filesystem design synopsis.

Matthew Dillon dillon at apollo.backplane.com
Mon Feb 26 14:51:55 PST 2007


:I have always wondered about the endian neutral thing.
:
:It seems somewhat pointless to me. In a setup where you are swapping
:file systems between big and little endian boxes conversion will go on,
:you're not saving anything over the long run. The cost of conversion
:should be trivial in this day and age, particularly when weighed against
:the cost of I/O.
:
:In fact it seems to me as though predicating the conversion process
:on a runtime variable would actually cost *more* than making the
:conversion decision at compile time. As a pernickety point there
:would be a little more code to go wrong and more test cases to chase.
:
:This lappy (Intel U1400) can endian swap 512k 32bit words in 10ms,
:compiled with -O0 and profiled with gprof, and sub 10ms with -O2
:(really should check the .asm but I don't have the time right now).
:
:Please correct me if I am wrong ! :)
:
:Regards,
:Rupert

    The run-time variable simply identifies which endianness the
    filesystem is using, allowing the filesystem image (or pieces of it)
    to be transported between boxes with different endianness without them
    blowing up.   Machine architectures change all the time, but storage
    is forever (or at least for longer)... It needs to 'just work'.

    Most clusters will be made up of boxes using the same endianness, and
    the filesystem will be formatted for that self same endianness.  They
    will look at the variable and say 'ho hum, that's already the format
    I use natively so I don't have to do any translation at all'.
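
    A minimal sketch of that check, in C.  The post does not say how the
    run-time variable is stored, so this assumes one common approach: a
    magic number written in the formatting host's native byte order.  The
    names fs_header, FS_MAGIC, fs_needs_swap() and fs_read32() are
    hypothetical and are not taken from any actual filesystem code.

```c
#include <stdint.h>

/* Hypothetical magic value; a real filesystem would define its own. */
#define FS_MAGIC      UINT32_C(0x46534D47)
#define FS_MAGIC_REV  UINT32_C(0x474D5346)   /* FS_MAGIC with bytes reversed */

/* Hypothetical on-disk header, written in the formatter's native order. */
struct fs_header {
    uint32_t magic;
    uint32_t block_count;
};

/* Reverse the byte order of a 32-bit word. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/*
 * Decide once, at mount time, whether translation is needed.
 * Returns 0 if the image is in native order, 1 if it is byte-swapped,
 * and -1 if the magic is not recognized at all.
 */
int fs_needs_swap(const struct fs_header *h)
{
    if (h->magic == FS_MAGIC)
        return 0;
    if (h->magic == FS_MAGIC_REV)
        return 1;
    return -1;
}

/* Read a 32-bit on-disk field; a no-op copy when the image is native. */
uint32_t fs_read32(uint32_t raw, int swapped)
{
    return swapped ? swap32(raw) : raw;
}
```

    On a cluster of same-endian boxes fs_needs_swap() returns 0 and every
    fs_read32() call takes the no-translation branch; only a foreign
    image pays for the swap.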

    It is particularly important in any sort of clustering system or
    protocol (or protocol used by a clustering system) to allow boxes with
    different endianness to talk to each other, even if it means taking a
    small performance hit in that communication.  Anyone remember the
    'talk' bug from 20+ years ago?  To this day I do not think Sun ever
    fixed the byte ordering for 'talk' that made their talk incompatible
    with everyone else's.  We are not going to repeat that mistake.

    The alternative is to set a specific endianness for the filesystem in
    stone, just like the 'network byte order' concept locked in a
    particular byte ordering for packets.  People today STILL hate the
    fact that they have to translate certain protocols to network byte
    order even when the machines on both ends use the same byte order,
    but one that happens to not be 'network' byte order.
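
    For comparison, a sketch of what a fixed byte order looks like in C,
    assuming a big-endian ('network' order) wire format.  The helper names
    put_be32() and get_be32() are hypothetical.  Note that the conversion
    runs unconditionally: two little-endian machines talking to each other
    each pay for it, even though their native formats already match.

```c
#include <stdint.h>

/* Store v into buf in big-endian order, whatever the host order is. */
void put_be32(uint8_t buf[4], uint32_t v)
{
    buf[0] = (uint8_t)(v >> 24);
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)v;
}

/* Load a big-endian 32-bit value from buf into host order. */
uint32_t get_be32(const uint8_t buf[4])
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```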

    We're not making that same mistake, hence the filesystem will be specced
    for endian neutrality even if the code isn't written to support it right
    off the bat.

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>




