Initial filesystem design synopsis.
dillon at apollo.backplane.com
Mon Feb 26 14:51:55 PST 2007
:I have always wondered about the endian neutral thing.
:It seems somewhat pointless to me. In a setup where you are swapping
:file systems between big and little endian boxes conversion will go on,
:you're not saving anything over the long run. The cost of conversion
:should be trivial in this day and age, particularly when weighed against
:the cost of I/O.
:In fact it seems to me as though predicating the conversion process
:on a runtime variable would actually cost *more* than making the
:conversion decision at compile time. As a pernickety point there
:would be a little more code to go wrong and more test cases to chase.
:This lappy (Intel U1400) can endian swap 512k 32bit words in 10ms,
:compiled with -O0 and profiled with gprof, and sub 10ms with -O2
:(really should check the .asm but I don't have the time right now).
:Please correct me if I am wrong ! :)
The run-time variable simply identifies which endianness the
filesystem is using, allowing the filesystem image (or pieces of it)
to be transported between boxes with different endianness without them
blowing up. Machine architectures change all the time, but storage
is forever (or at least for longer)... It needs to 'just work'.
Most clusters will be made up of boxes using the same endianness, and
the filesystem will be formatted for that selfsame endianness. They
will look at the variable and say 'ho hum, that's already the format
I use natively so I don't have to do any translation at all'.
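As a rough sketch of that run-time check (the flag values and function
names below are made up for illustration and are not taken from any
actual filesystem code), a 32-bit field read from the media might be
handled like this:

```c
#include <stdint.h>

/* Hypothetical on-disk flag values; the post does not name the field. */
#define FS_ENDIAN_LITTLE 0
#define FS_ENDIAN_BIG    1

/* Determine the host byte order by inspecting the first byte of a 16-bit 1. */
static int host_endian(void)
{
    const uint16_t probe = 1;

    return (*(const uint8_t *)&probe == 1) ? FS_ENDIAN_LITTLE : FS_ENDIAN_BIG;
}

/* Unconditional 32-bit byte swap. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Convert a 32-bit on-disk field to host representation.
 * When the filesystem's recorded byte order matches the host,
 * the value is returned untouched and no swap is performed.
 */
static uint32_t fs_to_host32(uint32_t raw, int fs_endian)
{
    return (fs_endian == host_endian()) ? raw : swap32(raw);
}
```

In the common case (same byte order on media and host) the only cost
is the comparison; the swap is paid only when an image has been moved
to a box with the opposite byte order.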
It is particularly important in any sort of clustering system or
protocol (or protocol used by a clustering system) to allow boxes with
different endianness to talk to each other, even if it means taking a
small performance hit in that communication. Anyone remember the
'talk' bug from 20+ years ago? To this day I do not think Sun ever
fixed the byte ordering for 'talk' that made their talk incompatible
with everyone else's. We are not going to repeat that mistake.
The alternative is to set a specific endianness for the filesystem in
stone, just like the 'network byte order' concept locked in a
particular byte ordering for packets. People today STILL hate the
fact that they have to translate certain protocols to network byte
order even when the machines on both ends use the same byte order,
but one that happens to not be 'network' byte order.
We're not making that same mistake, hence the filesystem will be specced
for endian neutrality even if the code isn't written to support it right
off the bat.
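For contrast, here is a minimal sketch of the fixed byte order
alternative, using the standard htonl()/ntohl() calls. The wire_encode
and wire_decode wrappers are hypothetical names used only for this
illustration. On a little-endian host both calls perform a swap, so two
little-endian peers each pay for a conversion that cancels out:

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Sender side: always convert to network (big-endian) byte order. */
static uint32_t wire_encode(uint32_t host_value)
{
    return htonl(host_value);
}

/* Receiver side: always convert back from network byte order. */
static uint32_t wire_decode(uint32_t wire_value)
{
    return ntohl(wire_value);
}
```

With a run-time endianness flag instead, those same two peers would
compare the flag against their native order and skip the swap entirely.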
<dillon at backplane.com>