Initial filesystem design synopsis.
Jason Smethers
jason at smethers.net
Mon Feb 26 21:44:11 PST 2007
Rupert Pigott wrote:
> On Mon, 26 Feb 2007 14:37:15 -0800, Matthew Dillon wrote:
>> The alternative is to set a specific endianess for the filesystem in
>> stone, just like the 'network byte order' concept locked in a
>> particular byte ordering for packets. People today STILL hate the
>> fact that they have to translate certain protocols to network byte
>> order even when the machines on both ends use the same byte order,
>> but one that happens to not be 'network' byte order.
>
> That doesn't seem like a good enough reason to add more conditional
> branches to the critical paths, make the wire protocol more complicated
> and make it more difficult for the Mk.I eyeball to parse the dumps. It
> also pushes an additional conditional into at runtime. A fixed-endian
> filesystem makes that conditional "free" and therefore reduces the
> performance hit - and makes the life of the compiler trivial.
It's one branch at load and one branch at store for meta-data. The
endianness is not likely to change once the filesystem is mounted;
therefore, in the critical path the processor's branch prediction should
almost always predict the branch correctly, whether taken or not taken.
Besides, the meta-data is most likely to already be in memory, and thus
you're down to just the branch on store.
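
A rough sketch of the branch-per-access idea, for illustration only. The
names (fs_mount, swap_needed, meta_load32, meta_store32) are made up and
do not come from any existing filesystem code; the only assumption is a
per-mount flag that is decided once at mount time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-mount state. */
struct fs_mount {
    bool swap_needed;   /* decided once at mount time, constant afterwards */
};

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Load a 32-bit meta-data field: a single branch on a value that never
 * changes while mounted, so it should be easy to predict. */
static uint32_t meta_load32(const struct fs_mount *mp, uint32_t on_disk)
{
    assert(mp != NULL);
    return mp->swap_needed ? swap32(on_disk) : on_disk;
}

/* Store a 32-bit meta-data field: the same single branch. */
static uint32_t meta_store32(const struct fs_mount *mp, uint32_t in_core)
{
    assert(mp != NULL);
    return mp->swap_needed ? swap32(in_core) : in_core;
}
```

With swap_needed false both helpers return their argument unchanged, so
a native-endian mount pays only for the predicted branch.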
If it really is an issue, you can always use per-filesystem function
pointers to either a nop function or the correct endian translation
function. With today's processors, which do not yet include good indirect
branch support (i.e. pointer-based function calls), this would make
the best case always as slow as a branch prediction which "always
misses". This will of course likely improve with the next generation of
processors, such as AMD's K8L, which includes indirect branch prediction.
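
The function-pointer variant could look roughly like the following. This
is only a sketch; fs_ops, conv32 and fs_ops_init are hypothetical names,
and a real implementation would need one pointer per field width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*conv32_fn)(uint32_t);

/* Used when the media already matches the host byte order. */
static uint32_t conv32_nop(uint32_t v)
{
    return v;
}

/* Used when the media has the opposite byte order. */
static uint32_t conv32_swap(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical per-filesystem operations table. */
struct fs_ops {
    conv32_fn conv32;
};

/* Pick the conversion routine once, at mount time. */
static void fs_ops_init(struct fs_ops *ops, int media_matches_host)
{
    assert(ops != NULL);
    ops->conv32 = media_matches_host ? conv32_nop : conv32_swap;
}
```

Every meta-data access then becomes ops->conv32(value): no conditional
branch, but an indirect call even when nothing needs converting.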
Changes between system endians will likely be very rare anyways. Best
case is there would be an "endian field" in the meta-data which is
always set to little-endian. Implementations which do not support endian
changes would simply fail to mount a filesystem whose endian field does
not match their own.
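
A minimal sketch of such a mount-time check, under stated assumptions:
the endian field is a single byte (so it reads the same on any host),
and the struct, constants and return codes are all invented here for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for the on-media endian field. */
enum { FS_ENDIAN_LITTLE = 1, FS_ENDIAN_BIG = 2 };

/* Hypothetical result codes for the mount check. */
enum { FS_OK = 0, FS_EBADFIELD = 1, FS_EFOREIGN = 2 };

/* Hypothetical superblock; only the endian field matters here. */
struct fs_super {
    uint8_t endian;
};

/* Detect the host byte order by looking at the first byte of a 16-bit 1. */
static int host_endian(void)
{
    const uint16_t probe = 1;
    return (*(const uint8_t *)&probe == 1) ? FS_ENDIAN_LITTLE : FS_ENDIAN_BIG;
}

/* An implementation without conversion support refuses foreign media. */
static int fs_mount_check(const struct fs_super *sb)
{
    assert(sb != NULL);
    if (sb->endian != FS_ENDIAN_LITTLE && sb->endian != FS_ENDIAN_BIG)
        return FS_EBADFIELD;
    if (sb->endian != host_endian())
        return FS_EFOREIGN;
    return FS_OK;
}
```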
A separate tool may be made which does endian conversion on media before
mounting on a different endian system. Then, the compiler would always
optimize the kernel filesystem code to only support the endian of the
computer it is compiled for.
We could also do a per-segment conversion of meta-data at run time as
the segment is accessed, but the need for conversion will likely be rare
anyways. Such conversions may require additional moving of data to other
"already converted segments", since we would likely not want to overwrite
existing meta-data, in order to ensure transactional safety. A background
thread could also do the conversion as necessary, starting with "unused"
segments, and do "performance optimization" at the same time.
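
To make the copy-instead-of-overwrite point concrete, here is a small
sketch. The segment layout (a flag plus a few 32-bit words) and the
names are assumptions made for the example; a real segment would carry
far more state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEG_WORDS 4

/* Hypothetical meta-data segment. */
struct segment {
    uint8_t  foreign;             /* nonzero: words are in the other byte order */
    uint32_t words[SEG_WORDS];
};

static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Convert src into a different segment dst. src is never modified, so
 * the old copy stays valid until the new one has been committed. */
static void segment_convert(const struct segment *src, struct segment *dst)
{
    size_t i;

    assert(src != NULL && dst != NULL && src != dst);
    for (i = 0; i < SEG_WORDS; i++)
        dst->words[i] = src->foreign ? swap32(src->words[i]) : src->words[i];
    dst->foreign = 0;
}
```

The same routine could be driven either on first access or by a
background thread walking the segments.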
As far as I know, there is no good reason today for a microprocessor
designer to choose one endian over another from a design standpoint.
Today, it more or less comes down to preference. Therefore, we are highly
unlikely to see another processor architecture which is not big-endian
or little-endian.
As for network protocols, I prefer negotiating the endian such that the
endian of the "server" or "peer under higher load" gets preference.
Then, the "client" or "peer under lower load" takes on the burden of
conversion. Then the developer only has to worry about endian changes
when interfacing with the network API. For most developers, this means
that communication will almost always take place in little-endian between
x86 machines.
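
The negotiation rule can be sketched as below. This is not any real
protocol: the peer struct, the numeric load figure and the tie-break in
favour of the first peer are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

enum wire_endian { WIRE_LITTLE = 1, WIRE_BIG = 2 };

/* Hypothetical description each peer advertises during the handshake. */
struct peer {
    enum wire_endian native;   /* the peer's own byte order */
    unsigned load;             /* higher means busier */
};

/* The busier peer's byte order wins; ties go to the first peer. */
static enum wire_endian negotiate(const struct peer *a, const struct peer *b)
{
    assert(a != NULL && b != NULL);
    return (a->load >= b->load) ? a->native : b->native;
}

/* A peer converts only if the agreed order differs from its own. */
static int must_convert(const struct peer *self, enum wire_endian wire)
{
    assert(self != NULL);
    return self->native != wire;
}
```

Two little-endian peers agree on WIRE_LITTLE and neither converts, which
is the common x86-to-x86 case.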
As for human-readable binary dumps, that's why we should build tools to
format the data. =)
- Jason