You could do worse than Mach ports
Matthew Dillon
dillon at apollo.backplane.com
Thu Jul 17 14:08:47 PDT 2003
:I really disagree with this (big surprise 8-)).
:
:The problem with this model is that you can only represent a small
:subset of the filesystem types, and almost none of the interesting
:ones.
:
:For anything that does anything interesting, you are going to have
:to map file data as well as metadata pages into the address space
:of whoever handles operations on the vp; specifically, here are the
:classes of manipulations, with a few examples of each:
Yes, but there is a big difference in KVM needs when mapping meta-data
versus mapping file data.  Regardless of how meta-data is managed, the
critical path always has been and always will be regular file reads and
writes, and directory lookups, and file read/write generally eats
directory lookup for lunch. Remember, before I got vmiodir working
the buffer cache dedicated very little space to caching directory meta
data and the performance loss really wasn't all that terrible for most
normal configurations.
Even something like cryptfs can conceivably, if a crypto chip is
present, use DMA to directly access the data without it having to be
mapped. And if it isn't, so what? You wind up mapping the data and
take the performance hit. Just because you might have to do it for
some subsystems doesn't mean you should go and burden *ALL* the
subsystems with that kind of overhead. Also, data mapping and range
locking are two entirely separate beasts. You do not always need to
range lock something you are mapping, and you do not always need to
map something you are range-locking.  So integrating those functions into
the core messaging system just adds more conditionals to the critical path,
making the core messaging system less efficient without any performance
improvements to show for it.
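
A minimal sketch of that separation, in C.  Every name in it
(range_lock, range_map, struct range_stats, and so on) is hypothetical
and made up for illustration; none of it is an existing kernel
interface.  The only point is that locking and mapping are independent
calls, so each operation pays only for what it uses and the message
send path carries neither:

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical stand-ins, not a real kernel interface. */
struct file_range {
    long offset;
    long length;
};

struct range_stats {
    int locks_taken;    /* how many range locks were acquired */
    int maps_made;      /* how many ranges were mapped */
};

static void range_lock(struct range_stats *st, const struct file_range *r)
{
    assert(r != NULL && r->length > 0);
    st->locks_taken++;  /* a real version would record and block on the range */
}

static void range_unlock(struct range_stats *st, const struct file_range *r)
{
    (void)st;
    (void)r;
}

static void *range_map(struct range_stats *st, const struct file_range *r)
{
    static char window[64];     /* stand-in for a mapped window */

    assert(r != NULL && r->length > 0);
    st->maps_made++;
    return window;
}

static void range_unmap(struct range_stats *st, void *addr)
{
    (void)st;
    (void)addr;
}

/* Needs exclusion over the range but never looks at the data. */
static void op_lock_only(struct range_stats *st, const struct file_range *r)
{
    range_lock(st, r);
    range_unlock(st, r);
}

/* Needs to look at the data but needs no exclusion. */
static char op_map_only(struct range_stats *st, const struct file_range *r)
{
    char *p = range_map(st, r);
    char first = p[0];

    range_unmap(st, p);
    return first;
}

/* Needs both, and composes them itself. */
static void op_lock_and_map(struct range_stats *st, const struct file_range *r)
{
    char *p;

    range_lock(st, r);
    p = range_map(st, r);
    p[0] = 'x';
    range_unmap(st, p);
    range_unlock(st, r);
}
```

Only the third operation pays for both steps; the first two never
touch the facility they do not need.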
It is important to keep interface APIs as simple as possible. For example,
the core messaging API does not require message sizes to be specified or
memory objects to be registered. That isn't its job. If I want to
transition a core message across a protection layer I do it via the
port agent when the message is sent. That way the *SAME* device driver
could operate in kernelland or in userland and be optimal in both. The
kernelland version's port agent would not have to translate messages
across a protection boundary; the userland version's port agent would.
In fact, if several such devices were all operating in the same userland
process, their port agents would not have to do any translation within
userland, either.
But the device driver itself would work the same either way, and
the best case winds up being the most optimal case. That is what
is important. If I burden the messaging API with message sizes,
registered memory objects, data space conversions, data mappings, and
all sorts of sundry other Mach-like stuff I impose a severe burden
on the 'best' case message passing situation which unnecessarily slows
down the system and makes the messaging API far less useful because it
would no longer be a light weight mechanism and thus no longer be
suitable for light weight operations.
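
The port-agent idea can be sketched roughly as follows, in C.  This is
an illustration under assumptions, not the actual messaging code:
struct msg, struct port, send_msg, agent_direct and agent_translate are
all hypothetical names, and the memcpy merely stands in for whatever a
real crossing of a protection boundary would involve.  Note that the
send call takes no size and no registered memory object; any knowledge
of the message layout lives in the agent that needs it:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical message and port types, invented for this sketch. */
struct msg {
    int cmd;
    int result;
};

struct port;
typedef int (*port_agent_fn)(struct port *port, struct msg *m);

struct port {
    port_agent_fn agent;            /* chosen when the port is set up */
    int (*handler)(struct msg *m);  /* the driver's entry point */
    int translations;               /* counts simulated boundary crossings */
};

/* Driver code: identical no matter where it runs. */
static int driver_handle(struct msg *m)
{
    m->result = m->cmd * 2;
    return 0;
}

/* Same-domain agent: pass the message pointer straight through. */
static int agent_direct(struct port *port, struct msg *m)
{
    return port->handler(m);
}

/* Cross-domain agent: copy in, run the driver, copy the result out. */
static int agent_translate(struct port *port, struct msg *m)
{
    struct msg copy;
    int error;

    memcpy(&copy, m, sizeof(copy));  /* stand-in for crossing the boundary */
    port->translations++;
    error = port->handler(&copy);
    m->result = copy.result;
    return error;
}

/* The core send path: no sizes, no memory objects, one indirect call. */
static int send_msg(struct port *port, struct msg *m)
{
    assert(port != NULL && port->agent != NULL);
    return port->agent(port, m);
}
```

A port set up with agent_direct models the kernelland case and a port
set up with agent_translate models the userland case; driver_handle is
the same function behind both, and only the second port ever pays for
a translation.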
-Matt
Matthew Dillon
<dillon at xxxxxxxxxxxxx>