You could do worse than Mach ports
Matthew Dillon
dillon at apollo.backplane.com
Fri Jul 18 10:46:44 PDT 2003
:> In the quota case,
:>you have to take a pretty heavy-weight view of what constitutes
:>metadata as well, since you have to deal with files being extended
:>via write(), which means you need to peek at the byte count for the
:>payload, and at the return value in case of success/failure.
:
:OK, either you or I are very confused here.
:
:Matt: could you comment on this?
:
:But these are in the message, not in the data blocks. You have
:access to the message... it was copied into the kernel with the
:Send()... you just need to take a protection barrier transition to
:access any data pointed to by the message.
The issue comes down to how to deal with foreign address spaces.
After all, if the address space is local one just passes a pointer.
For system calls the foreign address space is the user process's
address space. User data pointers come in four forms:
(1) They represent a file path
(2) They represent a large block of data (e.g. the buffer in a read())
(3) They represent a small block of data (e.g. gettimeofday())
(4) They represent the message itself
In case 1 I will be rewriting the VFS cache to completely evaluate any
user path and create the appropriate nodes in the VFS cache tree. This
will occur before entry into the VFS layer and, of course, any nodes
that are not known in the cache will be given an 'UNKNOWN' designation
and will have to be resolved by VFS (e.g. VFS_LOOKUP()). VFS_LOOKUP()
would no longer access userspace path elements directly. A secondary
advantage of integrating VFS_LOOKUP() with the VFS cache is that we
would no longer have to leave directory vnodes locked during the
traversal; the path would be locked through the VFS cache instead.
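To make the idea concrete, here is a rough sketch of what a
pre-resolved namecache node might look like. The structure and field
names are purely illustrative, not actual DragonFly structures:

    struct vnode;                       /* opaque kernel type */

    #define NC_UNKNOWN  0   /* must still be resolved via VFS_LOOKUP() */
    #define NC_RESOLVED 1   /* vnode association is known */

    /*
     * Hypothetical namecache node created by the up-front path
     * evaluation.  Unresolved components are marked NC_UNKNOWN and
     * handed to VFS_LOOKUP() later; the lock lives on the path
     * element rather than on a directory vnode.
     */
    struct ncache_node {
        struct ncache_node *nc_parent;  /* directory component above us */
        char               *nc_name;    /* this path component */
        int                 nc_state;   /* NC_UNKNOWN or NC_RESOLVED */
        struct vnode       *nc_vp;      /* valid only when NC_RESOLVED */
    };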
In case 2 I intend to convert the user data pointer into an IOVEC array
of VM objects and offset ranges within those objects. For example,
(vnode->object, start_offset_of_read, extent_of_read). The kernel
would pass the IOVEC array around just like it passes UIOs around now
(in fact, we are really talking about UIOs with their guts rearranged
in this new scheme).
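As a sketch, one element of such an array might look something like
this (names are hypothetical; it is just a UIO entry recast in terms
of VM objects):

    struct vm_object;                 /* opaque here; e.g. vnode->object */

    /*
     * One element of the object-based IOVEC: a VM object plus the
     * byte range within it that the I/O covers.
     */
    struct obj_iovec {
        struct vm_object *io_object;  /* e.g. vnode->object */
        off_t             io_offset;  /* start_offset_of_read */
        size_t            io_extent;  /* extent_of_read */
        void             *io_kva;     /* cached kernel mapping, or NULL */
    };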
Any kernel entity that actually needs to read the data directly will
have to map the object into its data space. We might or might not
cache such mappings in the UIO, so that in a multi-layered VFS stack
where several layers need a direct mapping, only one mapping is
actually made.
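A sketch of how such a cached mapping might work, using the
hypothetical obj_iovec above (vm_map_range() is a placeholder for
whatever primitive actually wires the object range into kernel
address space):

    /*
     * Return a kernel-visible mapping of the element's byte range,
     * creating it on first use and caching it so that stacked VFS
     * layers share a single mapping.
     */
    static void *
    obj_iovec_kmap(struct obj_iovec *io)
    {
        if (io->io_kva == NULL)
            io->io_kva = vm_map_range(io->io_object,
                                      io->io_offset, io->io_extent);
        return (io->io_kva);
    }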
In case 3 I intend for the syscall to copyin/copyout the data (just like
it does now).
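That path is unchanged from today. A minimal sketch of the pattern,
using gettimeofday() as the example from the list above:

    #include <sys/types.h>
    #include <sys/time.h>       /* struct timeval, microtime() */
    #include <sys/systm.h>      /* copyout() */

    /*
     * Case 3 sketch: a small, fixed-size result is simply copied
     * across the protection boundary with copyout().
     */
    int
    gettimeofday_sketch(struct timeval *utv)
    {
        struct timeval ktv;

        microtime(&ktv);        /* fill in the kernel-side copy */
        return (copyout(&ktv, utv, sizeof(ktv)));
    }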
In case 4 the kernel currently copies the syscall arguments into kernel
space already (the uap family of structures), and we would do the same
with the message. Just think of the message as being the syscall
arguments themselves. We need to do this anyway because the kernel
version of the syscall message is going to be more involved than the
user version.
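A sketch of what the kernel-side message might carry beyond the plain
uap arguments (the structure and field names are made up for
illustration):

    /*
     * Hypothetical kernel syscall message: the user-visible uap
     * arguments embedded in a larger structure carrying kernel-only
     * state such as completion status and reply-port linkage.
     */
    struct syscall_msg {
        int   sm_error;               /* completion status */
        void *sm_reply_port;          /* where the reply gets queued */
        union {
            struct read_args  read;   /* existing uap structures, */
            struct write_args write;  /* copied in just as today */
        } sm_uap;
    };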
-Matt
Matthew Dillon
<dillon at xxxxxxxxxxxxx>