More syscall messaging commits, and some testing code as well.
Matthew Dillon
dillon at apollo.backplane.com
Mon Aug 11 19:47:07 PDT 2003
I just committed another bunch of syscall messaging stuff, plus I
also committed some test code for it in /usr/src/test. This is ad-hoc
test code and committers are welcome to throw in their own testing
code in that directory willy-nilly :-)
In this commit I have managed to asynchronize nanosleep(), but there
are still a bunch of issues that have to be worked through. For
example, we need resource limits on the number of outstanding system
calls we allow to be in-progress and there needs to be a mechanism to
abort system calls which are in-progress when a program is killed.
Right now neither exists and ^Cing a test program at the wrong time will
very definitely crash the system, so asynch messaging syscalls are
currently restricted to root-use only.
The system call argument format can be observed in sys/sysproto.h. The
basic structure is, for example:
    struct read_args {
    #ifdef _KERNEL
            union sysmsg sysmsg;
    #endif
            union usrmsg usrmsg;
            int     fd;     char fd_[PAD_(int)];
            void *  buf;    char buf_[PAD_(void *)];
            size_t  nbyte;  char nbyte_[PAD_(size_t)];
    };
As you can see it consists of three pieces:
(1) The kernel message representing the system call
(2) The original message copied from userspace
(3) The system call arguments copied from userspace
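To make the layout concrete, here is a small userland sketch of the same structure, built with and without the kernel-only header. It is only an illustration under stated assumptions: reg_t is a hypothetical stand-in for register_t, the bodies and sizes of the two unions are invented, and PAD_() is assumed to round each argument up to one register-sized slot. Zero-length pad arrays rely on a GNU C extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's register_t (assumed pointer-sized). */
typedef intptr_t reg_t;

/* Assumed padding rule: pad each argument out to one register-sized slot. */
#define PAD_(t) (sizeof(reg_t) <= sizeof(t) ? 0 : sizeof(reg_t) - sizeof(t))

/* Opaque stand-ins; the real unions live in the kernel, these sizes are invented. */
union sysmsg { long align; char opaque[64]; };
union usrmsg { long align; char opaque[32]; };

/* The structure as userland would see it (no _KERNEL). */
struct read_args_user {
	union usrmsg usrmsg;
	int	fd;	char fd_[PAD_(int)];
	void *	buf;	char buf_[PAD_(void *)];
	size_t	nbyte;	char nbyte_[PAD_(size_t)];
};

/* The same structure as the kernel would see it (_KERNEL defined). */
struct read_args_kern {
	union sysmsg sysmsg;
	union usrmsg usrmsg;
	int	fd;	char fd_[PAD_(int)];
	void *	buf;	char buf_[PAD_(void *)];
	size_t	nbyte;	char nbyte_[PAD_(size_t)];
};

/* Check the two properties the layout is meant to provide. */
static void
check_read_args_layout(void)
{
	/* Every argument occupies exactly one register-sized slot. */
	assert(offsetof(struct read_args_user, buf) -
	    offsetof(struct read_args_user, fd) == sizeof(reg_t));

	/* The kernel view is the user view with sysmsg prepended. */
	assert(offsetof(struct read_args_kern, usrmsg) == sizeof(union sysmsg));
	assert(offsetof(struct read_args_kern, fd) -
	    offsetof(struct read_args_user, fd) == sizeof(union sysmsg));
}
```

Calling check_read_args_layout() from a test program exercises both properties. Since the user message and the arguments are laid out identically in both views, everything after sysmsg can presumably be filled by copying the user's data as a single block.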
I am currently investigating how best to split up system call
operation. A system call must do the following:
    * extract any additional data from userland. For example, nanosleep()
      has to copyin() the timespec structure whose pointer is provided in
      the call arguments.

      At the moment the nanosleep() code extracts the timespec structure
      from userland and stores it in 'sysmsg', so the execution phase
      operates entirely on the contents of sysmsg and not on usrmsg or
      the arguments.

    * execution phase, operating on data entirely within kernel space
      (except for I/O calls of course).

    * writeback phase. e.g. nanosleep() may have to copyout() a timespec
      structure back to userspace.
The question we face is whether it makes sense to separate the phases
in order to isolate the execution phase, which would allow system calls
to be made 'from' a kernel thread as a matter of course, rather than as
a special case.
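One way the three phases could be separated, using nanosleep() as the example, is sketched below. This is a userland mock-up and not the committed code: struct nanosleep_sysmsg, struct nanosleep_args, the three phase functions, and the memcpy-based copyin()/copyout() stand-ins are all hypothetical names invented for illustration, and the execution phase only validates the request and pretends the full sleep completed instead of blocking.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical kernel-side message: everything the execution phase needs. */
struct nanosleep_sysmsg {
	struct timespec rqt;	/* request, copied in from userland */
	struct timespec rmt;	/* remaining time, produced by execution */
};

/* Hypothetical argument block as supplied by userland. */
struct nanosleep_args {
	const struct timespec *rqtp;
	struct timespec *rmtp;
};

/* Userland stand-ins for the kernel's copyin()/copyout(). */
static int
copyin(const void *uaddr, void *kaddr, size_t len)
{
	if (uaddr == NULL)
		return (EFAULT);
	memcpy(kaddr, uaddr, len);
	return (0);
}

static int
copyout(const void *kaddr, void *uaddr, size_t len)
{
	if (uaddr == NULL)
		return (EFAULT);
	memcpy(uaddr, kaddr, len);
	return (0);
}

/* Phase 1: pull the additional data out of userland into the message. */
static int
nanosleep_copyin_phase(const struct nanosleep_args *uap,
    struct nanosleep_sysmsg *msg)
{
	return (copyin(uap->rqtp, &msg->rqt, sizeof(msg->rqt)));
}

/* Phase 2: touches only the kernel message, so a kernel thread can call it. */
static int
nanosleep_exec_phase(struct nanosleep_sysmsg *msg)
{
	if (msg->rqt.tv_sec < 0 || msg->rqt.tv_nsec < 0 ||
	    msg->rqt.tv_nsec >= 1000000000L)
		return (EINVAL);
	/* A real kernel would block here; the sketch reports nothing left. */
	msg->rmt.tv_sec = 0;
	msg->rmt.tv_nsec = 0;
	return (0);
}

/* Phase 3: write results back to userland if the caller asked for them. */
static int
nanosleep_copyout_phase(const struct nanosleep_args *uap,
    const struct nanosleep_sysmsg *msg)
{
	if (uap->rmtp == NULL)
		return (0);
	return (copyout(&msg->rmt, uap->rmtp, sizeof(msg->rmt)));
}

/* The userland-facing entry point is then just the three phases in order. */
static int
nanosleep_syscall(const struct nanosleep_args *uap)
{
	struct nanosleep_sysmsg msg;
	int error;

	memset(&msg, 0, sizeof(msg));
	if ((error = nanosleep_copyin_phase(uap, &msg)) != 0)
		return (error);
	if ((error = nanosleep_exec_phase(&msg)) != 0)
		return (error);
	return (nanosleep_copyout_phase(uap, &msg));
}
```

With this split, a caller that already holds its data in kernel space would fill in a nanosleep_sysmsg itself and call nanosleep_exec_phase() directly, skipping both copy phases, which is the 'matter of course' case described above.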
--
Finally, note that the code is VERY messy. There will be some severe
cleanups as time goes on. I believe I have partitioned the functionality
such that we can really get some nice clean code out of the API.
-Matt
Matthew Dillon
<dillon at xxxxxxxxxxxxx>