patch #4 (Re: patch #3 (Re: The time has come for a kernel interfacing library layer))

Thu May 12 06:03:31 PDT 2005

On Wed, May 11, 2005 at 05:54:47PM -0700, Matthew Dillon wrote:
>     This will not work.  The problem is that we need to be able to write
>     the __DF_*() system call layer functions in C (the syscall layer is all
>     about translation.  While we can generate assembly for untranslated
>     system calls, we really have to be able to write the ones that ARE
>     translated in C).  We cannot do that safely and still return a carry bit
>     status to the caller.  The carry bit is the big problem.

I have no problem with avoiding the carry bit; avoiding it makes much
more sense.

>     We could have the syscall layer return the result in memory and 
>     then return() the errno, but that adds four instructions (at least) 
>     to copy eax and edx into memory and then copy it back out again.

All of our system calls return int (exit doesn't really count), and I
don't see why a system call should ever return a different type. In that
case the cost is one instruction to save eax and another to load errno
into eax. The question is whether this can be optimized away in the
kernel / kernel interface mapping.
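
To make the convention concrete, here is a rough C sketch of "return
the errno, store the result in memory". The names df_demo_read() and
demo_read() and the canned result are made up for illustration; this is
not the actual __DF_*() layer, only the shape of the idea.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical syscall-layer function: returns 0 on success or an errno
 * value on failure, and stores the int result through *result.
 * No carry bit is involved.
 */
static int
df_demo_read(int fd, int *result)
{
	assert(result != NULL);
	if (fd < 0)
		return (EBADF);
	*result = 42;		/* pretend 42 bytes were read */
	return (0);
}

/*
 * Hypothetical libc glue: turns the (error, result) pair back into the
 * traditional "-1 and errno" interface.
 */
static int
demo_read(int fd)
{
	int result;
	int error;

	error = df_demo_read(fd, &result);
	if (error != 0) {
		errno = error;
		return (-1);
	}
	return (result);
}
```

Because the layer function is plain C, a translated system call can be
written the same way as an untranslated one.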

>     We could have the syscall layer pop two stack frames instead of one
>     for a 'normal' return, then have the libc glue just fall through to
>     setting errno.  But I'm pretty sure that destroys the cpu's call
>     stack cache.

If we want to avoid copying arguments in the common case, it would be
best to just pass in the location of the first argument. That wouldn't
add much overhead (if any), since the kernel can use that address
directly instead of first calculating it. It would also help avoid
storage problems, since the translation layer could put the arguments
into a separate page outside the user stack. This is important if we
want to use the same mechanism for, e.g., Linux emulation.
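
A rough C sketch of what I mean, with everything in it hypothetical: the
function names, the argument layout, and the "foreign" argument order
are invented for illustration, and a static array stands in for a
separate page outside the user stack.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical kernel entry point: receives the address of the first
 * argument and reads the rest relative to it. Layout assumed here:
 * argp[0] = fd, argp[1] = buf, argp[2] = nbytes.
 */
static int
demo_kern_write(const intptr_t *argp, int *result)
{
	assert(argp != NULL && result != NULL);
	if (argp[0] < 0)
		return (EBADF);
	*result = (int)argp[2];	/* pretend everything was written */
	return (0);
}

/* Stand-in for an argument page that lives outside the user stack. */
static intptr_t demo_arg_page[3];

/*
 * Hypothetical translation layer for an emulated ABI that passes
 * (buf, nbytes, fd). It rebuilds the arguments in the native order in
 * separate storage and hands the kernel that address instead.
 */
static int
demo_emul_write(const intptr_t *foreign, int *result)
{
	demo_arg_page[0] = foreign[2];	/* fd */
	demo_arg_page[1] = foreign[0];	/* buf */
	demo_arg_page[2] = foreign[1];	/* nbytes */
	return (demo_kern_write(demo_arg_page, result));
}
```

In the native case the caller passes the address of its own arguments
and nothing is copied; only the emulated path pays for the rewrite.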

Joerg