patch #4 (Re: patch #3 (Re: The time has come for a kernel interfacing library layer))

Joerg Sonnenberger joerg at
Thu May 12 09:12:54 PDT 2005

On Thu, May 12, 2005 at 08:54:43AM -0700, Matthew Dillon wrote:
> :>     We could have the syscall layer return the result in memory and 
> :>     then return() the errno, but that adds four instructions (at least) 
> :>     to copy eax and edx into memory and then copy it back out again.
> :
> :All of our system calls return int (exit doesn't really count). I don't
> :see why a system call should ever return a different type. That would
> :be one instruction to save eax and another instruction to update eax to
> :errno. Question is, can this be optimized away in the kernel / kernel interface
> :mapping?
>     They actually return int64_t (%eax and %edx).  errno is placed in %eax
>     and the carry bit is set if an error occurs.

OK, so to evaluate the errno / return value pair, we just have to treat
them as int64_t and use mask operations? It would add one instruction
(test %eax,%eax or so) to recreate the carry bit. Looking at the code,
isn't it %eax for the return value and %edx for errno?
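
To make the mask question concrete, here is a rough sketch in C. It assumes
the layout I guessed at above (low 32 bits mirror %eax and hold the return
value, high 32 bits mirror %edx and hold errno), which may well not be what
the kernel does. struct sys_result, pack_result() and unpack_result() are
names made up for illustration, not existing interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: low half mirrors %eax, high half mirrors %edx. */
struct sys_result {
    int32_t value;  /* low 32 bits, assumed to be the return value */
    int32_t error;  /* high 32 bits, assumed to be errno (0 if none) */
};

/* Combine the two halves into one 64-bit value, as a register pair would. */
int64_t pack_result(int32_t value, int32_t error)
{
    uint64_t lo = (uint32_t)value;
    uint64_t hi = (uint32_t)error;

    return (int64_t)((hi << 32) | lo);
}

/* Split the 64-bit value again with one mask and one shift. */
struct sys_result unpack_result(int64_t raw)
{
    struct sys_result r;

    r.value = (int32_t)((uint64_t)raw & 0xffffffffu);
    r.error = (int32_t)((uint64_t)raw >> 32);
    return r;
}

/* Small self-check: a failing call with value -1 and errno 9. */
void check_roundtrip(void)
{
    struct sys_result r = unpack_result(pack_result(-1, 9));

    assert(r.value == -1 && r.error == 9);
}
```

With this layout the caller only has to test r.error against zero, which is
the "one extra instruction" mentioned above.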

> :If we want to avoid copying arguments for the common cases, it would be
> :best to just pass the location of the first argument in. That wouldn't
> :add much overhead (if at all), since the kernel can use that address directly
> :instead of first calculating it. It would also help to avoid storage
>     If the address is passed in a register it would be roughly the same
>     overhead (the kernel just uses the user stack pointer plus some offset
>     to figure out the address of the arguments now).  Passing arguments
>     is not the problem, dealing with errno is the problem.

Passing arguments can be a problem if we want to introduce an intermediate
layer and do conversion on demand. sysctl and similar interfaces come up.
If we can avoid the (second) argument copy in userland, it should help.
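
A minimal sketch of what I have in mind, in C. Everything here is
hypothetical (struct call_desc, dispatch(), MAX_ARGS and the two example
callbacks are invented for illustration): the layer receives only the
address of the first argument, hands it through untouched for native calls,
and copies into a scratch area only when a conversion is actually needed.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ARGS 8  /* assumed upper bound on arguments per call */

typedef int64_t (*handler_fn)(const intptr_t *args);
typedef void (*convert_fn)(const intptr_t *in, intptr_t *out, size_t nargs);

/* Hypothetical per-call descriptor, e.g. generated from a syscall table. */
struct call_desc {
    handler_fn handler;
    convert_fn convert;  /* NULL means the caller's layout is already native */
    size_t nargs;
};

/* args points at the caller's first argument; nothing is copied up front. */
int64_t dispatch(const struct call_desc *d, const intptr_t *args)
{
    intptr_t tmp[MAX_ARGS];

    if (d->convert == NULL)
        return d->handler(args);   /* common case: use the address directly */
    if (d->nargs > MAX_ARGS)
        return -1;                 /* placeholder error for this sketch */
    d->convert(args, tmp, d->nargs);  /* the only copy, done on demand */
    return d->handler(tmp);
}

/* Example handler: returns first argument minus second. */
int64_t sub_handler(const intptr_t *args)
{
    return (int64_t)(args[0] - args[1]);
}

/* Example conversion: a foreign ABI that passes the two arguments swapped. */
void swap_convert(const intptr_t *in, intptr_t *out, size_t nargs)
{
    (void)nargs;
    out[0] = in[1];
    out[1] = in[0];
}
```

The scratch array could just as well live on a separate page outside the
user stack; the point is only that the native path never touches it.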

> :problems, since the translation layer could put the arguments into a
> :separate page outside the user stack. This is important if we want to
> :use the same mechanism e.g. for Linux emulation.
> :
> :Joerg
>      This seems rather over-complicated when we have a solution already
>      staring us in the face (just using the TCB to store errno).

But storing errno in the TCB doesn't solve the problem of dealing with RTLD :-)
I'd still like to have one general mechanism which can be extended to deal
with foreign binaries too. Adding too many shortcuts makes that harder to
maintain too, even if the actual code is automatically generated.

