More syscall messaging commits, and some testing code as well.

Matthew Dillon dillon at
Tue Aug 12 12:24:18 PDT 2003

:>     If we assume the kernel system call messaging overhead is around
:>     256 bytes per message, then supporting 100000 concurrent system
:>     calls would only eat around 25MB of ram.   In other words, no drastic
:>     'solution' to the problem is needed, just something that prevents the
:>     system and the programs running on the system from destabilizing if
:>     they happen to hit the limit.
:So the odds of that happening are very slim, so we optimize for the happy
:path.  On those exception cases we take the hit with a poll from the
:userland scheduler.  Once the userland scheduler is notified that the
:limit has been reached, could we instead have an upcall to the userland
:scheduler when the queue length drops below some water mark?  Until
:that upcall happens, the userland scheduler won't make any system calls.

    Hysteresis would definitely be desirable.  The userland scheduler cannot
    just stop, though, because that could deadlock the program.   It would
    have to revert to polling until the concurrent syscall count drops below
    the hysteresis point.
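
    As a rough sketch of that hysteresis (written in Python for brevity;
    this is not actual kernel or userland scheduler code, and the
    SyscallGate name and its limit / low_water parameters are made up for
    illustration), the scheduler could track the in-flight message count,
    flip into polling mode when it hits the limit, and leave polling mode
    only once the count drops below the low water mark:

```python
class SyscallGate:
    """Tracks in-flight syscall messages with a hysteresis band.

    Hypothetical helper for illustration only; not a real API.
    """

    def __init__(self, limit, low_water):
        if not 0 < low_water <= limit:
            raise ValueError("need 0 < low_water <= limit")
        self.limit = limit          # hard cap on concurrent messages
        self.low_water = low_water  # leave polling mode below this count
        self.in_flight = 0
        self.polling = False        # True while in the fallback mode

    def try_issue(self):
        """Return True if a new async message may be queued right now."""
        if self.polling:
            return False            # caller must poll instead of queueing
        if self.in_flight >= self.limit:
            self.polling = True     # limit hit: revert to polling
            return False
        self.in_flight += 1
        return True

    def complete(self):
        """Record one completed message and re-check the hysteresis point."""
        if self.in_flight == 0:
            raise RuntimeError("no message in flight")
        self.in_flight -= 1
        if self.polling and self.in_flight < self.low_water:
            self.polling = False    # back to normal async messaging
```

    The point of the separate low water mark is that the scheduler does
    not bounce in and out of polling mode every time a single message
    completes right at the limit.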

    For example, consider a simplified case with a web server.  It could
    theoretically use up all of its syscall messages queueing read() 
    requests from incoming connections and then stop issuing syscalls,
    and those connections might themselves be waiting for pipelined
    output from a previous command which the web server never sends because
    it has hit its syscall messaging limit.  Deadlock.  So the fallback
    has to be to go to a polling scheme (at the cost of spinning cpu cycles)
    rather than 'waiting' for some of the syscall messages to complete and

    The best test for this sort of limiting feature is to set the resource
    very low, like '2' concurrent syscalls per (heavyweight) process.
    If the threaded application still works (albeit inefficiently), then
    the fallback has been implemented properly.  You definitely do not
    want to have a fallback/blockage algorithm that fails to work with
    low resource settings.
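
    To make that test concrete, here is a toy model of the web server
    case, again in Python and again only a sketch.  The run_server
    function and its parameters are hypothetical, and the model assumes
    that a polled (non-blocking) write needs no message slot and that an
    async write completes immediately.  With the limit set to 2, the
    version that just waits for a free slot wedges, while the version
    that falls back to polling still finishes:

```python
def run_server(n_conns, limit, poll_fallback):
    """Toy model of the web server case; True if all responses get sent.

    Every connection is owed one response.  The server queues an async
    read() on each connection (one message slot apiece), and a read only
    completes after that connection's response has been written.
    """
    unanswered = set(range(n_conns))   # connections still owed output
    unread = list(range(n_conns))      # connections with no read() queued
    reading = set()                    # connections holding a message slot

    while unanswered:
        progressed = False
        # Queue read() messages greedily until the slots run out.
        while unread and len(reading) < limit:
            reading.add(unread.pop(0))
            progressed = True
        # A write needs a free slot, unless we fall back to a polled call.
        if len(reading) < limit or poll_fallback:
            conn = min(unanswered)
            unanswered.discard(conn)
            reading.discard(conn)      # client replies, its read() completes
            progressed = True
        if not progressed:
            return False               # nothing can move: deadlock
    return True


if __name__ == "__main__":
    print("limit=2, polling fallback:", run_server(3, 2, True))
    print("limit=2, wait for a slot: ", run_server(3, 2, False))
    print("limit=8, wait for a slot: ", run_server(3, 8, False))
```

    With a generous limit both strategies behave the same, which is why
    the bug only shows up when the resource is set very low.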

