How much of microkernel?

Tue Aug 22 14:54:13 PDT 2006

On Tue, Aug 22, 2006 at 07:13:51PM +0000, Thomas E. Spanjaard wrote:
> Matthew Dillon wrote:
> >:I think L4 and Mungi have proven that doesn't have to be the case these 
> >:days.
> >    Well, I am not an expert on L4 or Mungi, but I can count cpu cycles, 
> >    and having to do a context switch eats a *lot* of computer cycles,
> >    and having to do a context switch which involves a change in the
> >    protection map eats even *more* computer cycles.  So many, in fact,
> >    that the overhead often exceeds the overhead of the operation one is
> >    trying to execute.
> 
> A relevant benchmark here is the lat_ctx benchmark of lmbench, of which 
> a comparison between Linux and L4 is given on [1]. Pretty impressive 
> results I'd say, even when the rate of context switches is higher.
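
For anyone who hasn't looked at it: lat_ctx passes a token around a set of
processes connected by pipes and times the hops. Below is a rough two-process
sketch of that idea, not lmbench's code. The function name pipe_pingpong is
made up for illustration, it doesn't subtract the pipe overhead the way the
real benchmark does, and on an MP box the two processes may sit on different
CPUs, so treat the number as an upper bound on switch cost at best.

```python
import os
import time

def pipe_pingpong(rounds=10000):
    """Average seconds per one-way hop between two processes over pipes.

    Hypothetical helper for illustration only; the result includes
    pipe read/write syscall overhead, not just the context switch.
    """
    p2c_r, p2c_w = os.pipe()   # parent -> child
    c2p_r, c2p_w = os.pipe()   # child -> parent
    pid = os.fork()
    if pid == 0:
        # Child: echo each one-byte token straight back to the parent.
        os.close(p2c_w)
        os.close(c2p_r)
        for _ in range(rounds):
            os.write(c2p_w, os.read(p2c_r, 1))
        os._exit(0)
    # Parent: send a token, block until it comes back, repeat.
    os.close(p2c_r)
    os.close(c2p_w)
    start = time.perf_counter()
    for _ in range(rounds):
        os.write(p2c_w, b"x")
        os.read(c2p_r, 1)
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    os.close(p2c_w)
    os.close(c2p_r)
    # Each round is two hops: parent -> child and child -> parent.
    return elapsed / (2 * rounds)
```

Calling pipe_pingpong() and multiplying by 1e6 gives microseconds per hop;
pinning both processes to one CPU would get it closer to what lat_ctx reports.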

Also, folks can use large pages to help with the MM load. That applies
regardless of whether it's a microkernel or a single-image kernel. You still
pay memory latency to get at the PTE, but that's life.  Microkernels tend to
oversimplify a lot of aspects of kernel programming, which is why it's a
dangerous approach.

There's no expression of memory-latency-aware synchronization techniques
like RCU, per-CPU slab allocators, etc... all things that are needed to address
performance problems as the system is tested in real-life MP scenarios. They
are either largely incomplete, misused or missing in the BSDs as a whole.

The power of dfBSD is that you have a person like Matt who's given a lot
of consideration to these things. Although it's MK-like overall, the
expression of the system is not for the sake of an API but to support a
larger set of conceptual problems beyond the typical use of microkernels,
so as to support clustering in a first-class manner. The use of and approach
to the problem space are very different from other *BSD SMP developments.

Messages aren't just messages, but atomically executing entities in dfBSD.
When coupled with tokens, the MP reworking of the VFS layers to flatten the
locking hierarchy, etc... along with the scheduler being overloaded in a
non-traditional manner to IPI processes for migrations... it permits a
very coherent concurrency system that's cleanly compartmentalized for
clustering. That's the power of Matt's vision. It's never the parts but the
system as a whole that makes it interesting. :)

> >    One then winds up in a situation where one must hack the code to pieces
> >    to make it efficient... to reduce the number of context switches that
> >    occur.  For example, a number of people have advocated that the TCP
> >    stack be moved to userland.  To my mind this is *NOT* micro-kernelish,
> >    as one then has no protection between the userland application and the
> >    networking stack.  Shifting the work around without introducing new
> >    protection realms is NOT a microkernel architecture.  It offers no
> >    additional reliability or debuggability to the system, and makes the
> >    code such a huge mess that it becomes unmaintainable.
> 
> I don't agree with that either, but I do like a network stack as a 
> server in userland, but of course in its own protection domain; Given 
> efficient sharing of memory between protection domains so you don't have 
> to copy data and a fast context switch, I think the little loss in 
> performance is outweighed by the advantages of live subsystem 
> upgrade/replacement, more defined protection between subsystems and the 
> other usual advantages of microkernel designs.

bill