lkwt in DragonFly

Jeroen Ruigrok/asmodai asmodai at wxs.nl
Tue Feb 10 22:11:14 PST 2004


-On [20040210 11:12], Miguel Mendez (flynn at xxxxxxxxxxxxxxxxxx) wrote:
>Jeroen Ruigrok/asmodai wrote:
>
>>I prefer to go with a hybrid method.  1:1 doesn't work.
>
>Could you elaborate on that? I agree that the preferred implementation
>is something KSE/SA-like, but Solaris is moving to 1:1 after years of 
>trying to make N:M work properly. The main problem I see with 1:1 
>threading is kernel memory usage on heavily threaded applications, but 
>other than that? Implementation is a lot simpler than in the N:M case. 
>The SA idea looks very good on paper until you start implementing and 
>see it's actually pretty hard work to get it functioning properly.

Kernel thread-based pthread implementation (1:1):

pros:

- threads compete against each other for a share of the scheduler's
  time quanta, so a single nice will only target one thread

- multiple threads in one program can run on different CPUs

cons:

- creating a new kernel thread has overhead of its own, namely the
  system call plus kernel data structure maintenance; on a uniprocessor
  this is waste you do not wish to have (a timing sketch follows this
  list)

- so with applications/programs that by default spawn 10-100 threads
  (Apache with its worker model?) you will notice system/performance
  degradation
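
To make the creation-overhead point a bit more concrete, here is a
minimal sketch (my own illustration, not DragonFly code) that only
times pthread_create()/pthread_join() pairs; under a 1:1 library each
iteration also pays for creating a kernel thread and its kernel data
structures.  The thread count of 100 is an arbitrary choice and error
checking is omitted for brevity:

#include <pthread.h>
#include <stdio.h>
#include <sys/time.h>

#define NTHREADS 100

static void *worker(void *arg)
{
    return arg;             /* do nothing; we only measure creation cost */
}

int main(void)
{
    pthread_t tid[NTHREADS];
    struct timeval t0, t1;
    int i;

    gettimeofday(&t0, NULL);
    for (i = 0; i < NTHREADS; i++)
        pthread_create(&tid[i], NULL, worker, NULL);
    for (i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);
    gettimeofday(&t1, NULL);

    printf("%d create+join pairs took %ld microseconds\n", NTHREADS,
        (long)(t1.tv_sec - t0.tv_sec) * 1000000L +
        (t1.tv_usec - t0.tv_usec));
    return 0;
}

Compile with -pthread; running it against a 1:1 and an N:1 library on
the same machine shows the difference.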

User-space thread-based pthread implementation (N:1):

pros:

- everything lives in a library and is thus quickly available for
  developers/customers, since no kernel-side changes need to be made

- no system calls and no kernel context switches are needed to switch
  between threads, which allows some (note: some) types of multithreaded
  applications to run faster than under a kernel-thread implementation
  (mostly uniprocessor and non-CPU-intensive apps); see the
  context-switch sketch after this list

- threads can be created quickly and have no impact on the kernel, so
  this scales well (since all threads in program A share process A's
  time quanta)

cons:

- all threads within a single process compete for a portion of the time
  quanta allocated to process A; nicing a thread only makes it consume
  more of process A's quanta and does not let it compete for more CPU
  against other processes unless you nice process A in its entirety (so
  no 'real-time' threads)

- cannot take advantage of multiple CPUs
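
As an aside on the "no kernel context switch" point above: this is
roughly what an N:1 library does internally, sketched here with the
portable (if obsolescent) ucontext API.  Real thread libraries use
lighter-weight switch code, and on some systems swapcontext() itself
touches the signal mask via a system call, so treat this purely as an
illustration:

#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t main_ctx, thr_ctx;

static void user_thread(void)
{
    printf("user thread: running, no kernel thread of my own\n");
    swapcontext(&thr_ctx, &main_ctx);   /* "yield" back to the scheduler */
    printf("user thread: resumed, finishing\n");
    /* falling off the end continues in main_ctx via uc_link */
}

int main(void)
{
    char *stack = malloc(64 * 1024);

    getcontext(&thr_ctx);
    thr_ctx.uc_stack.ss_sp = stack;
    thr_ctx.uc_stack.ss_size = 64 * 1024;
    thr_ctx.uc_link = &main_ctx;        /* where to go when the thread ends */
    makecontext(&thr_ctx, user_thread, 0);

    printf("scheduler: switching to user thread\n");
    swapcontext(&main_ctx, &thr_ctx);   /* user-space switch, no kernel
                                           scheduling decision involved */
    printf("scheduler: back, resuming user thread\n");
    swapcontext(&main_ctx, &thr_ctx);
    printf("scheduler: user thread finished\n");
    free(stack);
    return 0;
}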

Two-level scheduler pthread implementation (M:N):

user threads are mapped onto kernel threads taken from a pool of kernel
threads; there is no fixed relationship between a user thread and a
kernel thread, and a user thread can in fact be reallocated to another
(free) kernel thread in the future

you can dump threads that mostly wait on I/O, sleep on timers or wait
for events together into one kernel thread

for CPU-heavy threads you can assign a 1:1 mapping to take advantage of
loading the CPUs with threads that will keep them busy (Digital Unix
actually detected changes in a thread's behaviour)

not all threads within a single process are bound to the process's
execution context, thereby allowing threads to span multiple CPUs

The internal complexity is the only thing against it.
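
For what it is worth, POSIX already exposes the bound/unbound choice of
such a two-level scheduler through contention scopes.  A minimal sketch
of that API (not every implementation supports both scopes, hence the
return value checks; the worker function is just a placeholder):

#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg)
{
    (void)arg;
    return NULL;
}

int main(void)
{
    pthread_attr_t attr;
    pthread_t tid;

    pthread_attr_init(&attr);

    /* "unbound": multiplexed onto the library's pool of kernel threads */
    if (pthread_attr_setscope(&attr, PTHREAD_SCOPE_PROCESS) != 0)
        printf("process contention scope not supported here\n");
    else if (pthread_create(&tid, &attr, worker, NULL) == 0)
        pthread_join(tid, NULL);

    /* "bound": a CPU-hungry thread gets its own kernel thread (1:1) */
    if (pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM) != 0)
        printf("system contention scope not supported here\n");
    else if (pthread_create(&tid, &attr, worker, NULL) == 0)
        pthread_join(tid, NULL);

    pthread_attr_destroy(&attr);
    return 0;
}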

-- 
Jeroen Ruigrok van der Werven <asmodai(at)wxs.nl> / asmodai / kita no mono
PGP fingerprint: 2D92 980E 45FE 2C28 9DB7  9D88 97E6 839B 2EAC 625B
http://www.tendra.org/   | http://diary.in-nomine.org/
Into each life some rain must fall, some days must be dark and dreary...




