Call for Developers! Userland threading

Julian Elischer julian at elischer.org
Tue Jul 22 15:33:11 PDT 2003



Matthew Dillon wrote:
    Ok, this is an official call for developers to begin working on userland
    threading.  I've come up with a timetable and infrastructure that
    should be sufficient for those developers interested in the work to 
    actually begin working!  I would like to hand-out some commit bits for
    those doing the work, and I would like to find someone to head up this
    sub-project (i.e. not me, I will be focusing on the kernel-side support).

    I was kinda thinking of Peter da Silva or Jeffrey Hsu to lead this effort,
    but I don't know what kind of time commitment people have.

    Here's the infrastructure idea I've come up with:

	* We throw away libc_r.  Er, that is, we keep it around but base all
	  new development on a copy of the original libc.  We would call
	  it, oh, libcr (as a pun on libc_r).  It wouldn't be an 'alias' 
	  of libc like libc_r is, it would be an actual physical copy of
	  libc.

	  When all is said and done, several months from now, the new libcr
	  will *become* our libc, i.e. it will be responsible for both 
	  non-threaded and threaded programs.  Don't worry about non-threaded
	  overhead, it won't be that big a deal because LWKTs can be made
	  quite optimal in non-threaded environments.

	* We (temporarily) throw away POSIX compatibility.  I believe that
	  the userland threading implementation should be based around LWKTs
	  and LWKT messaging - i.e. a direct port of the LWKT modules now
	  in the kernel.  The problem with trying to maintain the POSIX
	  infrastructure is that the signal handling will bog down the
	  development.  I believe the signal handling can be dealt with with
	  supporting kernel infrastructure that does not yet exist.  So for
	  now we throw away POSIX.  Later on we will re-implement it.

I think you are making a mistake here. Your conclusion that what is being
done in FreeBSD 5 is incompatible with what you are doing here is wrong.
You yourself said that async syscalls with upcalls would be easy to do.

"KSE" (misnamed) threading uses two 'orthogonal' approaches.
1/ All syscalls can be made async through callbacks. This allows even a
single-CPU machine to run multiple threads. I believe this approach is still
optimal for allowing more than one thread to run on a single CPU.

2/ Adding virtual CPUs to the 'group' allows more than one execution unit to
be used by the userland scheduler at the same time.
This is basically what you call "rfork", except that rather than
having sub-structures such as file descriptor tables shared by multiple
'processes' as in the rfork method, there is a single structure (the proc
struct) that acts as a rendezvous point for all threads in the same process.

The 'rfork' method is in the end more complicated than having a single
process structure. For example, you need to store the 'pid' somewhere,
but if you rfork multiple proc structs then you have to set up a hierarchy
to find the correct pid to return. Then when you start removing process
structures, you have to make sure that the hierarchy is correctly
contracted.
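
The difference in pid lookup between the two layouts can be sketched roughly
as follows. This is only an illustration: struct rproc, struct sproc,
struct sthread and both getpid helpers are hypothetical names invented here,
not structures from FreeBSD 5 or from the proposed system.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical rfork-style layout: every execution context is its own
 * process structure, linked to the one it was rforked from.
 */
struct rproc {
    struct rproc *leader;   /* NULL for the original process */
    int           pid;      /* only meaningful in the original process */
};

/* Walk up the hierarchy to find the pid the program should see. */
static int rproc_getpid(const struct rproc *p)
{
    assert(p != NULL);
    while (p->leader != NULL)
        p = p->leader;
    return p->pid;
}

/*
 * Hypothetical single-proc layout: all threads point at one shared
 * process structure, which acts as the rendezvous point.
 */
struct sproc {
    int pid;
    int nthreads;
};

struct sthread {
    struct sproc *proc;
};

/* No hierarchy to walk: one pointer dereference. */
static int sthread_getpid(const struct sthread *t)
{
    assert(t != NULL && t->proc != NULL);
    return t->proc->pid;
}
```

In the first layout, removing a structure from the middle of the chain means
every child's leader pointer has to be re-linked; in the second there is
nothing to contract.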



	  Direct LWKT port but: maybe rename struct thread to struct
	  userthread?
	  I will happily provide the userland assembly bits for I386 for
	  the initial entry and switching functions for LWKTs.  They're
	  really easy to do, basically just pushal, stack switch, popal, ret.
	* We (temporarily) build the new system call emulation layer into
	  libcr with an eye towards eventually separating it out into its own
	  independent library.

	  This new layer is very simple in concept.  Basically you will begin
	  implementing system calls which convert to messages.  For example,
	  in libcr read() would be:
	  ssize_t
	  read(int fd, void *buf, size_t nbytes)
	  {
	      syscall_any_msg_t msg;
	      int error;
	      /*
	       * Use the convenient mostly pre-built message stored in the
	       * userthread structure
	       */
	      msg = &curthread->td_sysmsg;
	      msg->fd = fd;
	      msg->buf = buf;
	      msg->nbytes = nbytes;
	      error = lwkt_domsg(&syscall_port, msg);
	      curthread->td_errno = error;
	      if (error)
		  msg->result = -1;
	      return(msg->result);
	  }

This is a synchronous syscall. Async syscalls are much more interesting.
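
For contrast, an asynchronous variant might look something like the sketch
below: the send returns at once and the caller picks up the result later.
Everything here (struct amsg, struct aport, aport_send, aport_complete_one)
is a hypothetical userland stand-in, not the LWKT API; the "kernel" finishing
a request is simulated by an ordinary function call.

```c
#include <assert.h>
#include <stddef.h>

enum amsg_state { AMSG_IDLE, AMSG_PENDING, AMSG_DONE };

/* Hypothetical async message: carries its own completion state. */
struct amsg {
    enum amsg_state state;
    long            result;
    struct amsg    *next;
};

/* Hypothetical port: a FIFO of requests that have not completed yet. */
struct aport {
    struct amsg *head;
    struct amsg *tail;
};

/* Queue the request and return immediately; the caller keeps running. */
static void aport_send(struct aport *port, struct amsg *msg)
{
    assert(msg->state != AMSG_PENDING);
    msg->state = AMSG_PENDING;
    msg->next = NULL;
    if (port->tail != NULL)
        port->tail->next = msg;
    else
        port->head = msg;
    port->tail = msg;
}

/*
 * Stand-in for the kernel finishing the oldest request. Returns the
 * completed message, or NULL if nothing was pending.
 */
static struct amsg *aport_complete_one(struct aport *port, long result)
{
    struct amsg *msg = port->head;

    if (msg == NULL)
        return NULL;
    port->head = msg->next;
    if (port->head == NULL)
        port->tail = NULL;
    msg->next = NULL;
    msg->result = result;
    msg->state = AMSG_DONE;
    return msg;
}
```

A userland scheduler could switch to another thread while a message is in
the AMSG_PENDING state and resume the sender once it reaches AMSG_DONE.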

	  The actual int 0x80 would be done by syscall_port's beginmsg
	  function (it would point to a bit of assembly).  And, yes, that
	  means you can theoretically shim the syscall port if you want
	  (mantra fodder: flexibility!), and it also means that errno
	  handling is done in userland (more mantra fodder: flexibility!).
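
As a rough illustration of the shimming idea, the sketch below models a port
as a structure holding a beginmsg function pointer, with a counting shim
placed in front of the "real" port. All names (struct port, struct sysmsg,
domsg, fake_trap_beginmsg, counting_shim_beginmsg, wrapped_call) are
assumptions made for this example, and fake_trap_beginmsg merely stands in
for the assembly stub that would issue int 0x80.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified syscall message. */
struct sysmsg {
    int  fd;
    long result;
};

/* Hypothetical port: beginmsg decides how a message is processed. */
struct port {
    int (*beginmsg)(struct port *port, struct sysmsg *msg);
    struct port *next;   /* port a shim forwards to, if any */
    int          calls;  /* bookkeeping used by the shim below */
};

/* Send a message through whatever beginmsg the port carries. */
static int domsg(struct port *port, struct sysmsg *msg)
{
    assert(port != NULL && port->beginmsg != NULL);
    return port->beginmsg(port, msg);
}

/* Stand-in for the int 0x80 stub: rejects negative descriptors. */
static int fake_trap_beginmsg(struct port *port, struct sysmsg *msg)
{
    (void)port;
    if (msg->fd < 0)
        return EBADF;
    msg->result = 0;
    return 0;
}

/* A shim: counts calls, then forwards to the next port unchanged. */
static int counting_shim_beginmsg(struct port *port, struct sysmsg *msg)
{
    port->calls++;
    return domsg(port->next, msg);
}

/* errno is set here, in userland, from the error domsg returns. */
static long wrapped_call(struct port *port, int *td_errno, int fd)
{
    struct sysmsg msg = { fd, 0 };
    int error = domsg(port, &msg);

    *td_errno = error;
    if (error)
        msg.result = -1;
    return msg.result;
}
```

The wrapper never needs to know whether it is talking to the shim or to the
underlying port, which is the flexibility being claimed.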
    I think this would be a great project for developers to really sink their
    teeth into, because there is so much to do it can be worked on by 
    several people in parallel, and because the breakages will not affect
    the stability of the development environment, and for all the reasons
    above it means I can start handing out commit bits. 

    I would like to find one developer to act as the head-honcho for the
    userland work, and any number of developers to work on the pieces.  The
    piecemeal work is:
	* Messaging for individual syscalls (i.e. each system call, like
	  read() above, needs to be coded for the messaging interface).
	* The LWKT threading port (I can help with the assembly bits).

	* Implementation of the per-cpu-area abstraction (becomes per-rfork).

Why throw away everything that has been learned so far?

	* (later on) Use %fs or %gs (kernel-supported?) to aid in access 
	  to per-cpu areas?  Anyone have any ideas here?  It isn't necessary
	  for the initial threading work.

It MUST be %gs on x86; the ELF TLS spec mandates it.
It must also describe an LDT entry that points to a POINTER to the thread
control block,
i.e. movl %gs:0, %eax
loads the address of the thread control block into %eax.

For OTHER architectures the thread pointer is held in a normal register and
points directly to the thread control block.
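
The two access patterns can be modelled in portable C, with the segment base
represented as a plain pointer. This is only a sketch: struct tcb, its
fields, and both helper names are hypothetical, and no real segment register
is touched.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical thread control block. The word at offset 0 holds a
 * pointer to the block itself, so a load from offset 0 of the segment
 * yields the block's address.
 */
struct tcb {
    struct tcb *self;
    int         td_errno;
};

/*
 * x86-style access: models "movl %gs:0, %eax". seg_base is the address
 * the segment descriptor points at; the word stored there is the
 * thread pointer.
 */
static struct tcb *tp_from_segment(const void *seg_base)
{
    struct tcb *const *slot = (struct tcb *const *)seg_base;

    assert(slot != NULL);
    return *slot;
}

/*
 * Other architectures: a register already holds the address of the
 * thread control block, so no extra load is needed.
 */
static struct tcb *tp_from_register(struct tcb *thread_reg)
{
    return thread_reg;
}
```

The extra load on x86 exists because a segment-relative access can read
memory at %gs:0 but cannot directly yield the segment's base address.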

	* (later on) Thread migration between rforks (i.e. more sophisticated
	  scheduling actions).

Your decision to rfork multiple process structures is, I believe, misguided.
You have the opportunity to change everything. Why stick with old thinking?

	* (later on) development of a kernel-supported signal infrastructure
	  for proper POSIX signal handling.

Check what David Xu has done for POSIX thread support.
We treat the upcalls as an upgoing message interface. It should just get
simpler if moved to your system.


    I should have a basic syscall messaging syscall working this week,
    even though it will initially operate synchronously.
						-Matt






