You could do worse than Mach ports

Matthew Dillon dillon at apollo.backplane.com
Fri Jul 18 12:03:41 PDT 2003


:>     If the user process is manipulating a *LOT* of data then a
:>     double-buffered approach might be the way to go, where the user
:>     process always has access to a kernel-provided writable buffer and
:>     when it stages it into the kernel the kernel replaces the user VM
:>     page with a new buffer (making the old one now in the sole domain
:>     of the kernel).
:
:So the kernel would swap out the buffer that was previously owned in
:user space to its own ownership and replace that buffer with something
:of equal size...
:
:Are you saying that the puts and the gets would go to the kernel buffer
:only... until the process that caused the buffer-swap tells the kernel
:it wants its buffer back?

    I do not think that there is any one 'correct' way to do it.  There
    are lots of possibilities.

    For example, you could use two FIFOs, like this:


			+---+---+---+
		 <====	| 1 | 2 | 3 | <====
			+---+---+---+
    PROCESS
			+---+---+---+
		 ====>	| 6 | 5 | 4 | ====>
			+---+---+---+

    Where the PROCESS would be able to R+W the pages but the kernel would
    pipeline them into and out of the kernel.  So, take #4 for example.
    The user process builds the page and sends it off to the kernel.  The
    kernel replaces page #4 in the user process address space with a
    fresh page (and in the meantime the user process might be building
    page #5).  The original #4 is now solely in the domain of the kernel,
    which processes it, etc...  To return the page to the user process the
    kernel maps it into the process's incoming FIFO (1, 2, 3), e.g.:

			+---+---+---+
		 <====	| 4 | 2 | 3 | <====
			+---+---+---+
			    ^
			    user process has finished processing page 1 and
			    the kernel knows it is available for the next
			    FIFO slot, so the kernel replaces it with page 4.

    In this sort of scheme the user process can maintain a pipeline to and
    from the kernel.  Everything mapped to the user address space is 
    read-write, but the kernel physically steals the pages (thus making
    them unavailable to the user process) for the duration of the 
    dispatched operation.

    I am not advocating this type of scheme for DragonFly because it is
    *very* expensive and has a high kernel virtual memory (KVM) cost,
    but in a dedicated embedded system like a router that is processing
    massive amounts of data where you also want some sort of MMU based
    protection, this scheme could be made very efficient.
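
    To make the flow concrete, here is a rough sketch of what the user
    side of such a pipeline might look like.  Everything in it is
    hypothetical: the /dev/pagepipe device, the PAGEPIPE_* ioctls and
    their semantics are invented purely to illustrate the double-buffered
    FIFO idea, not an interface any real kernel actually provides.

	/*
	 * Hypothetical user-side loop for the double-buffered FIFO scheme.
	 * PAGEPIPE_SUBMIT hands the current page to the kernel's outgoing
	 * FIFO and replaces it, in place, with a fresh writable page.
	 * PAGEPIPE_RECLAIM pulls the next completed page off the incoming
	 * FIFO.  None of these names exist in a real kernel.
	 */
	#include <string.h>
	#include <fcntl.h>
	#include <unistd.h>
	#include <sys/ioctl.h>

	#define PAGEPIPE_SUBMIT	 _IOWR('P', 1, void *)	/* give page, get fresh one */
	#define PAGEPIPE_RECLAIM _IOWR('P', 2, void *)	/* next completed page */

	int
	main(void)
	{
	    int fd = open("/dev/pagepipe", O_RDWR);	/* hypothetical device */
	    void *page;

	    if (fd < 0 || ioctl(fd, PAGEPIPE_RECLAIM, &page) < 0)
		return 1;

	    for (;;) {
		/* We own the page read-write while it is mapped to us. */
		memset(page, 0, 4096);

		/*
		 * Submit it.  The kernel steals the physical page (it is now
		 * solely the kernel's) and maps a fresh page at the same spot,
		 * so we can immediately start building the next one.
		 */
		if (ioctl(fd, PAGEPIPE_SUBMIT, &page) < 0)
		    break;

		/* Completed pages come back on the incoming FIFO. */
		void *done;
		if (ioctl(fd, PAGEPIPE_RECLAIM, &done) == 0) {
		    /* consume the results in 'done', then reuse the slot */
		}
	    }
	    close(fd);
	    return 0;
	}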

:Well I hadn't thought it out that deeply... but MPI doesn't say the user
:"can't" write to the pages during an access epoch.  The standard does say
:that if you do you completely invalidate all guarantees for the
:consistency of the buffer.  I would be comfortable with that in an IPC
:system for local processes as well.

    Ah.  When they say that, what they mean is that they are not bothering
    to actually write-protect the page (which can be expensive in a
    high-volume environment); instead they are depending on you to honor
    the read-only/read-write API specification and not mess with the data
    at the wrong time.
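
    For reference, this is roughly what such an access epoch looks like
    with MPI one-sided operations (a minimal sketch, no error handling).
    Between the two fences the standard tells you to keep your hands off
    the window buffer, but nothing physically write-protects it:

	#include <mpi.h>

	int
	main(int argc, char **argv)
	{
	    MPI_Win win;
	    double  winbuf[1024];

	    MPI_Init(&argc, &argv);

	    /* Expose winbuf to remote MPI_Put()/MPI_Get() operations. */
	    MPI_Win_create(winbuf, sizeof(winbuf), sizeof(double),
			   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

	    MPI_Win_fence(0, win);	/* epoch begins: hands off winbuf   */
	    /*
	     * Remote ranks may be reading or writing winbuf right now.
	     * Touching it here would void the consistency guarantees,
	     * but the pages are still mapped read-write.
	     */
	    MPI_Win_fence(0, win);	/* epoch ends: winbuf is ours again */

	    MPI_Win_free(&win);
	    MPI_Finalize();
	    return 0;
	}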

:>     exceptional circumstances that are outside the critical path, such
:>     as in critical low memory situations.
:
:But if a port agent blocks perhaps the whole process isn't active
:anymore... to the user a non-blocking appearance could be maintained
:[as long as one doesn't hang the kernel trying to achieve it].
:
:Thanks again Matt,
:
:Dave

    Yes, the actual behavior and the perceived behavior are two different
    things.  The appropriate internalized (within the kernel) behavior
    would depend on the circumstances.
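
    As a rough illustration of that perceived-versus-actual distinction,
    a port send could be structured something like the sketch below.  The
    names and types are invented for illustration, not the real messaging
    API; the point is only that the caller always gets an immediate return
    even when the agent servicing the port ends up blocking internally:

	/* All names here are hypothetical. */
	struct msg;
	struct port {
	    int  (*agent_try)(struct port *p, struct msg *m); /* 0 = done now */
	    void (*enqueue)(struct port *p, struct msg *m);   /* hand to thread */
	};

	struct msg {
	    struct port *reply_port;	/* completed messages come back here */
	    int		 cmd;
	    int		 error;
	};

	enum { SENDMSG_DONE = 0, SENDMSG_QUEUED = 1 };

	/*
	 * Never blocks from the caller's point of view: the request either
	 * completes synchronously or is queued and the reply shows up later
	 * on msg->reply_port.  Whether the service thread blocks internally
	 * (low memory, I/O, ...) is invisible to the caller.
	 */
	static int
	port_sendmsg(struct port *port, struct msg *msg)
	{
	    if (port->agent_try(port, msg) == 0)
		return SENDMSG_DONE;	/* fast path: finished in-line */

	    port->enqueue(port, msg);	/* slow path: agent may block later */
	    return SENDMSG_QUEUED;	/* caller returns immediately */
	}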

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>




