New brainfart for threaded VFS and data passing between threads.

Bill Huey (hui) billh at gnuppy.monkey.org
Wed Mar 31 02:29:30 PST 2004


On Tue, Mar 30, 2004 at 02:58:13PM -0800, Matthew Dillon wrote:
>     * We can use the new XIO interface for all block data references from
>       userland and get rid of the whole UIO_USERSPACE / UIO_SYSSPACE mess.
>       (I'm gunning to get rid of UIO entirely, in fact).
> 
>     * We can use the new XIO interface for the entire I/O path all the way
>       down to busdma, yet still retain the option to map the data if/when
>       we need to.  I never liked the BIO code in FreeBSD-5, this new XIO
>       concept is far superior and will solve the problem neatly in DragonFly.

This is going to help tremendously with the userspace VFS stuff you're planning
for the future. With a single clean API across subsystems, it also practically
eliminates the worries about copying overhead and programming complexity that
folks might have. It's a big complexity win overall, if I'm understanding you
correctly.
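
To make "pass references, map or copy only when needed" concrete, here is a
rough sketch in C. It is not the XIO API (which, per the above, hasn't been
written yet); every name in it (xio_sketch, page_sketch, SKETCH_PAGE_SIZE and
so on) is made up for illustration, and page_sketch is only a stand-in for
vm_page_t. The descriptor holds page references, and bytes are touched only in
the final consumer's copyout.

```c
/* Hypothetical sketch only: none of these names come from DragonFly. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096
#define SKETCH_MAX_PAGES 8

struct page_sketch {                    /* stand-in for vm_page_t */
    unsigned char data[SKETCH_PAGE_SIZE];
    int           hold_count;           /* descriptors currently holding the page */
};

struct xio_sketch {                     /* a buffer described as page references */
    struct page_sketch *pages[SKETCH_MAX_PAGES];
    int    npages;
    size_t offset;                      /* byte offset into the first page */
    size_t bytes;                       /* number of valid bytes described */
};

/* Describe an existing run of pages. No data is copied; pages are only held. */
static void
xio_sketch_init(struct xio_sketch *xio, struct page_sketch **pages,
                int npages, size_t offset, size_t bytes)
{
    int i;

    assert(npages > 0 && npages <= SKETCH_MAX_PAGES);
    assert(offset < SKETCH_PAGE_SIZE);
    assert(offset + bytes <= (size_t)npages * SKETCH_PAGE_SIZE);
    for (i = 0; i < npages; i++) {
        xio->pages[i] = pages[i];
        pages[i]->hold_count++;
    }
    xio->npages = npages;
    xio->offset = offset;
    xio->bytes = bytes;
}

/* Drop the holds taken by xio_sketch_init(). */
static void
xio_sketch_release(struct xio_sketch *xio)
{
    int i;

    for (i = 0; i < xio->npages; i++)
        xio->pages[i]->hold_count--;
    xio->npages = 0;
    xio->bytes = 0;
}

/* The only place bytes move: the final consumer copies out what it needs. */
static size_t
xio_sketch_copyout(const struct xio_sketch *xio, void *dst, size_t len)
{
    size_t done = 0;
    size_t pos = xio->offset;

    if (len > xio->bytes)
        len = xio->bytes;
    while (done < len) {
        size_t pg = pos / SKETCH_PAGE_SIZE;
        size_t off = pos % SKETCH_PAGE_SIZE;
        size_t n = SKETCH_PAGE_SIZE - off;

        if (n > len - done)
            n = len - done;
        memcpy((unsigned char *)dst + done, xio->pages[pg]->data + off, n);
        done += n;
        pos += n;
    }
    return done;
}
```

Every layer in between would hand the small descriptor along instead of the
data, which is where I'd expect the copying overhead to disappear.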

>     * We can eventually use XIO and SF_BUF's to codify copy-on-write at
>       the vm_page_t level and no longer stall memory modifications to I/O
>       buffers during I/O writes.
> 
>     * I will be able to use XIO for our message passing IPC (our CAPS code),
>       making it much, much faster than it currently is.  I may do that as
>       a second step to prove-out the first step (which is for me to create
>       the XIO API).

More of the same. It's akin to the late-binding style of dynamic
message-passing programming languages. By deferring decisions to runtime, you
can simplify the logic, structures and cross-component relationships that would
otherwise have to be defined precisely at compile time just so that data can be
packed up and sent somewhere. Binding late lets you eliminate a lot of this
"packing/packaging". I know this is a bit abstract, and my terms are from
another discipline, but that's how I view some of this. Please correct me if
I'm wrong.
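
For the copy-on-write bullet quoted above, this is roughly how I picture it,
again as a hypothetical C sketch and not anything from the tree: cow_page,
cow_hold and friends are invented names, and a plain reference count stands in
for whatever vm_page_t and SF_BUF would really do. The point is only that a
writer never stalls on an in-flight I/O; if the page is shared, the writer
gets a private copy and the I/O keeps the original.

```c
/* Hypothetical sketch only: none of these names come from DragonFly. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define COW_PAGE_SIZE 4096

struct cow_page {
    unsigned char data[COW_PAGE_SIZE];
    int           refs;         /* the owner plus any in-flight I/O */
};

/* Allocate a zeroed page owned by the caller (one reference). */
static struct cow_page *
cow_alloc(void)
{
    struct cow_page *pg = calloc(1, sizeof(*pg));

    if (pg != NULL)
        pg->refs = 1;
    return pg;
}

/* An I/O path takes a reference instead of copying or locking the page. */
static struct cow_page *
cow_hold(struct cow_page *pg)
{
    pg->refs++;
    return pg;
}

/* Drop one reference; the last holder frees the page. */
static void
cow_drop(struct cow_page *pg)
{
    assert(pg->refs > 0);
    if (--pg->refs == 0)
        free(pg);
}

/*
 * Called by a writer before it modifies the page.  If nobody else holds
 * the page it is returned as-is.  Otherwise the writer trades its reference
 * for a private copy, and the other holders keep the unmodified original.
 * Returns NULL if the copy cannot be allocated.
 */
static struct cow_page *
cow_write_prepare(struct cow_page *pg)
{
    struct cow_page *copy;

    if (pg->refs == 1)
        return pg;
    copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return NULL;
    memcpy(copy->data, pg->data, COW_PAGE_SIZE);
    copy->refs = 1;
    pg->refs--;                 /* writer gives up its hold on the original */
    return copy;
}
```

The writer pays for a page copy only when an I/O actually overlaps the
modification; otherwise cow_write_prepare() is just a reference-count check.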

>     * Once we have vm_page_t copy-on-write we can recode zero-copy TCP
>       to use XIO, and it won't be a hack any more.
> 
>     * XIO fits perfectly into the eventual pie-in-the-sky goal of
>       implementing SSI/Clustering, because it means we can pass data
>       references (vm_page_t equivalents) between machines instead of 
>       passing the data itself, and only actually copy the data across
>       on the final target.  e.g. if on an SSI system you were to do
>       'cp file1 file2', and both file1 and file2 are on the same filesystem,
>       the actual *data* transfer might only occur on the machine housing
>       the physical filesystem and not on the machine doing the 'cp'.  Not
>       one byte.  Can you imagine how fast that would be?

>     And many other things.  XIO is the nutcracker, and the nut is virtually
>     all the remaining big-ticket items we need to cover DragonFly.

Yep, it localizes all your kernel operations to the local processor, fits
perfectly with your CPU-ownership token stuff, and gives a clean, coherent
API for pervasive kernel component communication, both locally and remotely.
It's a really neat idea with many "secondary effects" (regarding kernel
structures). I hope it works out for you in your implementation. :) All of
this is pretty abstract stuff.

I'm done now. :)

bill





