(userspace) vfs and xio [CAPS?]
Dave Leimbach
leimySPAM2k at mac.com
Tue May 25 12:20:13 PDT 2004
Matthew Dillon <dillon at xxxxxxxxxxxxxxxxxxxx> writes:
> :This brings me back to wondering about CAPS and its 128K single message limits.
> :
> :Is it expected that the user of CAPS would manually break his/her messages into
> :<= 128K chunks and then send?
> :
> :
> :
> :>
> :> But because cross-address-space accesses are so expensive, it will
> :> probably be more efficient to break a large UIO into an XIO loop
> :> and take the thread switching hit in the loop rather than to try to pull
> :> out a portion of a foreign address space. Besides, physical I/O
> :> is already limited to 128KB/256KB chunks so the extra thread switches
> :> will not be that big an issue.
> :>
> :
> :Well yeah... like that for CAPS too? Or will CAPS include this loop in its
> :implementation?
> :
> :Seems like a concern that CAPS users shouldn't have to deal with unless it's
> :terribly inefficient to implement the loop in CAPS.
> :
> :Dave
>
> I agree completely. That limit is temporary... it was the easiest way
> to rip out the old cross-address-space junk and use XIO instead. The
> code needs another pass to add the transfer loop (which is really just
> another one or two states for the message).
>
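As a rough illustration of the transfer loop discussed above, here is a minimal sketch in C. It assumes a hypothetical per-chunk send callback (chunk_send_fn) and a helper (count_bytes) invented for this example; neither is part of the real CAPS API. Only the 128K limit comes from the thread.

```c
#include <stddef.h>

/* Per-message limit mentioned in the thread. */
#define CHUNK_MAX (128 * 1024)

/*
 * Hypothetical callback standing in for one bounded send.
 * This is NOT the real CAPS interface, just a placeholder.
 * It returns 0 on success and non-zero on failure.
 */
typedef int (*chunk_send_fn)(const void *buf, size_t len, void *ctx);

/*
 * Break a large buffer into pieces of at most CHUNK_MAX bytes and
 * hand each piece to send_one in order.
 * Returns the number of chunks sent, or -1 if a send fails.
 */
static int
send_chunked(const void *buf, size_t len, chunk_send_fn send_one, void *ctx)
{
    const char *p = buf;
    int chunks = 0;

    while (len > 0) {
        size_t n = len < CHUNK_MAX ? len : CHUNK_MAX;

        if (send_one(p, n, ctx) != 0)
            return -1;
        p += n;
        len -= n;
        chunks++;
    }
    return chunks;
}

/* Example callback (hypothetical): adds each chunk length to a size_t total. */
static int
count_bytes(const void *buf, size_t len, void *ctx)
{
    (void)buf;
    *(size_t *)ctx += len;
    return 0;
}
```

With a 300K buffer, send_chunked() would call the callback three times (128K, 128K, 44K). Whether such a loop belongs in the library or in each caller is exactly the question raised above.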
In Portals [an RDMA-style "put/get" message passing system] you get "start"
and "finish" events on an event queue object when a put or get starts or
finishes. I think that would work nicely here as well.
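To make the start/finish idea concrete, here is a small sketch in C of a fixed-size event queue. Every name in it (xfer_event, event_queue, eq_post, eq_get) is hypothetical and made up for illustration; it is not the Portals API and not anything CAPS provides today.

```c
#include <stddef.h>

/* Hypothetical event types: one when a transfer starts, one when it ends. */
enum xfer_event_type { XFER_START, XFER_FINISH };

struct xfer_event {
    enum xfer_event_type type;
    int id;                     /* hypothetical transfer identifier */
};

#define EQ_SIZE 8               /* arbitrary capacity for this sketch */

/* Simple ring buffer of events. */
struct event_queue {
    struct xfer_event ev[EQ_SIZE];
    size_t head;                /* index of the oldest queued event */
    size_t count;               /* number of queued events */
};

/* Queue an event. Returns 0 on success, -1 if the queue is full. */
static int
eq_post(struct event_queue *eq, enum xfer_event_type type, int id)
{
    size_t tail;

    if (eq->count == EQ_SIZE)
        return -1;
    tail = (eq->head + eq->count) % EQ_SIZE;
    eq->ev[tail].type = type;
    eq->ev[tail].id = id;
    eq->count++;
    return 0;
}

/* Remove the oldest event into *out. Returns 0, or -1 if the queue is empty. */
static int
eq_get(struct event_queue *eq, struct xfer_event *out)
{
    if (eq->count == 0)
        return -1;
    *out = eq->ev[eq->head];
    eq->head = (eq->head + 1) % EQ_SIZE;
    eq->count--;
    return 0;
}
```

A sender would post XFER_START when a put or get begins and XFER_FINISH when it completes, and the consumer would drain events in order with eq_get().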
We don't really have a "queue" handle per se as Portals does, but we do have
the CAPS id, which could be viewed that way, I suppose.
I wonder if it would be possible to get cids into kqueue notification?
I believe the Portals API even specifies some non-contiguous message
passing via some iovec-like implementation in its latest specification.
For interested parties:
http://www.sandiaportals.org
I've used this API extensively in the last 3 years and it's not too shabby :).
It's being used as the underlying layer of the Lustre cluster file system as
well and is implemented in kernel space, IIRC, in Linux on top of a Network
Abstraction Layer.
Pretty neat stuff. It was designed with scalability in mind, as you don't
require a socket to address other processes... you just have to find a process's
pid and nid [process id and node id] to talk to it. In other words, you don't
need file descriptors for sockets to do communication.
I wouldn't mind taking a few passes at the CAPS stuff, perhaps sometime this
week, to see what I come up with. I am relocating all of next week but will
hopefully have my stuff back from the moving company within 8 days and would
then be able to resume.
Dave