UDP inpcbs and more (was Re: tcpcb etc)
Matthew Dillon
dillon at apollo.backplane.com
Fri Jun 6 12:32:51 PDT 2008
:First, UDP sockets can issue connect() multiple times (thus changing
:faddr, fport) and call bind to go from wildcard laddr to a specific one.
:When this happens, the inpcb must be moved to another cpu. This shouldn't
:be too hard to handle; just mark the old inpcb as BEING_DELETED and only
:delete it after inserting the new inpcb. Dropping a few packets is expected
:for UDP and this shouldn't happen very often anyway.
:....
:
:Then there is the more interesting issue of how to hash. As described above,
:the lport is the only field we can be sure is not a wildcard. Now consider
:a UDP (say DNS) server; such a server does not normally connect() so whatever
:hash function we choose, the inpcb is going to end up on one cpu. This is the
:cpu we would normally dispatch an incoming UDP packet to. The thing is, all
:datagrams for our UDP server will end up going through the same cpu. So our
:busy DNS server just can't scale: using only one protocol thread is going to
:be a bottleneck.
My personal opinion is that we should just hash on laddr/lport and not
worry about the very few applications that try to demux packets with
multiple threads from the same socket. At least not for now.
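
A minimal sketch of what hashing on laddr/lport could look like, for
illustration only. The function name, the mixing constant and the ncpus
parameter are all assumptions made up for this example, not existing
kernel code. It only shows that every datagram for a given local
address/port pair lands on the same protocol thread; it does not address
sockets bound to a wildcard laddr.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not actual kernel code): choose the cpu whose
 * protocol thread owns a UDP inpcb, using only the local address and
 * local port.  faddr/fport are deliberately ignored, so a later
 * connect() does not change the result.
 *
 * The caller must pass ncpus >= 1.
 */
static inline unsigned int
udp_laddr_lport_cpu(uint32_t laddr, uint16_t lport, unsigned int ncpus)
{
	uint32_t h;

	/* Fold the port into both halves of the address. */
	h = laddr ^ ((uint32_t)lport << 16) ^ (uint32_t)lport;

	/* Cheap integer mixing so nearby ports spread across cpus. */
	h ^= h >> 16;
	h *= 0x45d9f3bU;
	h ^= h >> 16;

	return (h % ncpus);
}
```

With this kind of hash, the input path and the bind() path would both
call the same function with the same two fields, so an unconnected
server socket always resolves to one cpu.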
:option changing under it. However, our sockbuf can't handle concurrent
:accesses, so we'd have to have multiple sockbufs (one per cpu) and then
:the socket layer would have to pull data from all of them (probably in a
:round-robin fashion). UDP does not guarantee in-order delivery but, since
:in-order is typically the case, I'm not sure how well the apps can handle
:it. On top of that we'd need to decide what to do about buffer size limits
:and whether the sockbufs should stay in struct socket.
:
:OK, this should get the discussion started :)
:
:Aggelos
Our sockbufs need a general SMP solution. I think a spinlock is probably
the best fit, given the concurrency.
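
As a rough illustration of the spinlock idea, here is a userland sketch
using C11 atomics. The toy_sockbuf structure and the toy_sb_* functions
are hypothetical names invented for this example; the real struct sockbuf
holds an mbuf chain and the kernel has its own spinlock primitives. The
sketch only tracks a byte count against a high-water mark.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket buffer: byte accounting only. */
struct toy_sockbuf {
	atomic_flag	lock;	/* spinlock protecting the fields below */
	size_t		cc;	/* bytes currently queued */
	size_t		hiwat;	/* maximum bytes allowed */
};

static void
toy_sb_init(struct toy_sockbuf *sb, size_t hiwat)
{
	atomic_flag_clear(&sb->lock);
	sb->cc = 0;
	sb->hiwat = hiwat;
}

static void
toy_sb_lock(struct toy_sockbuf *sb)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&sb->lock,
	    memory_order_acquire))
		;
}

static void
toy_sb_unlock(struct toy_sockbuf *sb)
{
	atomic_flag_clear_explicit(&sb->lock, memory_order_release);
}

/* Account for len queued bytes; returns false if the buffer is full. */
static bool
toy_sb_append(struct toy_sockbuf *sb, size_t len)
{
	bool ok = false;

	toy_sb_lock(sb);
	if (len <= sb->hiwat - sb->cc) {	/* cc never exceeds hiwat */
		sb->cc += len;
		ok = true;
	}
	toy_sb_unlock(sb);
	return (ok);
}

/* Remove up to len bytes; returns the number actually removed. */
static size_t
toy_sb_drain(struct toy_sockbuf *sb, size_t len)
{
	size_t n;

	toy_sb_lock(sb);
	n = (len < sb->cc) ? len : sb->cc;
	sb->cc -= n;
	toy_sb_unlock(sb);
	return (n);
}
```

The critical sections are a few instructions long, which is the case
where spinning tends to be cheaper than blocking. Several protocol
threads could call toy_sb_append() while the socket layer calls
toy_sb_drain(), all against a single buffer.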
I'd say we should get it working first, and then worry about optimizing
it.
-Matt
Matthew Dillon
<dillon at backplane.com>