UDP inpcbs and more (was Re: tcpcb etc)

Matthew Dillon dillon at apollo.backplane.com
Fri Jun 6 12:32:51 PDT 2008


:First, UDP sockets can issue connect() multiple times (thus changing
:faddr, fport) and call bind to go from wildcard laddr to a specific one.
:When this happens, the inpcb must be moved to another cpu. This shouldn't
:be too hard to handle; just mark the old inpcb as BEING_DELETED and only
:delete it after inserting the new inpcb. Dropping a few packets is expected
:for UDP and this shouldn't happen very often anyway.
:....
:
:Then there is the more interesting issue of how to hash. As described above,
:the lport is the only field we can be sure is not a wildcard. Now consider
:a UDP (say DNS) server; such a server does not normally connect() so whatever
:hash function we choose, the inpcb is going to end up on one cpu. This is the
:cpu we would normally dispatch an incoming UDP packet to. The thing is, all
:datagrams for our UDP server will end up going through the same cpu. So our
:busy DNS server just can't scale: using only one protocol thread is going to
:be a bottleneck.

    My personal opinion is that we should just hash on laddr/lport and not
    worry about the very few applications that try to demux packets with
    multiple threads from the same socket.  At least not for now.
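
    As a rough illustration only, hashing on laddr/lport to pick a cpu
    could look like the sketch below. The function name, the mixing
    constant, and the convention of passing a wildcard laddr as 0 are all
    assumptions made for this example, not existing kernel code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: map a local address/port pair to a cpu index.
 * laddr is the IPv4 local address (0 for a wildcard bind, by assumption),
 * lport is the local port, ncpus is the number of protocol threads.
 */
static unsigned int
udp_cpu_hash(uint32_t laddr, uint16_t lport, unsigned int ncpus)
{
    uint32_t h;

    assert(ncpus > 0);

    /* Fold the port into both halves of the address, then mix the bits. */
    h = laddr ^ ((uint32_t)lport << 16) ^ (uint32_t)lport;
    h ^= h >> 16;
    h *= 0x45d9f3bU;        /* arbitrary odd multiplier for bit mixing */
    h ^= h >> 16;

    return h % ncpus;
}
```

    The same (laddr, lport) pair always maps to the same cpu, so a socket
    that never calls connect() stays on one protocol thread, which is the
    tradeoff discussed above.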

:option changing under it. However, our sockbuf can't handle concurrent
:accesses, so we'd have to have multiple sockbufs (one per cpu) and then
:the socket layer would have to pull data from all of them (probably in a
:round-robin fashion). UDP does not guarantee in-order delivery but, since
:in-order is typically the case, I'm not sure how well the apps can handle
:it. On top of that we'd need to decide what to do about buffer size limits
:and whether the sockbufs should stay in struct socket.
:
:OK, this should get the discussion started :)
:
:Aggelos

    Our sockbufs need a general SMP solution.  I think a spinlock is
    probably best, given the concurrency.
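
    A minimal userland sketch of a spinlock-protected buffer, using C11
    atomics in place of a kernel spinlock.  The struct, the function
    names, and the fixed SB_MAX limit are hypothetical and only meant to
    show the locking pattern, not the real struct sockbuf.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define SB_MAX 4096                 /* hypothetical buffer size limit */

struct sketch_sockbuf {
    atomic_flag lock;               /* spinlock guarding cc and data */
    size_t      cc;                 /* bytes currently buffered */
    char        data[SB_MAX];
};

static void
sb_init(struct sketch_sockbuf *sb)
{
    atomic_flag_clear(&sb->lock);
    sb->cc = 0;
}

static void
sb_lock(struct sketch_sockbuf *sb)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&sb->lock, memory_order_acquire))
        ;
}

static void
sb_unlock(struct sketch_sockbuf *sb)
{
    atomic_flag_clear_explicit(&sb->lock, memory_order_release);
}

/* Append len bytes; returns 0 on success, -1 if the buffer would overflow. */
static int
sb_append(struct sketch_sockbuf *sb, const void *buf, size_t len)
{
    int ret = -1;

    sb_lock(sb);
    if (len <= SB_MAX - sb->cc) {
        memcpy(sb->data + sb->cc, buf, len);
        sb->cc += len;
        ret = 0;
    }
    sb_unlock(sb);
    return ret;
}

/* Remove up to len bytes from the front; returns the number copied. */
static size_t
sb_take(struct sketch_sockbuf *sb, void *buf, size_t len)
{
    size_t n;

    sb_lock(sb);
    n = len < sb->cc ? len : sb->cc;
    memcpy(buf, sb->data, n);
    memmove(sb->data, sb->data + n, sb->cc - n);
    sb->cc -= n;
    sb_unlock(sb);
    return n;
}
```

    Both paths hold the lock only for a short copy, which is the case
    where spinning is cheaper than sleeping.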

    I'd say we should get it working first, and then worry about optimizing
    it.

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>




