git: udp: Make udp pcbinfo and portinfo per-cpu; greatly improve performance

Sepherosa Ziehau sephe at crater.dragonflybsd.org
Sun Aug 31 01:08:16 PDT 2014


commit be4519a228f0cdc3d23bcbc147abcf2e7d27f4f7
Author: Sepherosa Ziehau <sephe at dragonflybsd.org>
Date:   Thu Jul 3 21:15:27 2014 +0800

    udp: Make udp pcbinfo and portinfo per-cpu; greatly improve performance
    
    MAJOR CHANGES:
    
    - Add token to protect pcbinfo's inpcb list and wildcard hash table.
      Currently only udp per-cpu pcbinfo sets this token.  udp serializer
      and netisr barrier are nuked.
    
      o  udp inpcb list:
    
         In most cases, the udp inpcb list is operated on in its owner
         netisr.  However, it is also accessed and modified (though no
         effective udp inpcb will be unlinked) in netisr0 to adjust
         multicast options when an interface is about to be detached.  So
         protecting udp inpcb list access and modification w/ a token is
         necessary.
    
         At udp inpcb detach time, the udp inpcb is first removed from the
         udp inpcb list, then a message will go through all netisrs, which
         makes sure that no netisrs are using or can find this udp inpcb
         from the udp inpcb list.  After all of this, the udp inpcb is
         destroyed in its owner netisr.
    
         In netisrs, it is MP safe to find a udp inpcb from udp inpcb list,
         then release the token and process the found udp inpcb.
    
         In other threads, it is MP safe to find a udp inpcb from the udp
         inpcb list, then release the token and process the found udp
         inpcb in a non-blocking fashion.
    
         See also the usage of inpcb marker.
    
      o  udp wildcard hash table:
    
         On the input path, the udp wildcard hash table is searched in its
         owner netisr.  In order to ease implicit binding (bind during
         send), connect after binding, and disconnect, udp inpcbs are
         inserted into and removed from other udp pcbinfos' wildcard hash
         tables in their owner netisr.  Thus the udp wildcard hash table
         must be protected w/ a token.
    
         At udp inpcb detach time, a message will go through all netisrs,
         and this udp inpcb will be removed from the udp wildcard hash
         table belonging to the current netisr.  This makes sure that once
         the current netisr runs the message handler, this udp inpcb can
         no longer be found or used in the current netisr.  When the
         message reaches the last netisr, this udp inpcb is redispatched to
         its owner netisr to be destroyed.
    
         In netisrs, it is MP safe to find a udp inpcb from udp wildcard
         hash table, then release the token and process the found udp inpcb,
         e.g. use udp inpcb found by in_pcblookuphash().
    
         In other threads, it is MP safe to find a udp inpcb from the udp
         wildcard hash table, then release the token and process the found
         udp inpcb in a non-blocking fashion.
    
         See also the usage of inpcb container marker.
    
      o  udp connect hash table:
    
         It is MP safe without any locking, since it is only accessed and
         modified in its owner netisr.
    
    - During inpcb iteration through the inpcb list, use an inpcb marker
      when calling functions that may block, e.g. in_pcbpurgeif0(), so
      that the iteration does not stop prematurely if the inpcb being
      processed is removed from the inpcb list.
    
    - Use udp inpcb wildcard table and udp inpcb connect hash table to
      dispatch input multicast and broadcast udp datagrams.  Using the udp
      inpcb list could be time consuming, since we would need to check the
      udp inpcb lists on all cpus.  Moreover, once a udp inpcb has a local
      port, it will be in either the udp wildcard hash table or the udp
      connect hash table.
    
      Since the socket buffer operation on input path may block, inpcb
      container marker is used when iterating inpcbs from udp inpcb wildcard
      hash table.  in_pcblookup_pkthash() is adjusted to skip inpcb
      container marker.
    
    - udp socket so_port is no longer fixed to netisr0 msgport
      o  Initial udp socket so_port is the current cpu's netisr msgport.
      o  Bound but unconnected udp socket so_port is selected according to
         local port hash.
      o  Connected udp socket so_port is selected according to the udp hash,
         i.e. laddr/faddr toeplitz hash (exception: a multicast laddr or
         faddr is hashed to netisr0).
      o  Multicast socket options are forced to be handled in netisr0, since
         udp socket so_port may not be netisr0 msgport.
    
    - In order to support asynchronous udp inpcb detach:
      o  EJUSTRETURN from pru_detach method now means protocol will call
         sodiscard() and sofree() for soclose().  udp pru_detach method
         returns EJUSTRETURN as of this commit.
      o  SS_ISCLOSING socket state is set before calling pru_detach method,
         so the protocol can avoid certain expensive, unnecessary or
         disallowed operations in its pru_disconnect or pru_detach method,
         e.g. udp pru_disconnect method avoids putting the udp inpcb back
         into the udp wildcard hash table if SS_ISCLOSING is set.
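    
    The asynchronous detach contract above can be modelled in user space.
    The sketch below is not the kernel code: struct sock, struct pcb, the
    single-slot per-cpu wildcard[] array, sock_close(), detach_walk() and
    the numeric values of SS_ISCLOSING and EJUSTRETURN are all assumptions
    made for illustration.  It only shows who frees the socket in each
    case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPUS        4      /* assumed number of netisr threads */
#define SS_ISCLOSING 0x1    /* placeholder value, not the kernel's */
#define EJUSTRETURN  (-2)   /* kernel-internal code; value assumed here */

struct sock;

/* Hypothetical stand-in for a udp inpcb. */
struct pcb {
    int          owner_cpu;
    struct sock *so;
};

/* Hypothetical stand-in for a socket. */
struct sock {
    int         state;
    bool        freed;
    struct pcb *pcb;
    int       (*detach)(struct sock *);
};

/* One single-slot "wildcard table" per cpu; the real ones are hash tables. */
static struct pcb *wildcard[NCPUS];

/* Stand-in for sodiscard() followed by sofree(). */
static void
sock_free(struct sock *so)
{
    so->freed = true;
}

/* Synchronous protocol: finish the teardown, let the caller free the socket. */
static int
detach_sync(struct sock *so)
{
    so->pcb = NULL;
    return 0;
}

/* Asynchronous protocol: only start the teardown and take over the free. */
static int
detach_async(struct sock *so)
{
    (void)so;
    return EJUSTRETURN;
}

/* Sequential model of the detach message visiting every netisr in turn. */
static void
detach_walk(struct pcb *p)
{
    struct sock *so = p->so;

    assert(p->owner_cpu >= 0 && p->owner_cpu < NCPUS);
    for (int cpu = 0; cpu < NCPUS; cpu++) {
        if (wildcard[cpu] == p)
            wildcard[cpu] = NULL;   /* no longer findable on this cpu */
    }
    /* Last netisr reached: the owner destroys the pcb and frees the socket. */
    so->pcb = NULL;
    sock_free(so);
}

/* Model of the close path: set SS_ISCLOSING, then call the detach method. */
static void
sock_close(struct sock *so)
{
    so->state |= SS_ISCLOSING;
    if (so->detach(so) == EJUSTRETURN)
        return;                     /* the protocol frees the socket later */
    sock_free(so);
}
```

    In this model a caller runs sock_close() and, for the asynchronous
    case, later runs detach_walk() on the pcb; only then is the socket
    marked as freed.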
    
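    The inpcb marker technique listed under MAJOR CHANGES can be sketched
    with the TAILQ macros from <sys/queue.h>.  This is a minimal user-space
    illustration, not the in_pcb.c implementation: struct pcb, its
    is_marker field, pcb_foreach_with_marker() and the unlink_even_ids()
    callback are hypothetical names, and the callback removing an element
    stands in for a function that blocks while the element is unlinked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical, simplified stand-in for an inpcb. */
struct pcb {
    TAILQ_ENTRY(pcb) link;
    bool is_marker;     /* markers are placeholders, never real pcbs */
    int  id;
};
TAILQ_HEAD(pcb_list, pcb);

typedef void (*pcb_visit_fn)(struct pcb_list *, struct pcb *);

/*
 * Visit every real pcb on the list.  A marker is parked right after the
 * current pcb before the callback runs, so the position in the list
 * survives even if the callback unlinks that pcb.
 * Returns the number of pcbs visited.
 */
static int
pcb_foreach_with_marker(struct pcb_list *head, pcb_visit_fn fn)
{
    struct pcb marker = { .is_marker = true };
    struct pcb *p = TAILQ_FIRST(head);
    int visited = 0;

    assert(fn != NULL);
    while (p != NULL) {
        if (p->is_marker) {             /* another iterator's marker */
            p = TAILQ_NEXT(p, link);
            continue;
        }
        TAILQ_INSERT_AFTER(head, p, &marker, link);
        fn(head, p);                    /* may unlink p */
        visited++;
        p = TAILQ_NEXT(&marker, link);  /* resume from the marker, not p */
        TAILQ_REMOVE(head, &marker, link);
    }
    return visited;
}

/* Example callback: unlink every pcb with an even id while it is visited. */
static void
unlink_even_ids(struct pcb_list *head, struct pcb *p)
{
    if (p->id % 2 == 0)
        TAILQ_REMOVE(head, p, link);
}

/* Count the elements currently linked on the list. */
static int
pcb_list_count(struct pcb_list *head)
{
    struct pcb *p;
    int n = 0;

    TAILQ_FOREACH(p, head, link)
        n++;
    return n;
}
```

    Resuming from the marker rather than from the visited element is what
    keeps the walk from stopping early: the element's own link pointers
    are no longer trustworthy once it has been removed.
    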
    MISC CHANGES:
    
    - pcbinfo's cpu id must be set now; -1 is disallowed.
    - udp pru_abort method should never be called; it panics now.
    - Restore traditional BSD behaviour when connect fails on an unbound
      udp socket: if the local port of the udp socket has been selected,
      its inpcb should be in the wildcard hash table, i.e. the udp inpcb
      should be visible on the udp datagram input path.
    - Make sure multicast state is adjusted only in netisr0 for inet6 when
      an interface is about to be detached.
    
    PERFORMANCE IMPROVEMENT:
    
    For 'kq_connect_client -u' test, this commit gives 400% performance
    improvement (31Kconns/s -> 160Kconns/s).

Summary of changes:
 sys/kern/uipc_msg.c         |   3 +-
 sys/kern/uipc_socket.c      |  39 ++-
 sys/net/ipfw/ip_fw2.c       |   8 +-
 sys/net/netmsg.h            |   1 +
 sys/net/pf/pf.c             |   2 +-
 sys/netinet/in.c            |   6 +-
 sys/netinet/in_pcb.c        | 410 +++++++++++++++++--------
 sys/netinet/in_pcb.h        |  44 ++-
 sys/netinet/in_proto.c      |  11 +-
 sys/netinet/ip_demux.c      |  26 +-
 sys/netinet/ip_divert.c     |   6 +-
 sys/netinet/ip_output.c     |  19 ++
 sys/netinet/raw_ip.c        |   6 +-
 sys/netinet/tcp_subr.c      |  16 +-
 sys/netinet/udp_usrreq.c    | 731 +++++++++++++++++++++++++++++---------------
 sys/netinet/udp_var.h       |  10 +-
 sys/netinet6/in6_ifattach.c |  44 ++-
 sys/netinet6/in6_pcb.c      |  81 ++++-
 sys/netinet6/in6_pcb.h      |   4 +-
 sys/netinet6/ipsec.c        |   2 +-
 sys/netinet6/raw_ip6.c      |   2 +-
 sys/netinet6/udp6_usrreq.c  |  44 ++-
 sys/sys/protosw.h           |   4 +
 sys/sys/socketops.h         |   2 +-
 sys/sys/socketvar.h         |   2 +
 25 files changed, 1029 insertions(+), 494 deletions(-)

http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/be4519a228f0cdc3d23bcbc147abcf2e7d27f4f7


-- 
DragonFly BSD source repository

