cvs commit: src/sys/netinet tcp_input.c

Matthew Dillon dillon at apollo.backplane.com
Wed Apr 18 20:38:36 PDT 2007


:On Tue, 17 Apr 2007 10:28:04 -0700 (PDT)
:Matthew Dillon <dillon at crater.dragonflybsd.org> wrote:
:
:> The possible trigger is
:> running netstat -an on a machine very heavily loaded with 6000+
:> network connections. 
:
:I ran netstat -an on a DF 1.8.1 proxy with a few hundred (1000 at most)
:connections and it did not crash; it printed out all connections.
:
:-- 
:Gergo Szakal <bastyaelvtars at gmail.com>

    It is starting to make more sense.  I think what is happening is that
    a callout timer is getting held up long enough for the TCP state to
    change radically, due to the huge netstat -an, whose data is being loaded
    via a sysctl.  I committed a fix for one related problem to HEAD but I
    don't know if it is the one causing the crash.  Another possibility is
    that the callout code is not properly detecting when a callout gets
    ripped out from under it after blocking on the big giant lock.
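
    To make the second possibility concrete, here is a minimal sketch of
    the kind of re-check a callout handler needs after it has blocked on
    a lock.  Everything in it is an assumption made for illustration:
    the struct, the CO_* flags and the co_* functions are hypothetical
    and are not the kernel's callout code, and the "blocking" is
    simulated by a hook that stands in for whatever another thread did
    during the stall.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags and types, not the kernel's real callout structure. */
#define CO_PENDING 0x1   /* armed and has not fired yet */
#define CO_ACTIVE  0x2   /* owner still wants the handler to run */

struct callout_sketch {
    int flags;
    int runs;            /* times the handler body actually executed */
};

/* Stand-in for what another thread does while we wait for the lock. */
typedef void (*blocked_hook)(struct callout_sketch *);

/* Owner cancels the callout. */
static void co_stop(struct callout_sketch *c)
{
    c->flags &= ~(CO_PENDING | CO_ACTIVE);
}

/* Owner arms (or re-arms) the callout. */
static void co_rearm(struct callout_sketch *c)
{
    c->flags |= CO_PENDING | CO_ACTIVE;
}

/* Returns 1 if the handler body ran, 0 if it backed out. */
static int co_fire(struct callout_sketch *c, blocked_hook while_blocked)
{
    assert(c != NULL);
    c->flags &= ~CO_PENDING;        /* the timer has fired */

    if (while_blocked != NULL)
        while_blocked(c);           /* simulated stall on the lock */

    /* Lock is now held: re-check before touching any TCP state. */
    if ((c->flags & CO_PENDING) || !(c->flags & CO_ACTIVE))
        return 0;                   /* re-armed or stopped under us */

    c->flags &= ~CO_ACTIVE;
    c->runs++;
    return 1;
}
```

    If the check after the stall is missing or incomplete, the handler
    body runs against a connection whose timer was stopped or re-armed
    in the meantime, which is the failure mode suspected above.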

    The larger the amount of information the sysctl has to load (i.e. the
    more connections the box has active), the longer it holds onto the big
    giant lock and the longer the callout gets stalled.

    The *real* fix for the problem is probably to have the callout queue
    a message to the TCP thread instead of issuing a callback, which would
    allow the network callouts to run without the big giant lock.  That's
    fairly involved work and not something I can focus on right now.
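
    A rough, single-threaded sketch of that message-based approach
    follows.  It is only an illustration under stated assumptions: the
    queue, the message layout and every function name are hypothetical
    and do not correspond to the kernel's messaging API.  The timer side
    does no TCP work and only posts a message; the TCP thread drains the
    queue later and revalidates the connection before acting.

```c
#include <assert.h>
#include <stddef.h>

/* All names are hypothetical; this is not the kernel's messaging API. */
enum tcp_timer_kind { TT_REXMT, TT_KEEP };

struct tcp_conn {
    int id;
    int closed;          /* set once the connection has been torn down */
    int timers_handled;  /* timer messages processed for this connection */
};

struct timer_msg {
    struct tcp_conn *conn;
    enum tcp_timer_kind kind;
};

#define MSGQ_SIZE 16

struct msg_queue {
    struct timer_msg slots[MSGQ_SIZE];
    size_t head;         /* next slot to dequeue */
    size_t count;        /* number of queued messages */
};

/* Enqueue a message; returns 0 on success, -1 if the queue is full. */
static int msgq_put(struct msg_queue *q, struct timer_msg m)
{
    if (q->count == MSGQ_SIZE)
        return -1;
    q->slots[(q->head + q->count) % MSGQ_SIZE] = m;
    q->count++;
    return 0;
}

/* Dequeue into *out; returns 0 on success, -1 if the queue is empty. */
static int msgq_get(struct msg_queue *q, struct timer_msg *out)
{
    if (q->count == 0)
        return -1;
    *out = q->slots[q->head];
    q->head = (q->head + 1) % MSGQ_SIZE;
    q->count--;
    return 0;
}

/* Timer context: touch no TCP state, just post a message. */
static int timer_fired(struct msg_queue *q, struct tcp_conn *c,
                       enum tcp_timer_kind kind)
{
    struct timer_msg m = { c, kind };
    return msgq_put(q, m);
}

/* TCP thread: drain the queue; returns the number of messages acted on. */
static int tcp_thread_drain(struct msg_queue *q)
{
    struct timer_msg m;
    int handled = 0;

    while (msgq_get(q, &m) == 0) {
        assert(m.conn != NULL);
        if (m.conn->closed)
            continue;    /* state changed since the timer fired; drop it */
        m.conn->timers_handled++;
        handled++;
    }
    return handled;
}
```

    Because only the TCP thread ever acts on connection state, a timer
    that fires while something else is stalling the system costs nothing
    more than a queued message.  A real implementation would also have
    to keep the connection structure alive (or cancel queued messages)
    until the thread has seen them; the sketch sidesteps that by never
    freeing anything.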

						-Matt





