system semi-freezes on mbuf cluster limit
dillon at apollo.backplane.com
Wed May 23 09:32:01 PDT 2007
:I just experienced a nasty situation: I ran out of mbuf clusters (6656)
:and ppp was one of the processes stuck in objcache_get.
:even after some clusters drained (from netstat -m output), the objcache
:depot didn't get free entries back and ppp stayed stuck. and of course
:because of this no mbuf clusters were freed (ppp would have to transmit
:them, i guess). I was doing some serious down/uploading at the moment.
:this should not happen, or at least more gracefully.
I was debugging something similar earlier this month. Basically
what can happen is that if a machine is running a lot of simultaneous
TCP connections, particularly outgoing connections which may build up
a lot of data in the socket buffers, the machine can hit its mbuf
cluster limit.
Is that what is happening to you? Lots of outgoing tcp connections
with lots of data backed up (netstat -tn | fgrep tcp4)? I want to
make sure it isn't an mbuf leak.
When the cluster limit is reached, the sheer demand for packets
prevents the system from being able to recover mbufs. Eventually the
tcp connections start timing out and freeing all of their mbufs, and
the machine then recovers.
At the moment the only real solution is to increase the number of mbufs
at boot time (set kern.ipc.nmbclusters and kern.ipc.nmbufs in
One thing that would be nice would be to have some sort of algorithm,
similar to what Linux has, where it detects the mbuf load on the system
and reduces the amount of data it allows the tcp connections to build
up dynamically, resulting in more graceful degradation.
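
As a rough illustration of that idea (not existing kernel code, and not
how Linux actually does it), the sketch below scales a per-connection
send-buffer cap by how full the cluster pool is. The function name
scaled_sndbuf_cap, the 50% low-water mark and the linear ramp are all
assumptions made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns the number of bytes a single TCP connection would be allowed
 * to queue in its send buffer, given the current mbuf cluster usage.
 *
 * Assumed policy: at or below 50% cluster usage the full max_cap is
 * allowed.  Above that, the cap shrinks linearly and reaches min_cap
 * when every cluster is in use.
 */
size_t
scaled_sndbuf_cap(size_t clusters_used, size_t clusters_max,
                  size_t max_cap, size_t min_cap)
{
    size_t low_water, span, over;
    uint64_t reduction;

    /* Degenerate inputs: fall back to the smallest cap. */
    if (clusters_max == 0 || max_cap <= min_cap)
        return min_cap;

    /* Clamp so a stale counter cannot push us past 100%. */
    if (clusters_used > clusters_max)
        clusters_used = clusters_max;

    low_water = clusters_max / 2;
    if (clusters_used <= low_water)
        return max_cap;

    span = clusters_max - low_water;   /* width of the ramp */
    over = clusters_used - low_water;  /* how far into the ramp we are */

    /* 64-bit intermediate avoids overflow in the multiplication. */
    reduction = (uint64_t)(max_cap - min_cap) * over / span;
    return max_cap - (size_t)reduction;
}
```

With the 6656-cluster limit from the report and assumed caps of 65536
and 4096 bytes, a connection would keep the full cap up to 3328
clusters in use and be squeezed down to 4096 bytes as the pool fills,
so existing connections back off before the pool is exhausted instead
of after.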
<dillon at backplane.com>