em driver - issue #2

Matthew Dillon dillon at apollo.backplane.com
Sun Feb 6 12:11:14 PST 2005


:I think there are a couple of things wrong with that solution.
:First, controllers know what to do with empty descriptors, in that they fall into a RNR condition. That's part of the basic design. It's the driver's responsibility to clear up such conditions. At 145Kpps, you're not going to achieve much by trying to fool the driver into thinking that it has memory, except losing a lot of packets. The point of the RNR condition is to get the other end to stop sending until you can handle it. The driver is doing something wrong in this case, and it needs to be cleaned up properly.
:
:The second thing that's wrong is that the "problem" is that the memory MUST be available. That has to be corrected. It's not acceptable for it to fail the way it's failing. There's no excuse for a system with 20K clusters supposedly allocated to not be able to get the 1600th cluster because of a "bucket" problem. The reason that many drivers don't handle the "can't get memory" condition is because it almost never happens in real world scenarios. It's a serious problem that it happens so quickly. 1000 packets at gigabit speeds is a tiny amount of time. It makes little sense to redesign the mbuf system only to leave it with such an inefficiency. I don't know enough about it to know how other O/Ss do it, but they don't fail the way the dfly does in this instance.

    Well, we haven't resolved why the memory allocation is failing.  You
    need to do a vmstat -m to see the real memory use.  A machine which has
    say a gigabyte of ram will allow the mbuf subsystem to allocate
    ~100 MBytes by default.

    In the current design the processing of the input packet is decoupled
    from the network interrupt.  This means that the machine can potentially
    handle a 145Kpps rate at the interrupt layer but still not have sufficient
    cpu to actually process packets at that rate.  The packets are almost
    certainly backing up on the message port to the protocol threads.

    So there's a tradeoff here...  we can virtually guarantee that memory
    will be available if we flow control the interface or start to drop
    packets early, or we can allow the interrupt to queue packets at the
    maximum rate and allow memory allocations to fail if the memory limit
    is reached.  But we can't do both.  If you stuff in more packets than
    the cpu can handle and don't flow control it, the machine will hit its
    allocation limit no matter how much memory is reserved for the network.
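
    The arithmetic behind that last sentence can be sketched roughly as
    follows.  This is only an illustration: the function name is made up,
    the 100Kpps processing rate is an assumed figure (nothing in this
    thread measures it), and only the 145Kpps and 20K cluster numbers come
    from the quoted message.

```python
def seconds_until_limit(arrival_pps, service_pps, limit_bufs):
    """Time until a buffer limit is hit when packets arrive faster than
    they are processed and nothing flow-controls the input.

    Returns None when the consumer keeps up (the backlog never grows).
    Hypothetical helper for illustration only.
    """
    backlog_growth = arrival_pps - service_pps  # buffers gained per second
    if backlog_growth <= 0:
        return None
    return limit_bufs / backlog_growth


# 145Kpps offered load, an ASSUMED 100Kpps protocol-thread capacity.
t_20k = seconds_until_limit(145_000, 100_000, 20_000)
t_40k = seconds_until_limit(145_000, 100_000, 40_000)
print(f"20K clusters last {t_20k:.3f}s, 40K clusters last {t_40k:.3f}s")
```

    Under these assumptions, doubling the reservation only doubles the
    time before allocations start failing; it does not remove the failure.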

    Perhaps what is needed here is some sort of feedback so the network
    interrupt can be told that the system is running low on mbufs and flow
    control itself before we actually run out.
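
    One possible shape for that kind of feedback is a pair of high/low
    watermarks on the pool, sketched below as a toy simulation.  This is
    not the em driver or the DragonFly mbuf code; MbufPool, simulate, the
    watermark values and the per-tick rates are all hypothetical, chosen
    only to show the behaviour.

```python
from dataclasses import dataclass


@dataclass
class MbufPool:
    """Toy stand-in for an mbuf allocator (all names are hypothetical)."""
    limit: int        # hard allocation limit
    high_water: int   # start throttling at or above this many in use
    low_water: int    # stop throttling once usage falls to this or below
    in_use: int = 0
    throttled: bool = False
    alloc_failures: int = 0

    def alloc(self) -> bool:
        if self.in_use >= self.limit:
            self.alloc_failures += 1
            return False
        self.in_use += 1
        return True

    def free(self, n: int) -> None:
        self.in_use = max(0, self.in_use - n)

    def update_feedback(self) -> bool:
        """Hysteresis: throttle at high_water, release at low_water."""
        if self.in_use >= self.high_water:
            self.throttled = True
        elif self.in_use <= self.low_water:
            self.throttled = False
        return self.throttled


def simulate(ticks, arrivals_per_tick, processed_per_tick, pool, use_feedback):
    """Return (accepted, dropped_early, alloc_failures) after `ticks` rounds."""
    accepted = dropped_early = 0
    for _ in range(ticks):
        for _ in range(arrivals_per_tick):
            if use_feedback and pool.update_feedback():
                dropped_early += 1  # flow control / early drop, no allocation
                continue
            if pool.alloc():
                accepted += 1
        pool.free(processed_per_tick)  # protocol threads drain their queue
    return accepted, dropped_early, pool.alloc_failures


with_fb = simulate(100, 145, 100, MbufPool(1000, 800, 400), use_feedback=True)
without_fb = simulate(100, 145, 100, MbufPool(1000, 800, 400), use_feedback=False)
print("with feedback   :", with_fb)
print("without feedback:", without_fb)
```

    In this toy model the input is throttled before the hard limit is
    reached, so the allocator itself never fails; without the feedback the
    same overload runs into the limit and allocations fail instead.
    Packets are lost either way, which is the tradeoff described above.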

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>




