cvs commit: src/sys/kern uipc_socket.c uipc_socket2.c src/sys/sys socketvar.h

Matthew Dillon dillon at apollo.backplane.com
Sat Jul 23 00:51:05 PDT 2005


:
:Matthew Dillon wrote:
:>   Fix a sockbuf race.  Currently the m_free*() path can block, due to
:>   objcache_put() blocking when it must access the global depot.  This breaks
:>   the critical section *AND the BGL during a time when the sockbuf state is
:>   inconsistent.  Another process accessing the same sockbuf would then
:>   corrupt it.  Since depot access is fairly rare, this bug typically required
:>   a number of hours to reproduce.
:
:Would it work better if we would pass down flags to objcache_put, which 
:can indicate that blocking is bad, and then either spin with trytoken or 
:plainly destruct the object directly?
:
:cheers
:   simon

    There are several problems here.  First and foremost, the term
    'atomicy' has a relative meaning here.  Even the totally non-blocking
    free() path has to send an IPI message some times and this can result
    in the processing of incoming IPIS under certain circumstances, so
    we wind up walking a very fine line over what we mean by 'atomic operation'
    when we call m_free() or free().  IPI interactions are an issue even AFTER
    we've fixed the BGL and blocking issues.

    Jeff has suggested that mainline code (aka read()) use an IPI message 
    to obtain atomic access to the sockbuf interlocks against the protocol
    stack's access.  That would certainly work but I think it could seriously
    impact performance.  We might just have to redesign the sockbuf code,
    perhaps by giving it a fronting FIFO using tail-chasing indexes so the
    locking only occurs between competing read()'s rather then between a
    read() and the protocol stack (or a write() and the protocol stack).

    An option for the m_free*() path would be to guarentee that objcache_put
    is non-blocking at all times (I think passing a flag is a bit hokey, I'd
    rather just guarentee it).  Such a guarentee is possible to make if we
    do not access the underlying depot and instead just place the free mbuf
    on a per-cpu list which a helper thread moves to the depot, or something
    like that.  But even here there are side-issues in m_free() itself, in
    particular the fact that it has to free sidebar structures such as 
    m_tag's. 

    In anycase, it requires a lot of thought which is one of the reasons why
    the quick hack goes in now.

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>





More information about the Commits mailing list