sockbuf (was Re: BGL-free net stack)

Aggelos Economopoulos aoiko at cc.ece.ntua.gr
Mon May 5 10:43:31 PDT 2008


On Wednesday 16 April 2008, Aggelos Economopoulos wrote:
> Hello all,
> 
> for my diploma thesis, I've arranged to work on getting the DragonFlyBSD
> network stack[*] to run without the BGL. Now, it would be great if the
> code changes would become part of the project. So, my plan is to study
> the net code over easter vacation (it's been a long while since I last looked
> at it and I've never been a network guy in any case) and then post a 
> preliminary roadmap on kernel at . At that point, everyone can give an opinion 
> before any code gets written.

On second thought, let me ask for input sooner rather than later. I'm leaning 
towards Matt's suggestion for a semi-lockless ring buffer for the sockbuf:

http://leaf.dragonflybsd.org/mailarchive/kernel/2007-06/msg00122.html

One of the issues is that we can no longer give a stable character count (cc) 
for the sockbuf, since data can arrive (if we're a reader) or be removed (if 
we're a writer) at any time. But we can give some weaker guarantees: if we're 
a reader, only we remove data, so the cc we see is a lower bound; if we're a 
writer, only we add data, so the cc we see is an upper bound. Now, for 
references to the sendbuf in e.g. tcp_{in,out}put() I think we can get away 
with taking a snapshot of the cc; the alternative would be to try and make 
the code very, very smart, but for now I'll settle for obvious correctness. 
From a superficial first look, I think the handling of so_oobmark will 
require some changes. At this point I'm inclined to start going through all 
users, updating them to use the new sockbuf, and seeing if any real problems 
crop up. If anyone can see a fundamental problem or has a better approach to 
suggest, please speak up now, so that I don't waste time on a 
suboptimal/flawed approach.
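
To make the lower/upper bound argument concrete, here is a rough userland 
sketch of a single-reader/single-writer byte ring using C11 atomics. It is 
not the actual sockbuf and not Matt's design; struct ringbuf, RB_SIZE and 
all the rb_*() names are made up for illustration, and it assumes exactly 
one reader and one writer per buffer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RB_SIZE 4096u   /* hypothetical capacity, must be a power of two */
static_assert((RB_SIZE & (RB_SIZE - 1)) == 0, "RB_SIZE must be a power of two");

/* Hypothetical one-reader/one-writer byte ring; NOT the real struct sockbuf. */
struct ringbuf {
    _Atomic size_t widx;    /* total bytes ever written, advanced by writer only */
    _Atomic size_t ridx;    /* total bytes ever read, advanced by reader only */
    char data[RB_SIZE];
};

static void rb_init(struct ringbuf *rb)
{
    atomic_init(&rb->widx, 0);
    atomic_init(&rb->ridx, 0);
}

/*
 * Reader's view of cc. Only the reader advances ridx, so the only thing
 * that can change behind our back is widx growing: the result is a
 * lower bound on the real count.
 */
static size_t rb_cc_reader(struct ringbuf *rb)
{
    size_t r = atomic_load_explicit(&rb->ridx, memory_order_relaxed);
    size_t w = atomic_load_explicit(&rb->widx, memory_order_acquire);
    return w - r;
}

/*
 * Writer's view of cc. Only the writer advances widx, so the only thing
 * that can change behind our back is ridx growing: the result is an
 * upper bound on the real count.
 */
static size_t rb_cc_writer(struct ringbuf *rb)
{
    size_t w = atomic_load_explicit(&rb->widx, memory_order_relaxed);
    size_t r = atomic_load_explicit(&rb->ridx, memory_order_acquire);
    return w - r;
}

/* Writer side: append up to len bytes, return how many were stored. */
static size_t rb_write(struct ringbuf *rb, const char *src, size_t len)
{
    size_t w = atomic_load_explicit(&rb->widx, memory_order_relaxed);
    size_t space = RB_SIZE - rb_cc_writer(rb);  /* lower bound on free space */

    if (len > space)
        len = space;
    for (size_t i = 0; i < len; i++)
        rb->data[(w + i) & (RB_SIZE - 1)] = src[i];
    /* Publish the data before the new index becomes visible to the reader. */
    atomic_store_explicit(&rb->widx, w + len, memory_order_release);
    return len;
}

/* Reader side: remove up to len bytes, return how many were copied out. */
static size_t rb_read(struct ringbuf *rb, char *dst, size_t len)
{
    size_t r = atomic_load_explicit(&rb->ridx, memory_order_relaxed);
    size_t avail = rb_cc_reader(rb);            /* lower bound on data */

    if (len > avail)
        len = avail;
    for (size_t i = 0; i < len; i++)
        dst[i] = rb->data[(r + i) & (RB_SIZE - 1)];
    /* Finish copying before the space is handed back to the writer. */
    atomic_store_explicit(&rb->ridx, r + len, memory_order_release);
    return len;
}
```

With something like this, the "snapshot" approach would amount to calling 
rb_cc_writer() (or rb_cc_reader()) once at the top of a function and using 
that value throughout, accepting that it may be stale in the safe direction 
by the time it is used.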

I should also mention that I'm only interested in IPv4 TCP and UDP. The other 
protocols can stay under the BGL for now.

Of course the sockbuf isn't the only issue I've busied myself with over the 
past couple of weeks, but it is one of the more interesting shared data 
structures. Hopefully I'll get around to starting a discussion on inpcbs and 
tcpcbs soon.

Aggelos




