sockbuf (was Re: BGL-free net stack)
Aggelos Economopoulos
aoiko at cc.ece.ntua.gr
Mon May 5 10:43:31 PDT 2008
On Wednesday 16 April 2008, Aggelos Economopoulos wrote:
> Hello all,
>
> for my diploma thesis, I've arranged to work on getting the DragonFlyBSD
> network stack[*] to run without the BGL. Now, it would be great if the
> code changes would become part of the project. So, my plan is to study
> the net code over easter vacation (it's been a long while since I last
> looked at it and I've never been a network guy in any case) and then post a
> preliminary roadmap on kernel at . At that point, everyone can give an opinion
> before any code gets written.
On second thought, let me ask for input sooner rather than later. I'm leaning
towards Matt's suggestion for a semi-lockless ring buffer for the sockbuf:
http://leaf.dragonflybsd.org/mailarchive/kernel/2007-06/msg00122.html
One of the issues is that we can no longer give a stable character count for
the sockbuf, since data can arrive (if we're a reader) or be removed (if
we're a writer) at any time. But we can give some weaker guarantees: if we're
a reader, we can get a lower bound for cc and if we're a writer we have an
upper bound. Now, for references to the sendbuf in e.g. tcp_{in,out}put() I
think we can get away with taking a snapshot of the cc; the alternative would
be to try to make the code very, very smart, but for now I'll settle for
obvious correctness. From a superficial first look, I think the handling of
so_oobmark will require some changes. At this point I'm inclined to start
going through all users and updating them to use the new sockbuf and see if
any real problems crop up. If anyone can see a fundamental problem or has a
better approach to suggest, please speak up now, so that I won't waste time
with a suboptimal/flawed approach.
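
To make the weaker guarantees above concrete, here is a minimal sketch in C11. It assumes exactly one reader and one writer per buffer and uses C11 atomics rather than any kernel primitives. It is not code from Matt's proposal or from the DragonFly tree; every name in it (struct ringbuf, RB_SIZE, rb_cc_reader(), rb_cc_writer(), rb_write(), rb_read()) is made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RB_SIZE 16u	/* hypothetical capacity; must be a power of two */
_Static_assert((RB_SIZE & (RB_SIZE - 1)) == 0, "RB_SIZE must be a power of two");

/* Hypothetical single-reader/single-writer byte ring (illustration only). */
struct ringbuf {
	_Atomic size_t head;	/* next byte to read; stored only by the reader */
	_Atomic size_t tail;	/* next byte to write; stored only by the writer */
	unsigned char data[RB_SIZE];
};

static void rb_init(struct ringbuf *rb)
{
	assert(rb != NULL);
	atomic_init(&rb->head, 0);
	atomic_init(&rb->tail, 0);
}

/*
 * Reader's view of cc: a LOWER bound. head is ours and stable; the writer
 * can only advance tail, so the true count can only be larger.
 */
static size_t rb_cc_reader(struct ringbuf *rb)
{
	size_t head = atomic_load_explicit(&rb->head, memory_order_relaxed);
	size_t tail = atomic_load_explicit(&rb->tail, memory_order_acquire);
	return tail - head;
}

/*
 * Writer's view of cc: an UPPER bound. tail is ours and stable; the reader
 * can only advance head, so the true count can only be smaller.
 */
static size_t rb_cc_writer(struct ringbuf *rb)
{
	size_t tail = atomic_load_explicit(&rb->tail, memory_order_relaxed);
	size_t head = atomic_load_explicit(&rb->head, memory_order_acquire);
	return tail - head;
}

/* Writer only: copy in up to len bytes, return how many were accepted. */
static size_t rb_write(struct ringbuf *rb, const unsigned char *src, size_t len)
{
	size_t tail = atomic_load_explicit(&rb->tail, memory_order_relaxed);
	size_t space = RB_SIZE - rb_cc_writer(rb);	/* lower bound on free space */

	if (len > space)
		len = space;
	for (size_t i = 0; i < len; i++)
		rb->data[(tail + i) % RB_SIZE] = src[i];
	/* Publish the bytes only after they have been stored. */
	atomic_store_explicit(&rb->tail, tail + len, memory_order_release);
	return len;
}

/* Reader only: copy out up to len bytes, return how many were read. */
static size_t rb_read(struct ringbuf *rb, unsigned char *dst, size_t len)
{
	size_t head = atomic_load_explicit(&rb->head, memory_order_relaxed);
	size_t avail = rb_cc_reader(rb);	/* lower bound on readable bytes */

	if (len > avail)
		len = avail;
	for (size_t i = 0; i < len; i++)
		dst[i] = rb->data[(head + i) % RB_SIZE];
	/* Release the slots only after they have been copied out. */
	atomic_store_explicit(&rb->head, head + len, memory_order_release);
	return len;
}
```

In a sketch like this, "taking a snapshot of the cc" would amount to calling rb_cc_reader() or rb_cc_writer() once, keeping the result in a local variable, and using that value for the rest of the function instead of re-reading the indices. The snapshot may be stale by the time it is used, but it stays a valid bound for the side that took it.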
I should also mention that I'm only interested in IPv4 TCP and UDP. The other
protocols can stay under the BGL for now.
Of course the sockbuf isn't the only issue I've busied myself with over the past
couple of weeks, but it is one of the more interesting shared data
structures. Hopefully I'll get around to starting a discussion on inpcbs and
tcpcbs soon.
Aggelos