Backporting DFly patches to FreeBSD?
Matthew Dillon
dillon at apollo.backplane.com
Sun Feb 20 13:17:20 PST 2005
:I really don't think that's the way the cookie crumbles, as you state
:it. FreeBSD has very different design goals and I believe the project
:will stick to them. Again, as some developers have said, they would like
:to see some of these changes stabilize before importing them into
:FreeBSD.
:
:Plenty of people are interested to get some of the DragonFly BSD work
:into FreeBSD. Sascha's syscons changes are especially interesting for
:the FreeBSD project as they would be quite simple to import compared to
:some of our other advancements. I don't think that interest is the missing
:factor in this equation; I think that developer time is. While the
:project has a relatively large number of developers, the number of these
:who actively contribute and understand the parts of the kernel that are
:affected is limited to the few who are already busy with several other
:projects.
:
:I would personally like to see some well-defined scalability tests
:(perhaps in-kernel hooks would be interesting to examine performance of
:some of our more obscure modifications) before I see any more stuff
:stating that we perform better than FreeBSD on the same hardware.
:
:I'd also like to see some people work on backporting changes to FreeBSD
:and fewer fingers pointed towards the developers who are actively working
:on the FreeBSD project. These are people who have families, jobs and do
:already actively contribute. It's unfair to ask these people to drop
:what they're doing and import DragonFly's changes. It's also
:impractical: one reason that DragonFly exists is to explore how these
:changes actually work in practice.
:
:I could generate pages of comparisons that would make it quite clear
:that these aren't always good ideas nor reasonable expectations, but I
:think that's clear by now.
:
:Kind regards,
:
:Devon H. O'Dell

I have to say that I disagree with this assessment of FreeBSD, and I
disagree with the assessment as to why some of the work hasn't been
backported. First of all, the stability argument can't be right.
FreeBSD developers are committing all sorts of things to FreeBSD-5/6
that are not only unstable, but *seriously* unstable and some
of those issues, like the scheduler, have taken a year or more to fix
(if indeed one could call it fixed now, which I doubt). DragonFly's
development isn't bug free, but the regimen is a lot tighter than FreeBSD
development these days. Most of the mechanisms that are still
backportable have been stable in DragonFly for over a year. For most
of the rest FreeBSD has simply waited too long and the divergence has
become too great for any single person, even a dedicated one, to backport
the more interesting work. So, for example, IPI messaging should be
backported, no question about it, and it still can be. It's an absolute
requirement for them to backport it and consolidate all the myriad
ridiculously complex IPI mechanisms they currently have, but nobody is
doing it. I consider that a serious management failure on FreeBSD's part.

There is certainly interest in doing some backporting, and for those
developers the inability to do it is nothing more than a time constraint,
but there is also a level of hostility to any backported work and a
level of scrutiny that goes way beyond the scrutiny applied to native work.
Generally lots of 'it's not proven' excuses go flying around, pretty much
ignoring the fact that similar FreeBSD development itself is just as
unproven. In many respects, I think the Perforce model they are using
has resulted in even more isolation of sub projects within FreeBSD, and
it hasn't seemed to help in the bug department for work that finally
gets into the CVS tree.

Personally speaking, I think we've proven our model in all aspects
except performance. What we are doing is clearly far more maintainable.
FreeBSD has a bit of a hidden beast problem in the maintainability
department. When an original author takes a vacation or stops working
on something, the maintainability problem hits them squarely in the face,
but it's hidden as long as the original authors continue working on the
code. That doesn't bode well for the future. On the other hand, there
have been half a dozen instances where people have come in cold and
done bug-free or mostly bug-free (meaning fixed in a day or two)
work on core pieces of the DragonFly code base. Without any prior
instruction Jeffrey Hsu was able to thread and message most of the network
protocol stack using my LWKT messaging primitives. Joerg was able to do
major cleanups of the namecache code with only one or two emails
between us. David Xu sent me a TLS patch out of the cold that adds
code to the core LWKT switching assembly without any instruction.
Richard Nyberg debugged a namecache issue out of the cold. I consider
these and other events as undeniably proving the maintainability of
our codebase.

On the performance front... I absolutely refuse to rush into removing
the BGL. I don't give a damn what excuses the FreeBSD people are making
with regards to unproven performance. They are flapping their mouths
a lot about unproven this and unproven that, but they aren't actually
thinking about the theory, and that is a serious mistake. I want to get our
codebase using mostly MP clean algorithms BEFORE I start actually
turning off the big giant lock. So e.g. the threaded network protocol
stacks are using MP clean algorithms, but they aren't 100% MP clean
yet and so the BGL is still turned on. We *KNOW* that those pieces
which are now MP clean, stable, and well tested, are not likely going
to be the cause of any bugs when we start to deal with the remainder
down the line. That's what is important. Not only that, but since
it is a threaded subsystem we have an ability to turn off the BGL on
a thread-by-thread basis, and even do it on the fly with sysctls,
which is a far saner development model than FreeBSD's 'oh let's throw
mutexes around all this junk, turn off the BGL, and pray' methodology.
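
As an illustration of the per-thread idea above, here is a minimal
single-threaded user-space sketch. It is not DragonFly kernel code: the
names giant_lock, subsys_thread, mpsafe and run_handler are hypothetical,
and the lock is modelled as a plain counter. The mpsafe field stands in
for a flag that a sysctl could flip at run time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the big giant lock (BGL): a simple hold flag. */
struct giant_lock {
    int  held;          /* 1 while held, 0 otherwise */
    long acquisitions;  /* how many times the lock was taken */
};

/* Hypothetical per-thread descriptor for one subsystem thread. */
struct subsys_thread {
    const char *name;
    bool mpsafe;        /* true: handlers run without the BGL */
};

typedef void (*handler_fn)(void *arg);

/*
 * Run one handler on behalf of a thread. The BGL is taken only when
 * the thread has not been marked MP safe, so flipping td->mpsafe
 * changes the locking behaviour of the next call without a rebuild.
 */
static void
run_handler(struct giant_lock *gl, const struct subsys_thread *td,
    handler_fn fn, void *arg)
{
    bool need_giant = !td->mpsafe;

    if (need_giant) {
        assert(gl->held == 0);  /* no recursion in this sketch */
        gl->held = 1;
        gl->acquisitions++;
    }
    fn(arg);
    if (need_giant)
        gl->held = 0;
}

/* Example handler: counts one unit of work. */
static void
count_work(void *arg)
{
    (*(int *)arg)++;
}
```

A caller would invoke run_handler() once with mpsafe false, set the
flag to true, and invoke it again; only the first call touches the lock.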

There is definitely some UP performance degradation from e.g. threading
the network protocol stack, as one person's recent routing tests have
shown. But considering the fact that we haven't actually tried to
*optimize* the messaging yet the numbers are pretty much in-line with
what I would expect. More to the point, the *theory* behind getting good
performance out of a threaded subsystem is sound, primarily the
ability to queue more than one piece of work before switching threads,
and I see such a clear path for optimization and improvement (without
having to resort to 'hacks') that I am confident that we will be able to
soundly thrash the mutex model when all is said and done. Certainly,
no matter what, we will come close, and that would be a win too
considering the vast differences in maintainability between our code
and FreeBSD's.
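
The point about queueing more than one piece of work before switching
threads can be sketched roughly as follows. This is a single-threaded
user-space model, not the LWKT messaging API: msg_port, port_send and
port_drain are hypothetical names, and a "switch" is only a counter.
It shows how several queued messages can be handled for the cost of one
switch into the consumer.

```c
#include <assert.h>
#include <stddef.h>

#define PORT_CAPACITY 64

struct msg {
    int payload;
};

/* Hypothetical message port: a fixed-size ring plus bookkeeping. */
struct msg_port {
    struct msg ring[PORT_CAPACITY];
    size_t head;      /* next slot to read */
    size_t tail;      /* next slot to write */
    size_t count;     /* messages currently queued */
    long   switches;  /* simulated switches into the consumer */
    long   processed; /* messages handled so far */
};

/* Queue a message without switching. Returns 0, or -1 if the port is full. */
static int
port_send(struct msg_port *p, int payload)
{
    if (p->count == PORT_CAPACITY)
        return -1;
    p->ring[p->tail].payload = payload;
    p->tail = (p->tail + 1) % PORT_CAPACITY;
    p->count++;
    return 0;
}

/*
 * Simulate one switch into the consumer thread, which then drains
 * everything queued so far. Returns the sum of the payloads handled.
 * An empty port costs no switch at all.
 */
static long
port_drain(struct msg_port *p)
{
    long sum = 0;

    if (p->count == 0)
        return 0;
    p->switches++;
    while (p->count > 0) {
        sum += p->ring[p->head].payload;
        p->head = (p->head + 1) % PORT_CAPACITY;
        p->count--;
        p->processed++;
    }
    return sum;
}
```

Sending ten messages and then draining once leaves processed at 10 and
switches at 1 in this model, whereas draining after every send would
cost ten switches for the same work.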

-Matt
Matthew Dillon
<dillon at xxxxxxxxxxxxx>