DragonFlyBSD Thread on osnews
Jonathan Weeks
jbweeks
Fri Feb 2 16:35:58 PST 2007
FYI -- there was a DragonFlyBSD 1.8 announcement on osnews, with a
thread discussing Linux scalability vs DragonFlyBSD, which might bear
an educated response:
http://www.osnews.com/comment.php?news_id=17114&offset=30&rows=34&threshold=-1
I admit I'm not the most experienced kernel programmer in the world,
but I have a few years of Linux and AIX kernel programming experience.
Maybe you are more qualified, I don't know.
You say Linux scales up to 2048 CPUs, but on what kind of system?
The top end of the SGI Altix line of Linux supercomputers runs 4096
CPUs, and IBM validated Linux on a 2048-CPU System P. Linux scales to
1024 CPUs without any serious lock contention. At 2048 it shows some
contention for root and /usr inode locks, but no serious performance
impact. Directory traversal will be the first to suffer as we move
toward 4096 CPUs and higher, so that's where the current work is
focused.
Is this the same kernel I get on RHEL? Can I use this same kernel on a
4 CPU system? What Linux version allows you to mix any number of
computers with any number of CPUs and treats them all as one
logical computer while being able to scale linearly?
Choose the latest SMP kernel image from Red Hat. The feature that
allows this massive scaling is called scheduler domains, introduced by
Nick Piggin at version 2.6.7 (in 2004). There is no special kernel
config flag or recompilation required to activate this feature, but
there are some tunables you need to set (via a userspace interface) to
reflect the topology of your supercomputer (i.e. grouping CPUs in a
tree of domains).
Usually massive supercomputers are installed, configured, and tuned by
the vendor. They'd probably compile a custom kernel instead of using
the default RHEL image. But it could work out of the box if you really
wanted it to.
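To make the "tree of domains" idea concrete, here is a toy model in
Python. It is only a sketch of the grouping concept, not the kernel's
actual data structures or its tunable interface; the names build_levels
and domains_for are made up for illustration. Each level partitions the
CPUs into equal-sized groups, and a CPU's balancing path is the list of
groups that contain it, from smallest to largest.

```python
def build_levels(num_cpus, spans):
    """Partition CPUs into nested groups.

    spans lists the CPUs per domain at each level, smallest first,
    e.g. [2, 4, 8] for sibling pairs, nodes of 4, and the whole machine.
    """
    levels = []
    previous = 1
    for span in spans:
        # Nesting only works if each level's span divides the next one.
        if span % previous != 0 or num_cpus % span != 0:
            raise ValueError("each span must divide the next one and num_cpus")
        levels.append(
            [frozenset(range(s, s + span)) for s in range(0, num_cpus, span)]
        )
        previous = span
    return levels


def domains_for(cpu, levels):
    """Return the domain containing cpu at every level, smallest first."""
    return [next(d for d in level if cpu in d) for level in levels]


if __name__ == "__main__":
    levels = build_levels(8, [2, 4, 8])
    for domain in domains_for(5, levels):
        print(sorted(domain))
```

The point of the hierarchy is that a balancer can look for work in the
smallest group first and only widen the search when that group is
balanced, which keeps most balancing traffic local.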
...rather than rely on locking, spinning, threading processes to
infinity, it will assign processes to cpus and then allow the processes
to communicate to each other through messages.
That's fine. It's just that nobody has proven that message passing is
more efficient than fine-grained locking. It's my understanding
(correct me if I'm wrong) that DF requires that, in order to modify the
hardware page table, a process must send a message to all other CPUs
and block waiting for responses from all of them. In addition, an
interrupted process is guaranteed to resume on the same processor after
return from interrupt even if the interrupt modified the local runqueue.
The result is that minor page faults (page is resident in memory but
not in the hardware page table) become blocking operations. Plus, you
have interrupts returning to threads that have become blocked by the
interrupt (and must immediately yield), and the latency for waking up
the highest priority thread on a CPU can be as high as one whole
timeslice.
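The broadcast-and-wait pattern described above can be sketched in
userspace. This is a simulation with Python threads standing in for
CPUs, based only on the description in this post; it is not DragonFly
code, and cpu_loop, broadcast_and_wait, and demo are hypothetical
names. The sender cannot proceed until every simulated CPU has
acknowledged, which is the blocking cost being discussed.

```python
import queue
import threading


def cpu_loop(cpu, inbox):
    """Simulated CPU: handle messages one at a time until told to stop."""
    while True:
        msg = inbox.get()
        if msg is None:
            return
        action, reply = msg
        action(cpu)      # e.g. invalidate a local translation entry
        reply.put(cpu)   # acknowledge to the sender


def broadcast_and_wait(inboxes, action):
    """Send action to every simulated CPU and block until all acknowledge."""
    reply = queue.Queue()
    for inbox in inboxes:
        inbox.put((action, reply))
    for _ in inboxes:
        reply.get()      # the sender is stalled here
    return len(inboxes)


def demo(num_cpus=4):
    inboxes = [queue.Queue() for _ in range(num_cpus)]
    threads = [
        threading.Thread(target=cpu_loop, args=(cpu, inbox))
        for cpu, inbox in enumerate(inboxes)
    ]
    for t in threads:
        t.start()

    flushed = [False] * num_cpus

    def flush(cpu):
        flushed[cpu] = True  # each CPU touches only its own slot

    broadcast_and_wait(inboxes, flush)
    result = list(flushed)   # every slot is set once the call returns

    for inbox in inboxes:
        inbox.put(None)
    for t in threads:
        t.join()
    return result


if __name__ == "__main__":
    print(demo())
```

The round trip grows with the number of receivers, since the sender
waits for the slowest one to get around to its inbox.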
DF has serialization resources, but they are called tokens instead of
locks. I'm not quite sure what the difference is. There also seems to
be a highly-touted locking system that allows multiple writers to write
to different parts of a file, which is interesting because Linux,
FreeBSD, and even SVR4 have extent-based file locks that do the same
thing. What's different about this method?
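For reference, byte-range locking of this kind is visible from
userspace through POSIX record locks. The sketch below uses Python's
fcntl.lockf on a Unix system; it shows only the existing userspace
semantics and says nothing about how DragonFly's in-kernel mechanism
works. The helper names try_lock_range and probe_ranges are made up
for the example. The parent locks bytes 0-99, then a child process
tries a disjoint range and an overlapping one.

```python
import fcntl
import os
import tempfile


def try_lock_range(fd, start, length):
    """Try an exclusive, non-blocking lock on bytes [start, start+length)."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start, os.SEEK_SET)
        return True
    except OSError:
        return False


def probe_ranges():
    """Return (disjoint_ok, overlap_ok) as observed by a child process."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        if not try_lock_range(fd, 0, 100):
            raise RuntimeError("parent could not lock bytes 0-99")

        pid = os.fork()
        if pid == 0:
            # Child: record locks are per process, so the parent's lock
            # is not inherited and can conflict with our requests.
            code = 0
            try:
                if try_lock_range(fd, 100, 100):  # bytes 100-199, disjoint
                    code |= 1
                if try_lock_range(fd, 50, 100):   # bytes 50-149, overlaps parent
                    code |= 2
            finally:
                os._exit(code)

        _, status = os.waitpid(pid, 0)
        code = os.WEXITSTATUS(status)
        return bool(code & 1), bool(code & 2)


if __name__ == "__main__":
    print(probe_ranges())
```

Two writers working on different extents of the same file do not block
each other; only the overlapping request is refused.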
I hope I've addressed your questions adequately. Locks are evil, I
know, but they seem to be doing quite well at the moment. Maybe by the
time DF is ready for production use there will be machines that push
other UNIX implementations beyond their capabilities. But for now,
Linux is a free kernel for over a dozen architectures that scales
better than some proprietary UNIX kernels do on their target
architecture. That says a lot about the success of its design.