Plans for 1.5
Matthew Dillon
dillon at apollo.backplane.com
Sat Dec 17 14:24:01 PST 2005
Starting in January, major surgery is going to occur on HEAD. PEOPLE
NEEDING STABLE SYSTEMS SHOULD EXPECT TO STICK WITH THE 1.4 RELEASE
FOR A GOOD 3-4 MONTHS!
-----------------------------------------------
STAGE 1 - Networking MPSAFE work
Starting early January!
* Jeff's parallel routing code will be integrated
* The TCP and UDP threads and the network interrupt code will be made
MPSAFE, with the exception of the firewall and IP filter code.
* The mbuf subsystem will get a makeover (Jeff and I will be embedding
a message structure in the mbuf header to avoid a per-packet malloc
and greatly improve performance).
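To illustrate the embedded-message idea above, here is a minimal userland sketch in C. It is not the actual mbuf or message-port code; every name in it (struct net_msg, struct pkt_buf, queue_put, queue_get, msg_to_pkt, demo_dispatch) is made up for the example. The point is only that when the message header lives inside the packet buffer, handing a packet to another thread needs no separate allocation for the message, and the receiver can recover the packet from the message pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical message header; stands in for whatever gets queued. */
struct net_msg {
    struct net_msg *next;   /* queue linkage */
    int             cmd;    /* what the receiving thread should do */
};

/* Hypothetical packet buffer with the message embedded in its header. */
struct pkt_buf {
    struct net_msg  msg;        /* embedded: no per-packet malloc */
    size_t          len;        /* bytes of payload in use */
    unsigned char   data[256];
};

/* Simple FIFO of messages (single-threaded here, no locking). */
struct msg_queue {
    struct net_msg *head;
    struct net_msg *tail;
};

static void queue_put(struct msg_queue *q, struct net_msg *m)
{
    assert(m != NULL);
    m->next = NULL;
    if (q->tail != NULL)
        q->tail->next = m;
    else
        q->head = m;
    q->tail = m;
}

static struct net_msg *queue_get(struct msg_queue *q)
{
    struct net_msg *m = q->head;

    if (m != NULL) {
        q->head = m->next;
        if (q->head == NULL)
            q->tail = NULL;
        m->next = NULL;
    }
    return m;
}

/* Recover the enclosing packet from a pointer to its embedded message. */
static struct pkt_buf *msg_to_pkt(struct net_msg *m)
{
    return (struct pkt_buf *)((char *)m - offsetof(struct pkt_buf, msg));
}

/* Queue one packet and take it back off; returns its cmd, or -1. */
static int demo_dispatch(void)
{
    struct msg_queue q = { NULL, NULL };
    struct pkt_buf *p = calloc(1, sizeof(*p));
    struct net_msg *m;
    int cmd;

    if (p == NULL)
        return -1;
    p->len = 64;
    p->msg.cmd = 1;
    queue_put(&q, &p->msg);     /* the only allocation was the packet */
    m = queue_get(&q);
    cmd = (m != NULL && msg_to_pkt(m) == p) ? m->cmd : -1;
    free(p);
    return cmd;
}
```

In the sketch the only allocation per packet is the packet buffer itself; the queueing step just links the embedded header.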
NOTE: I will not personally be working on the firewall/filter code
so once the above items are stable I will be soliciting developers to
work on those subsystems (to make them MPSAFE). The network paths will
not be entirely MPSAFE until those last bits are done.
I am deliberately choosing a chewable piece of work for stage 1 so I can
get it done and move on to stage 2.
-----------------------------------------------
STAGE 2 - I/O Subsystem
Starting early February!
* I will be designing and implementing our cache coherency management
system (CCMS). Since cache coherency must be tightly integrated into
the management of offset ranges for files, block devices, VM pages,
and so forth, this technology will also serve as an offset/range
locking subsystem for high level operations such as read() and
write(), as a locking and cache coherency mechanism for mid-level
subsystems such as the buffer cache, and as a cache coherency
mechanism for the VM system.
This will be a crucial step in allowing us to implement cache coherency
across physical machines, for example fully coherent mmap()'d shared
memory across physical machines.
CCMS will be vnode-based.
* CCMS will be used to encapsulate read(), write(), and truncate().
* The buffer cache will be converted from being block-number-based to
being offset-range-based and will use CCMS for management.
* The VM subsystem will be converted to using CCMS for management.
To cut down on clutter, ranges of pages may be managed under the
same CCMS lock and may be broken down or merged as necessary to
control kernel memory use.
* All VM Objects will be given vnodes for management purposes, including
anonymous memory objects.
This is a precursor to being able to do process migration across
machines.
* All VNODES will be given a mandatory VM object for CCMS and buffer
cache management purposes.
Alternatively we will do away with the VM_PAGE->VM_OBJECT mapping
entirely and instead simply have a VM_PAGE->BUFFER_CACHE mapping,
which might actually be better.
* read() and perhaps write() will be modified to use CCMS to directly
access VM OBJECTS and/or the buffer cache, allowing cached data to
be directly copied to the user process without having to run
through the VFS subsystem.
* I intend to construct a fully cache-coherent inter-machine transport
layer for VNODEs using TCP. This will allow NFS-style cross mounting
between DragonFly hosts that are 100% cache coherent.
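As a rough picture of the offset/range locking side of the items above, here is a small userland sketch in C. It is not CCMS and says nothing about how CCMS will actually be structured; the names (rl_set, rl_range, rl_trylock, rl_unlock, RL_SHARED, RL_EXCL) are invented for the example, and it leaves out blocking, cache state, and the splitting and merging of ranges. It only shows the basic rule: each managed object keeps a set of held byte ranges, shared holders may overlap, and an exclusive request conflicts with any overlapping holder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the CCMS API. */
enum rl_mode { RL_SHARED, RL_EXCL };

struct rl_range {
    struct rl_range *next;
    uint64_t         beg;   /* first byte offset covered */
    uint64_t         end;   /* one past the last byte covered */
    enum rl_mode     mode;
};

/* One set per managed object (per vnode, in the plan above). */
struct rl_set {
    struct rl_range *held;
};

/* Half-open ranges [beg, end) overlap if each starts before the other ends. */
static bool rl_overlaps(const struct rl_range *r, uint64_t beg, uint64_t end)
{
    return r->beg < end && beg < r->end;
}

/*
 * Try to lock [beg, end).  Returns a handle on success, or NULL if the
 * request conflicts with a held range or memory cannot be allocated.
 */
static struct rl_range *rl_trylock(struct rl_set *set, uint64_t beg,
                                   uint64_t end, enum rl_mode mode)
{
    struct rl_range *r;

    assert(beg < end);
    for (r = set->held; r != NULL; r = r->next) {
        if (rl_overlaps(r, beg, end) &&
            (mode == RL_EXCL || r->mode == RL_EXCL))
            return NULL;
    }
    r = malloc(sizeof(*r));
    if (r == NULL)
        return NULL;
    r->beg = beg;
    r->end = end;
    r->mode = mode;
    r->next = set->held;
    set->held = r;
    return r;
}

/* Release a range previously returned by rl_trylock(). */
static void rl_unlock(struct rl_set *set, struct rl_range *lk)
{
    struct rl_range **pp;

    for (pp = &set->held; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == lk) {
            *pp = lk->next;
            free(lk);
            return;
        }
    }
}
```

In this picture a read() of bytes 0-4095 would take a shared range over [0, 4096), while a write() or truncate() touching the same bytes would need an exclusive range and would have to wait for the readers to finish.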
NOTE: I am deliberately not listing conversion of the I/O subsystem from
mapped buffers to page lists (XIO's) because I feel this might be too big
a piece to bite off with the other things I want to do in this stage,
and I don't want to hold up STAGE 3.
-----------------------------------------------
STAGE 3 - ZFS PORT
Starting March or April!
I am really quite impressed with Sun's ZFS. People wishing to look at
the Sun ZFS code can find it on LEAF in the OpenSolaris source tree:
cd /archive/OpenSolaris-20051207/usr/src/
common/zfs
uts/common/fs/zfs
uts/sparc/zfs
uts/intel/zfs
cmd/fs.d/zfs
cmd/mdb/common/modules/zfs
cmd/mdb/intel/ia32/zfs
cmd/mdb/intel/amd64/zfs
cmd/mdb/sparc/v9/zfs
cmd/zfs
I have perused the ZFS code and I am impressed. Very impressed! It is
well documented, nicely modular, and easy to read.
I am tentatively scheduling the port for March. It may occur sooner or
later than that (hopefully sooner), but I just don't know how hard
STAGE 2 is going to be.
I really want to start the port immediately, but I think I am really
going to need userland VFS working for such a port to be able to proceed
at a good clip.
-Matt