Initial filesystem design synopsis.
check+jdvass00rsmupf9e at fromme.com
Thu Feb 22 07:18:21 PST 2007
Matthew Dillon <dillon at apollo.backplane.com> wrote:
> [new file system design]
I have a question about a specific scenario. You mention
multi-master, replication and off-line operation. Suppose
I have two machines in a replicated multi-master setup,
i.e. each of them has a full copy of the file system to
work with, and both have write access which is replicated
to the partner. This could be a set of redundant servers,
or maybe a desktop machine plus a notebook.
Now what happens if the two nodes are disconnected for a
certain period of time? E.g. there's a network outage, or
the notebook is taken off-line on a trip. In such a
situation, both nodes should still be fully functional
with write access, of course. Therefore, each node must
maintain a queue of changes that need to be replicated to
the other node as soon as the connection is restored.
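To make the scenario concrete, here is a minimal Python sketch of the
per-node change queue described above. It is only an illustration of
the question, not a proposal for the actual design: the Node class and
its write/drain/apply methods are hypothetical names, and a "change"
is simplified to a (path, data) pair.

```python
from collections import deque


class Node:
    """One replica. All names here are hypothetical, for illustration only."""

    def __init__(self, name):
        self.name = name
        self.files = {}          # path -> contents (the local full copy)
        self.outbound = deque()  # changes not yet sent to the partner

    def write(self, path, data):
        # Local writes always succeed, connected or not.
        self.files[path] = data
        self.outbound.append((path, data))

    def drain(self):
        # Hand over every queued change, oldest first, and empty the queue.
        changes = list(self.outbound)
        self.outbound.clear()
        return changes

    def apply(self, changes):
        # Replay the partner's changes in order. No conflict check is done.
        for path, data in changes:
            self.files[path] = data


desktop, notebook = Node("desktop"), Node("notebook")

# While disconnected, each side keeps working and queues its changes.
desktop.write("/etc/motd", b"welcome")
notebook.write("/home/me/todo.txt", b"buy milk")

# Connection restored: exchange the queues.
from_desktop, from_notebook = desktop.drain(), notebook.drain()
notebook.apply(from_desktop)
desktop.apply(from_notebook)
```

As long as the two nodes touched different files, replaying the queues
in either order leaves both copies identical.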
When the connection between the machines is restored, the
file system has to be synchronized somehow. That should
happen automatically without user intervention. Such a
synchronization means that each node has to send its queue
of changes to the other node. However -- and here is the
question -- what happens if there are any conflicts?
For example, what happens if the same file has been changed
in different ways on the nodes during off-line operation?
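The conflicting case can at least be detected mechanically. The
following Python sketch assumes each queue is a list of (path, data)
changes in the order they were made; find_conflicts and last_writes
are hypothetical helper names. It only finds the conflicts, it does not
say how the file system should resolve them, which is the open question.

```python
def last_writes(queue):
    """Collapse a queue of (path, data) changes to the final data per path."""
    final = {}
    for path, data in queue:
        final[path] = data  # later changes to the same path win locally
    return final


def find_conflicts(queue_a, queue_b):
    """Return paths changed on both nodes that ended up with different data."""
    a, b = last_writes(queue_a), last_writes(queue_b)
    return sorted(p for p in a.keys() & b.keys() if a[p] != b[p])


desktop_queue = [
    ("/etc/motd", b"hello"),
    ("/home/me/notes.txt", b"desktop edit"),
]
notebook_queue = [
    ("/home/me/notes.txt", b"notebook edit"),
    ("/tmp/scratch", b"1"),
]

conflicts = find_conflicts(desktop_queue, notebook_queue)
print(conflicts)  # ['/home/me/notes.txt']
```

Everything not in the returned list can be replayed on the other node
without a decision; the listed paths need some policy (or a user).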
> Plus, I need a name for this baby. I can't use DFS, however much I
> want to, because the term is already over-used.
Well, then use a Greek letter and call it "delta FS". :-)
That would be &Delta;FS or &delta;fs in HTML, or &#916;FS
(upper case) or &#948;FS (lower case) if your browser
doesn't know the Greek letter entity names (but they've been
standard since HTML 4.0, which has been around for quite a while).
Given your feature summary, a few interesting abbreviations
can be made, for example:
high-availability replicated distributed file system
== HARDFS
high-availability clustered replicated file system
== HACRFS (pronounced "hacker FS", of course)
high-availability multi-master extra reliable file system
== HAMMERFS
The "R" can mean replicated or reliable (or robust),
whatever you prefer, and the "E" can mean extra, eminently,
exceedingly or exceptionally.
PS: Oh, just one thing that you didn't mention in your
feature list ... It would be very useful to support
checksums for all file system data (file data and meta
data), so any form of corruption can be reliably detected
on the file system level. ZFS supports it, and GELI in
FreeBSD has grown support for it, too. They do it on the
block level, I think (i.e. a checksum per file system
block). No more silent corruption by broken hard disks.
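As a rough illustration of the per-block idea, here is a short Python
sketch. It is not how ZFS or GELI implement it; the 4096-byte block
size, the choice of SHA-256 and the function names are assumptions made
for the example only.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical file system block size


def checksum_blocks(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def find_corrupt_blocks(data, checksums, block_size=BLOCK_SIZE):
    """Return the indices of blocks whose digest no longer matches."""
    current = checksum_blocks(data, block_size)
    return [i for i, (now, then) in enumerate(zip(current, checksums))
            if now != then]


original = bytes(3 * BLOCK_SIZE)       # three blocks of zeros
stored = checksum_blocks(original)     # saved at write time

damaged = bytearray(original)
damaged[5000] ^= 0xFF                  # flip bits inside the second block

print(find_corrupt_blocks(bytes(damaged), stored))  # [1]
```

A read that verifies the stored digest first can return an error (or
fetch the block from a replica) instead of handing back bad data.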
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.