cvs commit: src/lib/libc/sys syslink.2

Wed Apr 4 10:37:08 PDT 2007

:any thoughts towards 'split brain' or 'fragmented brain' so far,
:or will that come in a layer above syslink? (e.g. syslink is the
:'overlay network' for the actual 'application' of clustering)
:
:anyhow .. much more clever thought than I'm currently capable of,
:so I'm sure you'll figure things out :)

    You mean dealing with the situation where the network bisects and
    not all the pieces of the cluster can communicate with each other?
    It would depend on where the physical resources wind up in relation
    to the running code.  Some portions of the cluster would still be
    able to run while other portions would likely stall until the
    cluster comes back together.
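
    As a rough illustration of that point (a toy sketch, not anything
    syslink itself does): the node names, the jobs table, and both
    helper functions below are hypothetical.  The sketch groups nodes
    by which links survive and marks a job as stalled when its code
    and its resource land in different pieces.

```python
from collections import defaultdict

def partitions(nodes, links):
    """Group nodes into connected components over the surviving links."""
    adj = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, parts = set(), []
    for n in nodes:
        if n in seen:
            continue
        part, stack = set(), [n]
        while stack:
            cur = stack.pop()
            if cur in part:
                continue
            part.add(cur)
            stack.extend(adj[cur] - part)
        seen |= part
        parts.append(part)
    return parts

def job_status(jobs, parts):
    """jobs maps name -> (node running the code, node holding the resource)."""
    where = {n: i for i, p in enumerate(parts) for n in p}
    return {name: "runs" if where[code] == where[res] else "stalls"
            for name, (code, res) in jobs.items()}

# Hypothetical four-node cluster in which the b-c link has gone down.
nodes = ["a", "b", "c", "d"]
links = [("a", "b"), ("c", "d")]
jobs = {"build": ("a", "b"), "db": ("b", "d")}
status = job_status(jobs, partitions(nodes, links))
```

    Here "build" keeps running because its code and resource sit on
    the same side of the split, while "db" stalls until the two
    halves can talk again.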

    Dealing with things like fail-over would have to be handled by higher
    level software, or by running concurrent, redundant services and
    falling back to an older copy of the related data when the primary
    services fail.

    It would really depend on the types of services being run, but having
    infrastructure which supports fallbacks, such as a filesystem capable
    of infinite snapshots, really helps a lot.  Ultimately the state of
    services running on a machine is based on data stored in a
    filesystem.  Temporarily rolling back the filesystem and restarting
    the services as a means to deal with a fail-over situation is an
    almost universally applicable solution though it often means a lot
    of manual work to re-merge the lost data when the cluster comes back
    together.
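
    A minimal sketch of the roll-back-and-restart idea, assuming an
    in-memory stand-in for a snapshotting filesystem.  SnapshotStore
    and conflicts() are hypothetical names made up for illustration,
    not an existing API.  The fallback service restarts from an older
    snapshot, and once the cluster rejoins, every key that diverged
    needs to be re-merged by hand.

```python
import copy

class SnapshotStore:
    """Toy stand-in for a filesystem that can take snapshots."""
    def __init__(self):
        self.live = {}
        self.snapshots = []

    def snapshot(self):
        # Record a frozen copy of the current state and return its index.
        self.snapshots.append(copy.deepcopy(self.live))
        return len(self.snapshots) - 1

    def rollback(self, index):
        # Return a private copy of an older state for a fallback service.
        return copy.deepcopy(self.snapshots[index])

def conflicts(primary, fallback):
    """Keys whose values differ between the two copies after rejoin."""
    keys = primary.keys() | fallback.keys()
    return sorted(k for k in keys if primary.get(k) != fallback.get(k))

store = SnapshotStore()
store.live["user1"] = "v1"
snap = store.snapshot()
store.live["user1"] = "v2"        # written on the primary after the snapshot

fallback = store.rollback(snap)   # primary unreachable: restart on older copy
fallback["user2"] = "v1"          # the fallback accepts new writes meanwhile

to_merge = conflicts(store.live, fallback)   # work left for the re-merge
```

    The list in to_merge is the manual work mentioned above: writes
    the fallback never saw plus writes only the fallback accepted.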

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>