Networked rebuild and self-healing in HAMMER2

Thu Mar 26 07:03:42 PDT 2015

Greetings,

Thanks for the thoroughness (always appreciated). This actually answers additional questions that I had as well. These are interesting times for file systems.

Looking forward!

On 03/26/2015 10:34 AM, Matthew Dillon wrote:
> The idea is to be able to automate it at least so long as spare nodes are
> available.  So if one had a cluster of 3 masters (quorum is thus 2 nodes),
> and 2 additional nodes operating as slaves, then if one of the masters
> fails the cluster would continue to be able to operate with 2 masters until
> the failed master is replaced.  But the cluster would also be able to
> promote one of the slaves (already mostly synchronized) to become a master,
> returning the system to the full 3 masters and making the timing of the
> replacement less critical.
> 
> This alone does not really replace RAIDs.  For a very large storage
> subsystem, each node would be made up of many disks so another layer is
> needed to manage those disks.  The documentation describes a 'copies' mechanism
> that is meant to address this, where redundancy is built within each node
> to handle disk failures and to manage a pool of hot replacements.  If a
> disk fails and is taken out, the idea is for there to be sufficient copies
> to be able to rebuild the node without having to access other nodes.  But
> if for some reason there is not a sufficient number of copies then it could
> in fact get the data from other nodes as well.
> 
> For smaller storage systems the cluster component is probably sufficient.
> But for larger storage systems both the cluster component and the copies
> component would be needed.
> 
> One important consideration here is how spare disks or spare nodes are
> handled.  I think it is relatively important for spare disks and spare
> nodes to be 'hot' ... that is, fully live in the system and useable to
> improve read fan-out performance.  So the basic idea for all spares (both
> at the cluster level and the copies level) is for the spare drives to be
> fully integrated into the filesystem as extra slaves.
> 
> Right now I am working on the clustering component.  Getting both pieces
> operational is going to take a long time. I'm not making any promises on
> the timing.  The clustering component is actually the easier piece to do.
> 
> -Matt
> 
> 
> On Wed, Mar 25, 2015 at 3:12 AM, PeerCorps Trust Fund <
> ipc at peercorpstrust.org> wrote:
> 
>> Hi,
>>
>> If I understand the HAMMER2 design documents, one of the benefits that it
>> brings is the ability to rebuild a failed disk using multiple networked
>> mirrors? It seems that it also uses this capability to provide data healing
>> in the event of corruption.
>>
>> If this is the case, are these processes transparent to the user based on
>> some pre-defined failover configuration, or must they be manually set off
>> in the event of a disk failure/corruption?
>>
>> Also, would RAID controllers still be necessary in the independent nodes
>> if there is sufficient and reliable remote replication? Or could a HAMMER2
>> filesystem span the disks in a particular node and have the redundancy of
>> the remote replication provide features that otherwise would come from a
>> RAID controller?
>>
>> Thanks for any clarifying statements on the above!
>>
>> --
>> Mike
>>
>
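
For readers who want the quorum arithmetic from Matt's reply spelled out, here is a minimal sketch of the 3-master / 2-slave scenario. It is not HAMMER2 code: the `Node` and `Cluster` classes, the `sync_tid` field, and the "promote the most synchronized slave" rule are all hypothetical names and assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str            # "master" or "slave"
    alive: bool = True
    sync_tid: int = 0    # hypothetical: how far this node has synchronized


@dataclass
class Cluster:
    nodes: list
    target_masters: int = 3   # the configured number of masters

    def quorum(self):
        # Majority of the configured masters: 3 masters -> quorum of 2.
        return self.target_masters // 2 + 1

    def live_masters(self):
        return [n for n in self.nodes if n.role == "master" and n.alive]

    def has_quorum(self):
        return len(self.live_masters()) >= self.quorum()

    def fail(self, name):
        # Mark the named node as failed; it stays in the list until replaced.
        for n in self.nodes:
            if n.name == name:
                n.alive = False

    def promote_spare(self):
        # Assumption: pick the live slave that is furthest along in sync.
        if len(self.live_masters()) >= self.target_masters:
            return None
        spares = [n for n in self.nodes if n.role == "slave" and n.alive]
        if not spares:
            return None
        best = max(spares, key=lambda n: n.sync_tid)
        best.role = "master"
        return best


def example_cluster():
    return Cluster(nodes=[
        Node("m1", "master", sync_tid=100),
        Node("m2", "master", sync_tid=100),
        Node("m3", "master", sync_tid=100),
        Node("s1", "slave", sync_tid=90),
        Node("s2", "slave", sync_tid=97),
    ])


if __name__ == "__main__":
    c = example_cluster()
    c.fail("m2")
    print("quorum:", c.quorum(), "still operating:", c.has_quorum())
    promoted = c.promote_spare()
    print("promoted:", promoted.name, "live masters:", len(c.live_masters()))
```

With one master down the cluster still has the 2 live masters it needs, and promoting a slave brings it back to 3, which is why the timing of the physical replacement becomes less critical.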