HAMMER update 23-jan-08

Wed Jan 30 10:30:17 PST 2008

:This is a nasty situation with NFS and we definitely will come up with a
:better way.  I am planning to develop a distributed file system (hopefully
:to be part of DragonFly), and I have some ideas, mainly involving public
:key cryptography.
:
:> Also, what authentication mechanism would be used across nodes in a
:> cluster: NIS, LDAP, or something else?
:
:I think the system should be designed in a way that not every node in the
:cluster needs to know about all authentication information.  It should be
:possible to establish trust relationships between machines (or specific
:users of these machines, of course).  Then a user id wouldn't be unique in
:itself, but would require a qualifier, describing where this user id
:originates from.  You'd see users like "corecode at chlamydia.fs.ei.tum.de"
:instead of only "corecode".  Authentication then would run as a part of
:the cluster protocol.  That's only my vision, though.  I don't think there
:is anything set in stone yet.
:
:cheers
:   simon

    Here's my take on it.  First, I think the remote access needs to have
    certain components integrated with the local filesystem itself, and
    part of my design of HAMMER took that into account.

    * Hammer stores uid's and gid's as UUID's.  Right now I just shim
      the standard 32 bit uid/gid into the uuid, but the on-disk structure
      is a 16-byte uuid.

    * Hammer's record store allows out-of-band management data to be
      associated with any given file or directory.  So, e.g. you would
      be able to associate encryption keys with data.

    * Right now "." and ".." are both synthesized (".." in particular),
      but ".." will eventually have to be implemented as a filesystem
      record to support NFS (NFS uses blind inode lookups and can lose
      topological information so there needs to be a way to recover your
      position in the filesystem topology given just an inode number).
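
    As a rough illustration of the first and third points above, here is
    a minimal Python sketch.  It is not HAMMER code.  The byte placement
    of the 32 bit id inside the 16-byte uuid, the record layout, and all
    of the names (uid_to_uuid, uuid_to_uid, parent_records,
    path_from_inode) are assumptions made up for the example.

```python
import uuid

def uid_to_uuid(uid: int) -> uuid.UUID:
    """Shim a 32 bit uid/gid into a 16-byte UUID.

    Assumption: the id simply occupies the low 32 bits and the rest is
    zero.  The real on-disk placement may differ.
    """
    if not 0 <= uid < 2**32:
        raise ValueError("uid must fit in 32 bits")
    return uuid.UUID(int=uid)

def uuid_to_uid(u: uuid.UUID) -> int:
    """Recover the 32 bit id from a UUID produced by uid_to_uuid()."""
    return u.int & 0xFFFFFFFF

# Hypothetical ".." records: inode -> (parent inode, name within parent).
ROOT_INODE = 1
parent_records = {
    2: (1, "usr"),
    3: (2, "src"),
    4: (3, "sys"),
}

def path_from_inode(inode: int, records: dict, root: int = ROOT_INODE) -> str:
    """Walk the parent records upward to rebuild a path from an inode number."""
    parts = []
    while inode != root:
        parent, name = records[inode]
        parts.append(name)
        inode = parent
    return "/" + "/".join(reversed(parts))
```

    With a stored parent record, a blind lookup of inode 4 (as NFS would
    do) can still be placed in the topology by calling
    path_from_inode(4, parent_records).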

    The main issue for a distributed encrypted filesystem is that you
    don't want to have to distribute the same private key to every client
    wishing to share the same dataset.  You want to revoke each client's
    access individually and not have one compromised client compromise
    the actual encryption keys used by other clients.

    A second major issue is that you may want to be able to provide
    storage for clients which otherwise do not wish to trust the server.

    And, finally, we might want to have a local key-pair to encrypt
    (or doubly-encrypt) the physical store itself, allowing either
    the client OR the server admin to 'destroy' the data by wiping their
    private key.

    For HAMMER, the solution is fairly straightforward.  We can associate
    a public key 'record' with any given file or directory to handle local
    encryption, and we can associate multiple public key records with any
    given file or directory, one for each client, to manage access rights
    for each client.  HAMMER itself would handle local encryption and
    decryption.  The client would then have two options.  It could install
    its private key to allow HAMMER to decrypt the per-client data (which
    exposes the client's private key to the server).  Or it could just ask
    HAMMER to decrypt its half using its local key and pass the
    client-encrypted data back to the client for decryption (which does
    not expose the client's private key to the server but also means the
    data cannot be shared with other users).
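
    To make the per-client key records concrete, here is a small Python
    sketch of the bookkeeping only.  No cryptography is performed and
    keys are treated as opaque bytes.  KeyRecord, FileEntry, grant,
    revoke and has_access are hypothetical names for this example, not
    HAMMER structures.

```python
from dataclasses import dataclass, field

@dataclass
class KeyRecord:
    owner: str         # "local" for the filesystem key, else a client identity
    public_key: bytes  # opaque key material (placeholder bytes in this sketch)

@dataclass
class FileEntry:
    name: str
    # Out-of-band records attached to this file or directory, keyed by owner.
    key_records: dict = field(default_factory=dict)

    def grant(self, owner: str, public_key: bytes) -> None:
        """Attach (or replace) one key record for the given owner."""
        self.key_records[owner] = KeyRecord(owner, public_key)

    def revoke(self, owner: str) -> bool:
        """Drop one owner's record.  Other owners are unaffected."""
        return self.key_records.pop(owner, None) is not None

    def has_access(self, owner: str) -> bool:
        return owner in self.key_records

# Example: one local key plus two clients, then revoke a single client.
entry = FileEntry("/data/shared")
entry.grant("local", b"local-public-key")
entry.grant("client-a", b"client-a-public-key")
entry.grant("client-b", b"client-b-public-key")
entry.revoke("client-a")
```

    Because each client has its own record, revoking client-a does not
    require touching the record for client-b or the local key.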

    This allows:

    * Locally encrypted filesystems with the admin able to destroy the
      filesystem by destroying the private key.

    * Remotely encrypted filesystems where the remote client encrypts
      the data with its public key and the server doubly encrypts it 
      with its local filesystem key.  The server is only able to decrypt
      its half and passes the data back to the client encrypted with
      the client's public key.  The client decrypts the data using its
      private key.

      And p.s. this means the transport protocol does not need to 
      re-encrypt the bulk data, since it is already encrypted.

      This would not be shareable unless the remote client opts to share
      its private key with other remote clients.

    * Remotely encrypted filesystems where the data IS shareable with other
      clients, by having the server treat the client key as a session key
      and only encrypt the actual data with the local filesystem key (or not
      at all).

      In this case the client(s) trust the server's management of the data
      but the server still stores an individual session key for each
      client's access rights, allowing them to be revoked on a
      client-by-client basis.
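
    The doubly-encrypted case in the second point can be sketched as two
    layers that are peeled off in reverse order.  This Python example
    uses a toy XOR keystream built from SHA-256 purely as a stand-in.  It
    is symmetric and NOT secure, whereas the scheme described above uses
    public/private key pairs.  toy_crypt and the key values are
    assumptions for illustration only.

```python
import hashlib

def _keystream(key: bytes, n: int) -> bytes:
    """Derive n pseudo-random bytes from key (toy construction, not secure)."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])

def toy_crypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream.  Applying it twice with the same key undoes it."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))

plaintext = b"hello hammer"
client_key = b"client-secret"     # stand-in for the client's key pair
server_key = b"server-local-key"  # stand-in for the local filesystem key

# Write path: the client encrypts first, then the server adds its own layer.
client_ct = toy_crypt(client_key, plaintext)
on_disk = toy_crypt(server_key, client_ct)

# Read path: the server removes only its half and returns client-encrypted data.
returned = toy_crypt(server_key, on_disk)

# Only the client can remove the remaining layer.
recovered = toy_crypt(client_key, returned)
```

    The server never sees the plaintext in this flow, and the bytes it
    hands back are already encrypted, which is why the transport would
    not need to re-encrypt the bulk data.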

    That's my take.

						-Matt