[GSOC] Introduction to HAMMER2 compression feature

Daniel Flores daniel5555 at gmail.com
Wed Jun 12 08:13:44 PDT 2013


Hello everyone,

My name is Daniel Flores and my proposal called “HAMMER2 compression
feature” was accepted for this year’s Google Summer of Code. I already
posted the draft of my proposal [1] in this kernel list, so I will not
repeat much of it, but instead I want to focus on some design decisions
that I’ve already made. Since I’m an inexperienced developer at this point,
I’d be happy to receive your suggestions and criticism.

The feature itself consists in that it attempts to compress a HAMMER2
block, which is of 64KB in size. The result should be a block of 32KB,
16KB, 8KB, etc. in size.

Currently I’m searching for the algorithms that are the most appropriate
for a file system. Particularly I’m searching for algorithms that are very
fast; don’t require a lot of memory and processing power and offer fairly
good compression rate (but not necessarily the best compression rate out of
all). Right now I have two candidates – DEFLATE [2] and LZO [3]. Both of
them have some available implementations which I intend to use, as Alex
suggested in his review of my proposal.

DEFLATE seems to be a good choice, because it works with small amounts of
data and has a sliding window of 32KB – just nice for a 64KB block. It is
based on another algorithm – LZ77, which is successfully used in
compression feature for NTFS, so hopefully DEFLATE would be good as well.

LZO seems to be a good choice, because, similarly, it works on small
amounts of data, it is as fast as DEFLATE and was specifically designed to
have a very fast decompression speed.

My initial proposal only covers the implementation of one algorithm, but if
the time allows, I would try to add both of them and maybe add some more.
The design will be modular, so it will not be difficult to change the
algorithm.

My initial strategy, as I already mentioned in my proposal, is to develop,
first, a prototype application to test those algorithms. The application
would take a file, break it in 64KB blocks, compress them one by one, and
then decompress them. After that the resulting uncompressed blocks would be
compared with the original blocks and if they are identical, then it would
mean that the algorithm works correctly.

I will try to compare the algorithms in terms of efficiency and speed, and
choose the best one. I also expect that I’ll have to modify a bit the
implementations, because most likely they won’t produce a block of an exact
size as required, so I would have to add some padding to it.

After the exhaustive testing with the prototype application, I will add the
compression feature in HAMMER2 itself and do real-life testing. The utility
to set up the compression feature would also be done at this point. By the
end of summer, the compression feature should be fully and correctly
implemented and ready for use.


Daniel

[1] – http://lists.dragonflybsd.org/pipermail/kernel/2013-April/031228.html
[2] – http://en.wikipedia.org/wiki/DEFLATE
[3] – http://en.wikipedia.org/wiki/LZO
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.dragonflybsd.org/pipermail/kernel/attachments/20130612/ef1aa7d6/attachment-0001.htm>


More information about the Kernel mailing list