[GSOC] HAMMER2 compression feature week5 report

Daniel Flores daniel5555 at gmail.com
Sat Jul 20 10:35:16 PDT 2013


Hello everyone,

Here is my report for week 5. First of all, at the start of the week, a
lot of bugs in the read and write paths were fixed with invaluable help
from my mentor. A bit later, another small bug, which affected files that
couldn't be compressed, was fixed as well.

As a result, the compression/decompression feature using the LZ4
algorithm finally started to work. There is still a bug, though, which
manifests itself when large files are decompressed: sometimes the
decompressed file isn't identical to the original, even though the same
file decompresses correctly after remounting the HAMMER2 partition. Since
the on-media data is evidently correct, the bug must be somewhere in the
read path. However, the feature does work correctly in many cases, which
is encouraging. It's not ready for use yet, and there is a lot of testing
and bug hunting to be done next week and possibly beyond. It's very
important for a file system and all its features to be rock-solid, so
there will be a lot of exhaustive testing.

Another thing I did was optimize both the read and write paths. When I
first created them, my main goal was simply to get them working, so they
were highly inefficient. To give an example, when I measured the
performance, the write path with LZ4 compression was approximately 5.6
times slower than the path without compression, and the read path with
decompression was approximately 3 times slower. It wasn't LZ4 that caused
this, but all the intermediary buffers I used so generously.

I have now gotten rid of the intermediary buffer in the read path: it
decompresses directly from the physical buffer into the logical buffer.
As a result, the read path with decompression is now as fast as the read
path without it, or at least the difference is virtually unnoticeable.
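
To illustrate the idea, here is a minimal sketch of such a direct
decompression step, assuming the stock LZ4_decompress_safe() interface;
the buffer names and the helper itself are illustrative, not the actual
HAMMER2 code:

    #include <strings.h>
    #include "lz4.h"

    /*
     * Decompress a physical (on-media) buffer straight into the logical
     * buffer handed to the reader, with no intermediary copy.
     */
    static int
    read_decompress_direct(const char *pbuf, int clen,
                           char *lbuf, int lsize)
    {
            int result;

            /* LZ4_decompress_safe() writes its output directly into lbuf. */
            result = LZ4_decompress_safe(pbuf, lbuf, clen, lsize);
            if (result < 0)
                    return (-1);    /* corrupt or truncated input */

            /* Zero-fill the tail if the stored block was short. */
            if (result < lsize)
                    bzero(lbuf + result, lsize - result);
            return (0);
    }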

I couldn't get rid of the equivalent buffer in the write path yet, but it
is used more efficiently now, and as a result the write path with
compression is approximately 2 times slower than the write path without
it. This isn't very noticeable for small files, most of which are
compressed and written in less than a second, but it does make a
noticeable difference for big files. The main cause of the slowness seems
to be the use of the buffer rather than the compression itself; LZ4 is a
very efficient algorithm. If we find a way to get rid of that buffer, I
expect a large speedup.
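
For comparison, here is a hedged sketch of the write-side flow described
above, again with illustrative names rather than the real HAMMER2
routines, and using the current upstream LZ4 naming, which may differ
from the in-tree copy. It compresses into a temporary buffer sized for
the worst case and then copies the result out; that extra copy is the
overhead in question:

    #include <string.h>
    #include "lz4.h"

    /*
     * Compress a logical buffer into a temporary buffer, then copy the
     * smaller result into the physical buffer.  tmp must hold at least
     * LZ4_compressBound(lsize) bytes.  Returns the number of bytes to
     * write to media.
     */
    static int
    write_compress(const char *lbuf, int lsize, char *pbuf, char *tmp)
    {
            int clen;

            clen = LZ4_compress_default(lbuf, tmp, lsize,
                                        LZ4_compressBound(lsize));
            if (clen <= 0 || clen >= lsize) {
                    /* Incompressible block: store it as-is. */
                    memcpy(pbuf, lbuf, lsize);
                    return (lsize);
            }
            /* This copy out of tmp is the cost being discussed. */
            memcpy(pbuf, tmp, clen);
            return (clen);
    }

The temporary buffer exists because LZ4 output can be larger than its
input for incompressible data. One conceivable way to eliminate it would
be to compress directly into the physical buffer with a destination
capacity of one block: LZ4_compress_default() returns 0 when the output
would not fit, and the block could then be stored uncompressed instead.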

I'll post the exact timings later, when I have more results from
different test cases (most likely in next week's report).

Another thing implemented this week is zero-checking, so the default
option for HAMMER2 compression seems to be working now. There is still a
lot of testing to be done on it, so I can't guarantee that it works
correctly at this point, but the initial tests show successful results. I
also have to incorporate it into the second option, which is LZ4;
hopefully I'll do that today.
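
Zero-checking here presumably means detecting logical blocks that consist
entirely of zeros, so the write path can avoid allocating media space for
them. A minimal sketch of such a check, with an illustrative name and
under that assumption, could look like this:

    #include <stddef.h>

    /*
     * Return non-zero if the buffer contains only zeros.  Scans a word
     * at a time; assumes the buffer is long-aligned and its size is a
     * multiple of sizeof(long), which holds for file system blocks.
     */
    static int
    test_block_zeros(const char *buf, size_t bytes)
    {
            size_t i;

            for (i = 0; i < bytes; i += sizeof(long)) {
                    if (*(const long *)(buf + i) != 0)
                            return (0);
            }
            return (1);
    }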

Next week, and for the remaining part of this weekend, I'll be bug
hunting, testing all the implemented features with more complex cases,
and trying to optimize the write path further. There is also some work to
be done on the LZ4 source files, because they contain more functions than
we need.

I'd appreciate any comments, suggestions, and criticism. You can check my
work in my leaf repository, branch "hammer2_LZ4" [1].


Daniel

[1] git://leaf.dragonflybsd.org/~iostream/dragonfly.git