<div dir="ltr"><div>You should use a database, frankly. A HAMMER1 inode is 128 bytes and for small files I think the data will run on 16-byte boundaries. Not sure of that. Be sure to mount with 'noatime', and also use the double buffer option because the kernel generally can't cache that many tiny files itself.<br><br></div><div>The main issue with using millions of tiny files is that each one imposes a great deal of *ram* overhead for caching, since each one needs an in-memory vnode, in-memory inode, and all related file tracking infrastructure.<br><br></div><div>Secondarily, hammer's I/O optimizations are designed for large files, not small files, so the I/O is going to be a lot more random.<br></div><div><br></div>-Matt<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jul 15, 2015 at 8:58 AM, Michael Neumann <span dir="ltr"><<a href="mailto:mneumann@ntecs.de" target="_blank">mneumann@ntecs.de</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
On Wed, Jul 15, 2015 at 8:58 AM, Michael Neumann <mneumann@ntecs.de> wrote:

Hi,
Let's say I want to store 100 million small files (each about 1 KB in size) in a HAMMER file system.
Files are only written once, then kept unmodified and accessed randomly (older files will be accessed less often).
It is basically a simple file-based key/value store, but accessible by multiple processes.

a) What is the per-file size overhead for HAMMER1? For HAMMER2 I expect each file to take exactly 1 KB when the file is below 512 bytes.

b) Can I store all the files in one huge directory, or is it better to fan them out into several sub-directories?
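A minimal sketch of the usual fan-out approach; the helper below is hypothetical, and the two-level, two-character split is an arbitrary choice rather than anything HAMMER requires:

    #include <stdio.h>
    #include <string.h>

    /*
     * Fan a key out over two directory levels using its first four
     * characters, e.g. key "deadbeef" -> <base>/de/ad/deadbeef.
     */
    static int
    key_to_path(const char *base, const char *key, char *buf, size_t buflen)
    {
        int n;

        if (strlen(key) < 4)
            return (-1);            /* too short to fan out */
        n = snprintf(buf, buflen, "%s/%.2s/%.2s/%s", base, key, key + 2, key);
        return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
    }

With hex keys this caps each level at 256 subdirectories; whether that helps the filesystem's own lookups is a separate question, but it keeps userland tools (ls, rsync, backup scripts) manageable.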
c) What other issues should I expect to run into? For sure I should enable swapcache :)
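For reference, a minimal swapcache sketch, assuming a dedicated SSD swap partition; the vm.swapcache.* sysctl names are from memory and should be checked against swapcache(8):

    # /etc/fstab: SSD swap partition for swapcache (device name is only an example)
    /dev/da1s1b     none    swap    sw      0 0

    # /etc/sysctl.conf: cache filesystem metadata, and optionally file data, on the SSD
    vm.swapcache.meta_enable=1
    vm.swapcache.data_enable=1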
I probably should use a "real" database like LMDB, but I like the versatility of files.
Regards,

Michael