Hammer deduplication needs for RAM size
Matthew Dillon
dillon at apollo.backplane.com
Fri Apr 22 13:14:20 PDT 2011
:Hi all,
:
:can someone compare/describe need of RAM size by deduplication in
:Hammer? There's something interesting about deduplication in ZFS
:http://openindiana.org/pipermail/openindiana-discuss/2011-April/003574.html
:
:Thx
The RAM is mainly needed to store matching CRCs. The on-line dedup
uses a limited, fixed-size hash table to remember CRCs, designed to
match recently read data with data written soon afterwards (e.g. 'cp').
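
To illustrate the idea of a limited, fixed-size CRC table, here is a minimal
sketch in Python. It is not HAMMER's kernel code: the class name, the slot
count, the direct-mapped overwrite policy and the use of zlib.crc32 are all
assumptions made for the example. The point is only that memory use stays
bounded because a new CRC replaces whatever occupied its slot.

```python
import zlib


class RecentCrcTable:
    """Hypothetical fixed-size table of recently seen data CRCs.

    Assumption: direct-mapped, so a new entry simply overwrites the
    previous occupant of its slot. Memory use never grows.
    """

    def __init__(self, slots=1024):
        self.slots = [None] * slots

    def remember(self, data, location):
        # Record the CRC of data that was just read, with where it lives.
        crc = zlib.crc32(data)
        self.slots[crc % len(self.slots)] = (crc, location)

    def lookup(self, data):
        # On a write, check whether matching data was seen recently.
        crc = zlib.crc32(data)
        entry = self.slots[crc % len(self.slots)]
        if entry is not None and entry[0] == crc:
            return entry[1]  # candidate only; the data must still be compared
        return None
```

A lookup hit is only a candidate match, since different data can share a CRC.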
The off-line dedup (when you run 'hammer dedup ...' or
'hammer dedup-simulate ...') will keep track of ALL data CRCs when
it scans the filesystem B-Tree. It will happily use lots of swap
space if it comes down to it, which is probably a bug. But that's
how it works now.
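
Because the off-line pass tracks one CRC per data block, its memory use
grows with the amount of data scanned. The following back-of-the-envelope
sketch shows the arithmetic. The function name is hypothetical, and the
block size and per-entry overhead in the usage lines are illustrative
assumptions, not measured HAMMER figures.

```python
def estimate_crc_table_bytes(data_bytes, block_bytes, entry_bytes):
    """Rough memory estimate for tracking one CRC entry per data block.

    All three parameters are caller-supplied assumptions.
    """
    blocks = -(-data_bytes // block_bytes)  # ceiling division
    return blocks * entry_bytes


# Assumed figures: 1 TiB of data, 64 KiB blocks, 64 bytes per tracked entry.
estimate = estimate_crc_table_bytes(2**40, 64 * 2**10, 64)
print(estimate / 2**30, "GiB")
```

With those assumed figures the estimate comes to 1 GiB, which shows how a
full scan of a large filesystem can spill into swap.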
Actual file data is not persistently cached in memory. It is read only
when the dedup locates a potential match and sticks around in a limited
cache before getting thrown away, and will be re-read as needed.
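
The pattern described above (compare CRCs first, read the actual data only
for potential matches, and keep that data in a small cache) can be sketched
as follows. This is an illustration under assumptions, not the real
implementation: BlockCache, find_duplicates, the LRU eviction policy and the
capacity are all made up for the example.

```python
from collections import OrderedDict


class BlockCache:
    """Hypothetical small cache of block data; evicted blocks are re-read."""

    def __init__(self, read_block, capacity=2):
        self.read_block = read_block  # callable: offset -> bytes
        self.capacity = capacity
        self.cache = OrderedDict()
        self.reads = 0  # number of real reads, for illustration

    def get(self, offset):
        if offset in self.cache:
            self.cache.move_to_end(offset)  # mark as recently used
            return self.cache[offset]
        data = self.read_block(offset)
        self.reads += 1
        self.cache[offset] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # throw away the oldest entry
        return data


def find_duplicates(crc_by_offset, cache):
    """Return (first_offset, duplicate_offset) pairs whose data really match."""
    seen = {}
    dups = []
    for offset, crc in crc_by_offset:
        first = seen.setdefault(crc, offset)
        # Data is read only when the CRCs match, then compared byte for byte.
        if first != offset and cache.get(first) == cache.get(offset):
            dups.append((first, offset))
    return dups
```

Blocks whose CRC never matches anything are never read at all, so only the
CRC bookkeeping, not the file data, has to stay in memory.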
-Matt
Matthew Dillon
<dillon at backplane.com>