Real World DragonFlyBSD Hammer DeDup figures from HiFX - Reclaiming more than 1/4th ( 30% ) Disk Space from an Almost Full Drive

Siju George sgeorge.ml at gmail.com
Tue Jul 19 03:55:39 PDT 2011


Hi,

Finally I got free after a long busy season to work on my DragonFlyBSD
Backup Servers.
One of the Backup Servers holds around 10 years of Company Archives.

Short Summary before dedup of first Hard Disk

Filesystem                Size   Used  Avail Capacity  Mounted on
Backup1                   454G   451G   2.8G    99%    /Backup1

Short Summary after dedup of first Hard Disk

Filesystem                Size   Used  Avail Capacity  Mounted on
/Backup1/pfs/@@-1:00001   454G   313G   141G    69%    /Backup1/Data

Reclaimed 138 GB, i.e. 30% of Disk space, without deleting anything or
considerably affecting the performance of the Server.

Full Story:

The first backup server ran Debian Sarge, then Debian Etch and then
OpenBSD with RAIDFRAME mirrors because it was the only Unix/Linux that
would even detect the 120 GB hard disks we had back then.
Later I turned to DragonFlyBSD due to HAMMER ( No fsck, No RAID Parity
checks and Easy FS Snapshots ).
So this DragonFly backup server holds around 10 years of backups of

1) Web files of Projects ( html, php, images etc )

2) SQL dumps, both zipped and unzipped. Hammer snapshots gave me the
luxury to do

http://www.dragonflybsd.org/docs/real_time_backup_server_for_microsoft_windows__44___linux__44___bsd_and_mac_os_x_clients/

But now we have SQL dumps of individual databases taken every hour and
made available to the developers using snapshots in the same manner
:-)

3) MS Word, Excel Doc files - Company documents and User backups

4) PSD files and such from Designers, which take a large amount of space.

5) Git, SVN repositories backup

6) Virtual Machine images ( mostly qcow2 )

7) Configuration files of several servers and other details backed up
daily/hourly or sometimes every 15 minutes and maintained with coarse
grained snapshots without pruning.

8) Several Software packages and CD ISO images

9) Video/Audio files such as mp3, avi, flv, mpg and so on.


The OS version currently is

DragonFly v2.11.0.247.gda17d9-DEVELOPMENT

Processor is

AMD Athlon(tm) 64 Processor 3400+ (2193.63-MHz 686-class CPU)

Memory is

real memory  = 2113336320 (2015 MB)
avail memory = 2029342720 (1935 MB)

with four 500GB SATA Disks mirroring PFSes from each other and also from
another DragonFly Backup Server on a different floor using
'mirror-stream' started at boot using cron with an entry similar to

@reboot /sbin/hammer mirror-stream /Backup1/Data /Backup2/Data &
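
With several PFS pairs to stream, the crontab entries can be generated
from a list instead of typed by hand. The following is only a sketch:
the function name cron_entries is made up for illustration, and the
second pair is a placeholder, only the first pair comes from the entry
above.

```shell
#!/bin/sh
# Sketch: print one '@reboot' crontab line per master/slave PFS pair.
# Input: one pair per line, "<master PFS> <slave PFS>".
# Nothing is executed; the lines are meant to be reviewed and then
# pasted into the crontab.
cron_entries() {
    while read -r master slave; do
        # Skip empty lines in the pair list.
        [ -n "$master" ] || continue
        printf '@reboot /sbin/hammer mirror-stream %s %s &\n' "$master" "$slave"
    done
}

# First pair is from the post; the second is a hypothetical example.
cron_entries <<EOF
/Backup1/Data /Backup2/Data
/Backup1/VersionControl /Backup2/VersionControl
EOF
```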


I have never reinstalled the OS but kept following the development
version from July 2009 so that is two years of rolling release which
is a great advantage in itself :-)

The first Disk is mounted as /Backup1 and seems to be a good Candidate
for dedup because it is almost full.

======================================================================================
Filesystem                Size   Used  Avail Capacity  Mounted on
Backup1                   454G   451G   2.8G    99%    /Backup1
/Backup1/pfs/@@-1:00001   454G   451G   2.8G    99%    /Backup1/Data
/Backup1/pfs/@@-1:00009   454G   451G   2.8G    99%    /Backup1/pkgsrc
/Backup1/pfs/@@-1:00002   454G   451G   2.8G    99%    /Backup1/VersionControl
/Backup1/pfs/@@-1:00003   454G   451G   2.8G    99%    /Backup1/test
/Backup1/pfs/@@-1:00005   454G   451G   2.8G    99%    /Backup1/www-5mbak/www-hot
/Backup1/pfs/@@-1:00006   454G   451G   2.8G    99%    /Backup1/mysql-1hbak/mysql-hot
/Backup1/pfs/@@-1:00007   454G   451G   2.8G    99%    /Backup1/project-docs-bak/project-docs
=======================================================================================

Full Details below.

=========================================================

        Label               Backup1
        No. Volumes         1
        FSID                e182...............................................
        HAMMER Version      4
Big block information
        Total           58140
        Used            57713 (99.27%)
        Reserved           69 (0.12%)
        Free              358 (0.62%)
Space information
        No. Inodes   11350364
        Total size       454G (487713669120 bytes)
        Used             451G (99.27%)
        Reserved         552M (0.12%)
        Free             2.8G (0.62%)
PFS information
        PFS ID  Mode    Snaps  Mounted on
             0  MASTER      0  /Backup1
             1  MASTER      0  /Backup1/Data
             2  MASTER      0  /Backup1/VersionControl
             3  MASTER      0  /Backup1/test
             5  MASTER      0  /Backup1/www-5mbak/www-hot
             6  MASTER      0  /Backup1/mysql-1hbak/mysql-hot
             7  MASTER      0  /Backup1/project-docs-bak/project-docs
             9  MASTER      0  /Backup1/pkgsrc
==========================================================
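
The big block counts and the byte sizes above are the same numbers in
different units. A small sketch of the conversion, assuming the HAMMER
big block size of 8 MiB (the helper name bigblocks_to_bytes is made up
for illustration):

```shell
#!/bin/sh
# Sketch: convert HAMMER big block counts to bytes.
# Assumes 8 MiB per big block; needs a shell with 64-bit arithmetic.
BIGBLOCK_BYTES=$((8 * 1024 * 1024))

bigblocks_to_bytes() {
    echo $(( $1 * BIGBLOCK_BYTES ))
}

bigblocks_to_bytes 58140   # total: 487713669120, the "Total size" above
bigblocks_to_bytes 358     # free:  3003121664, about 2.8 GiB
```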


De Duping Steps Taken:
----------------------------------


1) Version Upgrading from 4 to 6.

=================================
dfly-bkpsrv# hammer version-upgrade /Backup1 5
hammer version-upgrade: succeeded
dfly-bkpsrv# hammer version-upgrade /Backup1 6
hammer version-upgrade: succeeded
=================================
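
The upgrade above was done one version at a time. Going by hammer(8),
'hammer version /Backup1' should show the current version beforehand,
and dedup is listed as needing version 5 or later. The sketch below
only prints the commands for a stepwise upgrade and runs nothing; the
function name upgrade_steps is made up for illustration, and whether
the intermediate step is strictly required is not something this post
establishes.

```shell
#!/bin/sh
# Sketch: print the 'hammer version-upgrade' commands needed to go from
# one version to another, one step at a time. Nothing is executed.
upgrade_steps() {
    fs=$1
    from=$2
    to=$3
    v=$((from + 1))
    while [ "$v" -le "$to" ]; do
        echo "hammer version-upgrade $fs $v"
        v=$((v + 1))
    done
}

upgrade_steps /Backup1 4 6
# prints:
#   hammer version-upgrade /Backup1 5
#   hammer version-upgrade /Backup1 6
```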

2) Simulating using 'dedup-simulate' to get an idea.

=====================================================================================

dfly-bkpsrv# hammer dedup-simulate /Backup1
Dedup-simulate /Backup1: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 0
Dedup-simulate /Backup1 succeeded
Simulated dedup ratio = 1.07

dfly-bkpsrv# hammer dedup-simulate /Backup1/Data
Dedup-simulate /Backup1/Data: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 1
Dedup-simulate /Backup1/Data succeeded
Simulated dedup ratio = 1.34

dfly-bkpsrv# hammer dedup-simulate /Backup1/pkgsrc
Dedup-simulate /Backup1/pkgsrc: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 9
Dedup-simulate /Backup1/pkgsrc succeeded
Simulated dedup ratio = 1.10

dfly-bkpsrv# hammer dedup-simulate /Backup1/VersionControl
Dedup-simulate /Backup1/VersionControl: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 2
Dedup-simulate /Backup1/VersionControl succeeded
Simulated dedup ratio = 2.79

dfly-bkpsrv# hammer dedup-simulate /Backup1/test
Dedup-simulate /Backup1/test: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 3
Dedup-simulate /Backup1/test succeeded
Simulated dedup ratio = 0.00

dfly-bkpsrv# hammer dedup-simulate /Backup1/www-5mbak/www-hot
Dedup-simulate /Backup1/www-5mbak/www-hot: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 5
Dedup-simulate /Backup1/www-5mbak/www-hot succeeded
Simulated dedup ratio = 1.39

dfly-bkpsrv# hammer dedup-simulate /Backup1/mysql-1hbak/mysql-hot
Dedup-simulate /Backup1/mysql-1hbak/mysql-hot: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 6
Dedup-simulate /Backup1/mysql-1hbak/mysql-hot succeeded
Simulated dedup ratio = 13.78

dfly-bkpsrv# hammer dedup-simulate /Backup1/project-docs-bak/project-docs
Dedup-simulate /Backup1/project-docs-bak/project-docs: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 7
Dedup-simulate /Backup1/project-docs-bak/project-docs succeeded
Simulated dedup ratio = 1.15

===================================================================================================
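
The eight simulate runs above can also be driven from a loop that keeps
only the ratio line. This is a sketch that assumes the output format
shown above; the function name simulate_all and the HAMMER override
variable are made up for illustration.

```shell
#!/bin/sh
# Sketch: run 'hammer dedup-simulate' on each PFS given as an argument
# and print "<PFS> <ratio>" per line.
# HAMMER can be pointed at a stub to try the loop without a HAMMER volume.
HAMMER="${HAMMER:-hammer}"

simulate_all() {
    for pfs in "$@"; do
        # Pick the number out of the "Simulated dedup ratio = N" line.
        ratio=$($HAMMER dedup-simulate "$pfs" |
            awk -F' = ' '/Simulated dedup ratio/ { print $2 }')
        printf '%-45s %s\n' "$pfs" "$ratio"
    done
}

# Example call (on a machine with the PFSes from this post):
#   simulate_all /Backup1 /Backup1/Data /Backup1/VersionControl
```

A PFS with a ratio close to 1.00 has little to gain, so a listing like
this helps decide where a real dedup run is worth the time.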

3) Real 'de-dup' of the Mother File System and all PFSes

=======================================================================

dfly-bkpsrv# hammer dedup /Backup1
Dedup /Backup1: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 0
Dedup /Backup1 succeeded
Dedup ratio = 1.07
      625 MB referenced
      585 MB allocated
      224 KB skipped
           0 CRC collisions
           0 SHA collisions
           0 bigblock underflows

dfly-bkpsrv# hammer dedup /Backup1/Data
Dedup /Backup1/Data: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 1
Dedup /Backup1/Data succeeded
Dedup ratio = 1.34
      259 GB referenced
      193 GB allocated
       40 MB skipped
        1944 CRC collisions
           0 SHA collisions
          20 bigblock underflows

dfly-bkpsrv# hammer dedup /Backup1/pkgsrc
Dedup /Backup1/pkgsrc: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 9
Dedup /Backup1/pkgsrc succeeded
Dedup ratio = 1.10
     1687 MB referenced
     1539 MB allocated
     1718 KB skipped
           3 CRC collisions
           0 SHA collisions
           0 bigblock underflows

dfly-bkpsrv# hammer dedup /Backup1/VersionControl
Dedup /Backup1/VersionControl: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 2
Dedup /Backup1/VersionControl succeeded
Dedup ratio = 2.75
      160 MB referenced
       58 MB allocated
      853 KB skipped
           0 CRC collisions
           0 SHA collisions
           0 bigblock underflows

dfly-bkpsrv# hammer dedup /Backup1/test
Dedup /Backup1/test: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 3
Dedup /Backup1/test succeeded
Dedup ratio = 0.00
         0 B referenced
         0 B allocated
         0 B skipped
           0 CRC collisions
           0 SHA collisions
           0 bigblock underflows

dfly-bkpsrv# hammer dedup /Backup1/www-5mbak/www-hot
Dedup /Backup1/www-5mbak/www-hot: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 5
Dedup /Backup1/www-5mbak/www-hot succeeded
Dedup ratio = 1.39
       50 GB referenced
       36 GB allocated
       53 MB skipped
         167 CRC collisions
           0 SHA collisions
           0 bigblock underflows

dfly-bkpsrv# hammer dedup /Backup1/mysql-1hbak/mysql-hot
Dedup /Backup1/mysql-1hbak/mysql-hot: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 6
Dedup /Backup1/mysql-1hbak/mysql-hot succeeded
Dedup ratio = 13.78
      117 GB referenced
     8747 MB allocated
         0 B skipped
           0 CRC collisions
           0 SHA collisions
           0 bigblock underflows

dfly-bkpsrv# hammer dedup /Backup1/project-docs-bak/project-docs
Dedup /Backup1/project-docs-bak/project-docs: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 7
Dedup /Backup1/project-docs-bak/project-docs succeeded
Dedup ratio = 1.15
      247 MB referenced
      215 MB allocated
      102 KB skipped
           0 CRC collisions
           0 SHA collisions
           0 bigblock underflows
=================================================================================================
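
The reported ratios are consistent with "referenced" divided by
"allocated". A sketch of that arithmetic, with a made-up helper name
dedup_ratio; both arguments must be in the same unit.

```shell
#!/bin/sh
# Sketch: dedup ratio as referenced / allocated (same unit for both).
# Prints 0.00 when nothing is allocated, as for the empty test PFS.
dedup_ratio() {
    awk -v ref="$1" -v alloc="$2" \
        'BEGIN { printf "%.2f\n", (alloc > 0) ? ref / alloc : 0 }'
}

dedup_ratio 259 193   # /Backup1/Data in GB -> 1.34
dedup_ratio 625 585   # /Backup1 in MB      -> 1.07
dedup_ratio 0 0       # /Backup1/test       -> 0.00
```

Because the sizes are printed rounded, not every ratio can be
reproduced exactly from them: 160 MB / 58 MB gives 2.76 against the
reported 2.75.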

Now after de-duping all PFSes on the First Disk, 'df -h' gives these details

Filesystem                Size   Used  Avail Capacity  Mounted on
/Backup1/pfs/@@-1:00001   454G   313G   141G    69%    /Backup1/Data

Before de-duping it was

Filesystem                Size   Used  Avail Capacity  Mounted on
Backup1                   454G   451G   2.8G    99%    /Backup1

So that is reclaiming 30% of Disk space amounting to 138 GB :-)
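
The arithmetic behind those figures, as a small sketch working from the
rounded 'df -h' values (the helper name reclaimed is made up for
illustration):

```shell
#!/bin/sh
# Sketch: space reclaimed, from 'df -h' figures in GB.
# Arguments: <size> <used before> <used after>
reclaimed() {
    awk -v size="$1" -v before="$2" -v after="$3" 'BEGIN {
        saved = before - after
        printf "%d GB reclaimed, %.1f%% of %d GB\n", saved, 100 * saved / size, size
    }'
}

reclaimed 454 451 313   # -> 138 GB reclaimed, 30.4% of 454 GB
```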

Careful configuration and design of PFSes and snapshots can save a lot of Disk space.
But de-dup can save still more :-)


In order to 'de-dup' the file system automatically every day using
'hammer cleanup' in the periodic script, I have put something like
this in the configuration files for the PFSes.

=============================================
dfly-bkpsrv# hammer config /Backup1/VersionControl/
snapshots 1d 1000d
prune     1d 15m
rebalance 1d 5m
reblock   1d 60m
recopy    30d 60m
dedup     1d 30m
==============================================
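
Going by hammer(8), the first field on each line is how often the task
runs and, for dedup, the second is the maximum run time, so the line
above means once a day for at most 30 minutes. To add such a line to
many PFSes, a filter like the one below could be used. It is only a
sketch: the function name ensure_dedup and the temporary file path are
made up, and it assumes that 'hammer config' with one argument dumps
the configuration and with two arguments installs a new one.

```shell
#!/bin/sh
# Sketch: read a PFS configuration on stdin and make sure it contains
# a 'dedup' line. An existing dedup line is left untouched; otherwise
# one is appended with the given period and run time.
ensure_dedup() {
    awk -v line="dedup     ${1:-1d} ${2:-30m}" '
        $1 == "dedup" { found = 1 }
        { print }
        END { if (!found) print line }
    '
}

# Example use (review the file before installing it):
#   hammer config /Backup1/Data | ensure_dedup 1d 30m > /tmp/Data.conf
#   hammer config /Backup1/Data /tmp/Data.conf
```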

A million thanks to Matt and team for DragonFly, Hammer, de-dup,
vkernel and a lot of other goodies coming up :-D

Thanks and Regards

--Siju




