Real World DragonFlyBSD Hammer DeDup figures from HiFX - Reclaiming more than 1/4th ( 30% ) Disk Space from an Almost Full Drive
Dean Hamstead
dean at fragfest.com.au
Tue Jul 19 05:51:24 PDT 2011
I would be interested to see how this compares to other deduplication implementations.
D
On 19/07/2011, at 9:06 PM, Siju George <sgeorge.ml at gmail.com> wrote:
> Some copy-paste mistakes in the first one. Here is the updated one.
>
> Hi,
>
> Finally I got free after a long busy season to work on my DragonFlyBSD
> Backup Servers.
> One of the Backup Servers has around 10 years of Company Archives.
>
> Short Summary before dedup of first Hard Disk
>
> Filesystem Size Used Avail Capacity Mounted on
> Backup1 454G 451G 2.8G 99% /Backup1
>
> Short Summary after dedup of first Hard Disk
>
> Filesystem Size Used Avail Capacity Mounted on
> Backup1 454G 313G 141G 69% /Backup1
>
> Reclaimed 138 GB, i.e. 30% of disk space, without deleting anything or
> considerably affecting the performance of the Server.
>
> Full Story:
>
> The first backup server was Debian Sarge, then Debian Etch and then
> OpenBSD with RAIDFRAME mirrors because it was the only Unix/Linux that
> would even detect the 120 GB hard disks we had back then.
> Later I turned to DragonFlyBSD due to HAMMER ( No fsck, No RAID Parity
> checks and Easy FS Snapshots ).
> So this Dragonfly backup server holds around 10 years of backups of
>
> 1) Web files of Projects ( html, php, images etc )
>
> 2) SQL dumps, both zipped and unzipped. Hammer snapshots gave me the
> luxury to do
>
> http://www.dragonflybsd.org/docs/real_time_backup_server_for_microsoft_windows__44___linux__44___bsd_and_mac_os_x_clients/
>
> But now we have SQL dumps of individual databases taken every hour and
> made available to the developers using snapshots in the same manner
> :-)
>
> 3) MS Word, Excel Doc files - Company documents and User backups
>
> 4) PSD files and such from Designers, which take a lot of space.
>
> 5) Git, SVN repositories backup
>
> 6) Virtual Machine images ( mostly qcow2 )
>
> 7) Configuration files of several servers and other details backed up
> daily/hourly or sometimes every 15 minutes and maintained with coarse
> grained snapshots without pruning.
>
> 8) Several software packages and CD ISO images
>
> 9) Video/Audio files such as mp3, avi, flv, mpg and so on.
>
>
> The OS version currently is
>
> DragonFly v2.11.0.247.gda17d9-DEVELOPMENT
>
> Processor is
>
> AMD Athlon(tm) 64 Processor 3400+ (2193.63-MHz 686-class CPU)
>
> Memory is
>
> real memory = 2113336320 (2015 MB)
> avail memory = 2029342720 (1935 MB)
>
> with four 500GB SATA Disks mirroring PFS from each other and also from
> another Dragonfly Backup Server on a different floor using
> 'mirror-stream' started at boot using cron with an entry similar to
>
> @reboot /sbin/hammer mirror-stream /Backup1/Data /Backup2/Data &
>
>
> I have never reinstalled the OS but kept following the development
> version from July 2009 so that is two years of rolling release which
> is a great advantage in itself :-)
>
> The first Disk is mounted as /Backup1 and seems to be a good Candidate
> for dedup because it is almost full.
>
> ======================================================================================
> Filesystem               Size  Used  Avail  Capacity  Mounted on
> Backup1                  454G  451G  2.8G   99%       /Backup1
> /Backup1/pfs/@@-1:00001  454G  451G  2.8G   99%       /Backup1/Data
> /Backup1/pfs/@@-1:00009  454G  451G  2.8G   99%       /Backup1/pkgsrc
> /Backup1/pfs/@@-1:00002  454G  451G  2.8G   99%       /Backup1/VersionControl
> /Backup1/pfs/@@-1:00003  454G  451G  2.8G   99%       /Backup1/test
> /Backup1/pfs/@@-1:00005  454G  451G  2.8G   99%       /Backup1/www-5mbak/www-hot
> /Backup1/pfs/@@-1:00006  454G  451G  2.8G   99%       /Backup1/mysql-1hbak/mysql-hot
> /Backup1/pfs/@@-1:00007  454G  451G  2.8G   99%       /Backup1/project-docs-bak/project-docs
> =======================================================================================
>
> Full Details below.
>
> =========================================================
>
> Label Backup1
> No. Volumes 1
> FSID e182...............................................
> HAMMER Version 4
> Big block information
> Total 58140
> Used 57713 (99.27%)
> Reserved 69 (0.12%)
> Free 358 (0.62%)
> Space information
> No. Inodes 11350364
> Total size 454G (487713669120 bytes)
> Used 451G (99.27%)
> Reserved 552M (0.12%)
> Free 2.8G (0.62%)
> PFS information
> PFS ID Mode Snaps Mounted on
> 0 MASTER 0 /Backup1
> 1 MASTER 0 /Backup1/Data
> 2 MASTER 0 /Backup1/VersionControl
> 3 MASTER 0 /Backup1/test
> 5 MASTER 0 /Backup1/www-5mbak/www-hot
> 6 MASTER 0 /Backup1/mysql-1hbak/mysql-hot
> 7 MASTER 0 /Backup1/project-docs-bak/project-docs
> 9 MASTER 0 /Backup1/pkgsrc
> ==========================================================
>
>
> De Duping Steps Taken:
> ----------------------------------
>
>
> 1) Version Upgrading from 4 to 6.
>
> =================================
> dfly-bkpsrv# hammer version-upgrade /Backup1 5
> hammer version-upgrade: succeeded
> dfly-bkpsrv# hammer version-upgrade /Backup1 6
> hammer version-upgrade: succeeded
> =================================
>
> 2) Simulating using 'dedup-simulate' to get an idea.
>
> =====================================================================================
>
> dfly-bkpsrv# hammer dedup-simulate /Backup1
> Dedup-simulate /Backup1: objspace 8000000000000000:0000
> 7fffffffffffffff:ffff pfs_id 0
> Dedup-simulate /Backup1 succeeded
> Simulated dedup ratio = 1.07
>
> dfly-bkpsrv# hammer dedup-simulate /Backup1/Data
> Dedup-simulate /Backup1/Data: objspace 8000000000000000:0000
> 7fffffffffffffff:ffff pfs_id 1
> Dedup-simulate /Backup1/Data succeeded
> Simulated dedup ratio = 1.34
>
> dfly-bkpsrv# hammer dedup-simulate /Backup1/pkgsrc
> Dedup-simulate /Backup1/pkgsrc: objspace 8000000000000000:0000
> 7fffffffffffffff:ffff pfs_id 9
> Dedup-simulate /Backup1/pkgsrc succeeded
> Simulated dedup ratio = 1.10
>
> dfly-bkpsrv# hammer dedup-simulate /Backup1/VersionControl
> Dedup-simulate /Backup1/VersionControl: objspace 8000000000000000:0000
> 7fffffffffffffff:ffff pfs_id 2
> Dedup-simulate /Backup1/VersionControl succeeded
> Simulated dedup ratio = 2.79
>
> dfly-bkpsrv# hammer dedup-simulate /Backup1/test
> Dedup-simulate /Backup1/test: objspace 8000000000000000:0000
> 7fffffffffffffff:ffff pfs_id 3
> Dedup-simulate /Backup1/test succeeded
> Simulated dedup ratio = 0.00
>
> dfly-bkpsrv# hammer dedup-simulate /Backup1/www-5mbak/www-hot
> Dedup-simulate /Backup1/www-5mbak/www-hot: objspace
> 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 5
> Dedup-simulate /Backup1/www-5mbak/www-hot succeeded
> Simulated dedup ratio = 1.39
>
> dfly-bkpsrv# hammer dedup-simulate /Backup1/mysql-1hbak/mysql-hot
> Dedup-simulate /Backup1/mysql-1hbak/mysql-hot: objspace
> 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 6
> Dedup-simulate /Backup1/mysql-1hbak/mysql-hot succeeded
> Simulated dedup ratio = 13.78
>
> dfly-bkpsrv# hammer dedup-simulate /Backup1/project-docs-bak/project-docs
> Dedup-simulate /Backup1/project-docs-bak/project-docs: objspace
> 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 7
> Dedup-simulate /Backup1/project-docs-bak/project-docs succeeded
> Simulated dedup ratio = 1.15
>
> ===================================================================================================
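
A simulated ratio is easier to judge once it is turned into a share of space saved. The sketch below assumes the ratio means referenced data divided by allocated data, which is my reading rather than something the post states. The name `saving_fraction` is only illustrative, and the ratios are copied from the dedup-simulate runs above.

```python
# Ratios copied from the dedup-simulate runs quoted above.
SIMULATED = {
    "/Backup1": 1.07,
    "/Backup1/Data": 1.34,
    "/Backup1/pkgsrc": 1.10,
    "/Backup1/VersionControl": 2.79,
    "/Backup1/test": 0.00,
    "/Backup1/www-5mbak/www-hot": 1.39,
    "/Backup1/mysql-1hbak/mysql-hot": 13.78,
    "/Backup1/project-docs-bak/project-docs": 1.15,
}


def saving_fraction(ratio):
    """Share of referenced data that would not need its own blocks.

    Assumption: ratio = referenced / allocated, so the share still
    allocated is 1 / ratio.
    """
    if ratio <= 0:
        return 0.0  # the empty 'test' PFS reports 0.00
    return 1.0 - 1.0 / ratio


if __name__ == "__main__":
    # Print the PFSes with the best candidates for dedup first.
    ranked = sorted(SIMULATED.items(), key=lambda item: item[1], reverse=True)
    for path, ratio in ranked:
        print(f"{path:42s} ratio {ratio:5.2f}  saves about {saving_fraction(ratio):.0%}")
```

Under that assumption, the ratio of 13.78 for the hourly MySQL dumps corresponds to roughly 93% of the referenced data being duplicate, while 1.07 for the root file system corresponds to well under 10%.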
>
> 3) Real 'de-dup' of the Mother File System and all PFSes
>
> =======================================================================
>
> dfly-bkpsrv# hammer dedup /Backup1
> Dedup /Backup1: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 0
> Dedup /Backup1 succeeded
> Dedup ratio = 1.07
> 625 MB referenced
> 585 MB allocated
> 224 KB skipped
> 0 CRC collisions
> 0 SHA collisions
> 0 bigblock underflows
>
> dfly-bkpsrv# hammer dedup /Backup1/Data
> Dedup /Backup1/Data: objspace 8000000000000000:0000
> 7fffffffffffffff:ffff pfs_id 1
> Dedup /Backup1/Data succeeded
> Dedup ratio = 1.34
> 259 GB referenced
> 193 GB allocated
> 40 MB skipped
> 1944 CRC collisions
> 0 SHA collisions
> 20 bigblock underflows
>
> dfly-bkpsrv# hammer dedup /Backup1/pkgsrc
> Dedup /Backup1/pkgsrc: objspace 8000000000000000:0000
> 7fffffffffffffff:ffff pfs_id 9
> Dedup /Backup1/pkgsrc succeeded
> Dedup ratio = 1.10
> 1687 MB referenced
> 1539 MB allocated
> 1718 KB skipped
> 3 CRC collisions
> 0 SHA collisions
> 0 bigblock underflows
>
> dfly-bkpsrv# hammer dedup /Backup1/VersionControl
> Dedup /Backup1/VersionControl: objspace 8000000000000000:0000
> 7fffffffffffffff:ffff pfs_id 2
> Dedup /Backup1/VersionControl succeeded
> Dedup ratio = 2.75
> 160 MB referenced
> 58 MB allocated
> 853 KB skipped
> 0 CRC collisions
> 0 SHA collisions
> 0 bigblock underflows
>
> dfly-bkpsrv# hammer dedup /Backup1/test
> Dedup /Backup1/test: objspace 8000000000000000:0000
> 7fffffffffffffff:ffff pfs_id 3
> Dedup /Backup1/test succeeded
> Dedup ratio = 0.00
> 0 B referenced
> 0 B allocated
> 0 B skipped
> 0 CRC collisions
> 0 SHA collisions
> 0 bigblock underflows
>
> dfly-bkpsrv# hammer dedup /Backup1/www-5mbak/www-hot
> Dedup /Backup1/www-5mbak/www-hot: objspace 8000000000000000:0000
> 7fffffffffffffff:ffff pfs_id 5
> Dedup /Backup1/www-5mbak/www-hot succeeded
> Dedup ratio = 1.39
> 50 GB referenced
> 36 GB allocated
> 53 MB skipped
> 167 CRC collisions
> 0 SHA collisions
> 0 bigblock underflows
>
> dfly-bkpsrv# hammer dedup /Backup1/mysql-1hbak/mysql-hot
> Dedup /Backup1/mysql-1hbak/mysql-hot: objspace 8000000000000000:0000
> 7fffffffffffffff:ffff pfs_id 6
> Dedup /Backup1/mysql-1hbak/mysql-hot succeeded
> Dedup ratio = 13.78
> 117 GB referenced
> 8747 MB allocated
> 0 B skipped
> 0 CRC collisions
> 0 SHA collisions
> 0 bigblock underflows
>
> dfly-bkpsrv# hammer dedup /Backup1/project-docs-bak/project-docs
> Dedup /Backup1/project-docs-bak/project-docs: objspace
> 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 7
> Dedup /Backup1/project-docs-bak/project-docs succeeded
> Dedup ratio = 1.15
> 247 MB referenced
> 215 MB allocated
> 102 KB skipped
> 0 CRC collisions
> 0 SHA collisions
> 0 bigblock underflows
> =================================================================================================
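
To compare the PFSes side by side, the "referenced" and "allocated" lines can be subtracted per run. This is a rough sketch, assuming hammer's KB/MB/GB are powers of 1024 (an assumption on my part); `to_bytes` and `saved_bytes` are illustrative names, and the figures are copied from the runs above.

```python
import re

# Assumption: the units hammer prints are binary (powers of 1024).
UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3, "TB": 1024 ** 4}

# (referenced, allocated) exactly as printed in the dedup runs above.
REPORTS = {
    "/Backup1": ("625 MB", "585 MB"),
    "/Backup1/Data": ("259 GB", "193 GB"),
    "/Backup1/pkgsrc": ("1687 MB", "1539 MB"),
    "/Backup1/VersionControl": ("160 MB", "58 MB"),
    "/Backup1/test": ("0 B", "0 B"),
    "/Backup1/www-5mbak/www-hot": ("50 GB", "36 GB"),
    "/Backup1/mysql-1hbak/mysql-hot": ("117 GB", "8747 MB"),
    "/Backup1/project-docs-bak/project-docs": ("247 MB", "215 MB"),
}


def to_bytes(text):
    """Convert a size such as '8747 MB' into bytes."""
    match = re.fullmatch(r"(\d+) ([KMGT]?B)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def saved_bytes(referenced, allocated):
    """Bytes referenced by files but not separately allocated."""
    return to_bytes(referenced) - to_bytes(allocated)


if __name__ == "__main__":
    total = 0
    for path, (referenced, allocated) in REPORTS.items():
        saved = saved_bytes(referenced, allocated)
        total += saved
        print(f"{path:42s} {saved / UNITS['GB']:8.2f} GB")
    print(f"{'total':42s} {total / UNITS['GB']:8.2f} GB")
```

The printed figures are approximate because hammer rounds each value to whole units, and they need not match what df reports for the file system as a whole.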
>
> Now, after de-duping all PFSes on the first disk, 'df -h' gives these details:
>
> Filesystem Size Used Avail Capacity Mounted on
> Backup1 454G 313G 141G 69% /Backup1
>
> Before de-duping it was
>
> Filesystem Size Used Avail Capacity Mounted on
> Backup1 454G 451G 2.8G 99% /Backup1
>
> So that is reclaiming 30% of Disk space amounting to 138 GB :-)
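
For anyone checking the arithmetic, the 138 GB and 30% follow directly from the two df lines. A minimal sketch, with `reclaimed` as an illustrative name and the percentage taken against the total size of the file system:

```python
def reclaimed(size_gb, used_before_gb, used_after_gb):
    """Return (GB freed, percentage of the total size freed)."""
    freed = used_before_gb - used_after_gb
    return freed, 100.0 * freed / size_gb


# Size, Used before and Used after, taken from the df output above.
freed, pct = reclaimed(454, 451, 313)
print(f"{freed} GB freed, {pct:.1f}% of the disk")
```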
>
> Careful configuration and design of PFSes and snapshots can save a lot of disk space.
> But de-dup can save still more :-)
>
>
> In order to 'de-dup' the file system automatically every day using
> 'hammer cleanup' in the periodic script I have put something like
> this in the configuration files for PFSes.
>
> =============================================
> dfly-bkpsrv# hammer config /Backup1/VersionControl/
> snapshots 1d 1000d
> prune 1d 15m
> rebalance 1d 5m
> reblock 1d 60m
> recopy 30d 60m
> dedup 1d 30m
> ==============================================
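
As I read hammer(8), each config line is a directive followed by how often 'hammer cleanup' runs it and a limit (a retention time for snapshots, a maximum run time for the others). A small sketch for checking such a config programmatically; `parse_hammer_config` is a hypothetical helper, not part of hammer, and the config text is the one quoted above.

```python
# Config text as shown by 'hammer config' in the post.
CONFIG = """\
snapshots 1d 1000d
prune 1d 15m
rebalance 1d 5m
reblock 1d 60m
recopy 30d 60m
dedup 1d 30m
"""


def parse_hammer_config(text):
    """Map each directive to its (period, limit) pair."""
    config = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        fields = line.split()
        if len(fields) < 3:
            continue  # skip anything that is not 'name period limit'
        name, period, limit = fields[:3]
        config[name] = (period, limit)
    return config


if __name__ == "__main__":
    config = parse_hammer_config(CONFIG)
    if "dedup" in config:
        period, limit = config["dedup"]
        print(f"dedup runs every {period}, for at most {limit}")
    else:
        print("dedup is not enabled for this PFS")
```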
>
> A million thanks to Matt and team for DragonFly, Hammer, de-dup,
> vkernel and a lot of other goodies coming up :-D
>
> Thanks and Regards
>
> --Siju
>