7-Zip / Bzip2

Ben Cadieux ben.cadieux at gmail.com
Fri May 9 08:58:26 PDT 2008


> Which is not an impediment per-se.  After all it's just a
> userland tool, not a library or even part of the kernel.

Not to nit-pick, but it's LGPL, not GPL, so it's not *so* bad :)

> So, is 7z useful enough to add a fourth compression tool
> to the base system?  And keep it there forever?  (Remember
> that we also have to keep compress(1) for compatibility,
> even though it compresses worse and is slower than gzip,
> so the usefulness is very small.)

I would argue that 'compress' should be removed and put in the ports
tree.  There's always the argument about shell scripts that still use
these tools... but who is actually running one?  If you're switching to
DragonFly BSD, chances are you're not worried about breaking
compatibility a little, and it's not difficult to note in the docs
that it's been removed.

> I've given it a quick test and fed a 1 MB logfile to 7z.
> It was only marginally better than bz2 (< 1%), but it was
> noticeably slower.  And bz2 is already painfully slow for
> both compression and decompression.  Not everybody has a

bz2 is painfully slow for compression, not decompression.  Small files
aren't much of a concern for any of this, though.

As I mentioned previously, if you use the compression level argument
with 7z, you can beat bzip2 on both compression ratio AND speed in
many cases.  I compressed a 4 GB image 90 MB smaller and 3 minutes
faster than bzip2 with 7z's compression level '4'.
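
For anyone who wants to repeat this kind of comparison, here is a
rough sketch.  It makes some assumptions: it uses Python's standard
bz2 and lzma modules, and lzma writes .xz streams rather than .7z
archives, so it is only a stand-in for the LZMA family that 7z uses by
default, not a measurement of 7z itself.  The presets (1, 4, 9) and
the sample data are picked purely for illustration.

```python
import bz2
import lzma
import time


def time_compress(label, compress, data):
    """Compress data once and return (label, compressed size, seconds)."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    return label, len(packed), elapsed


def compare(data):
    """Run bzip2 at its maximum level against LZMA at a few presets."""
    runs = [("bzip2 -9", lambda d: bz2.compress(d, compresslevel=9))]
    for preset in (1, 4, 9):
        # Bind the preset as a default argument so each lambda keeps its own.
        runs.append((f"lzma preset {preset}",
                     lambda d, p=preset: lzma.compress(d, preset=p)))
    return [time_compress(label, fn, data) for label, fn in runs]


if __name__ == "__main__":
    # Hypothetical sample input; substitute a real file for meaningful numbers.
    sample = b"May  9 08:58:26 host sshd[1234]: session opened for user ben\n" * 20000
    for label, size, seconds in compare(sample):
        print(f"{label:15s} {size:10d} bytes  {seconds:.3f} s")
```

Results will vary a lot with the input, so it's worth running this on
something close to the data you actually care about.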

> 3 GHz multicore machine.  That's why I still use gzip most
> of the time -- the compression is a little worse, but it's
> a *lot* faster.

I wasn't suggesting removal of gzip.  Just like not everyone has a 3
GHz multicore machine, not everyone is still using a crappy 486 ---
they shouldn't be limited by the restrictions your particular machine
has.

> There were even cases when people reported that they
> weren't able to decompress a bz2 file on a small system
> (embedded or otherwise), because it required several
> MB of RAM for decompression.  It appears that 7z is even
> worse.  The memory footprint of gunzip is negligible.

RAM usage for compression depends on the level of compression you're
using.  If someone's going to be using DragonFly BSD for an embedded
device with limited resources, they would tailor it accordingly and
remove 7zip and bzip2.  A great many systems don't need half the
kernel modules either, but that doesn't mean those should not have
been added to DFly.
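
To put rough numbers on the bzip2 side, here is a small sketch using
the memory formulas as I recall them from the bzip2 man page:
compression needs about 400k + 8 x block size, decompression about
100k + 4 x block size, or 100k + 2.5 x block size with the -s option.
The function name is made up for illustration.  7z is not covered
here, since its memory use depends on the dictionary size chosen.

```python
def bzip2_memory_kb(level, small=False):
    """Approximate bzip2 memory use in kilobytes for a level from 1 to 9.

    Formulas taken from the bzip2 man page (an assumption of this sketch):
    the block size is level * 100k.
    """
    if not 1 <= level <= 9:
        raise ValueError("bzip2 levels run from 1 to 9")
    block_kb = 100 * level
    compress_kb = 400 + 8 * block_kb
    # The -s ("small") mode trades speed for a lower decompression footprint.
    factor = 2.5 if small else 4
    decompress_kb = 100 + factor * block_kb
    return compress_kb, decompress_kb


if __name__ == "__main__":
    for level in (1, 5, 9):
        comp, decomp = bzip2_memory_kb(level)
        print(f"-{level}: compress ~{comp}k, decompress ~{decomp}k")
```

Note that the decompression figure is set by the level the file was
compressed with, so whoever creates the archive decides how much RAM
the small system needs.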

> It should also be mentioned that about every other year
> another compression tool pops up that claims to be better
> than all the others.  Last year (or the year before) it
> was "paq", before that it was rzip and lrzip, and so on.
> So this year it is 7z.  What will be the next one?

7-Zip has been around since ~2000.  It's not hype.  rzip's features,
which offer more flexibility than bzip2's fixed 100-900k block sizes,
should've been available in bzip2 from the start -- unfortunately some
of Unix's best strengths are also its greatest weaknesses.  bzip2
can't simply be altered, otherwise "compatibility" will be broken.
bzip2 would never have been added if gzip could've been modified at
whim.

bzip2 vs 7z:
- 7z supports listing the decompressed size of the contained file(s)
- 7z can compress faster with a better ratio, or slower with a much
  better ratio
- 7z can create volumes.  I realize one can use 'split', but try
  putting a huge set of volumes back together on Windows.  7z has a
  really decent Windows port, too, and ports to many other OSes.

Best Regards,
Ben Cadieux




