Cygnis

Software compression INCREASES media requirements?!

Hi all,

 

I noticed that Retrospect was writing as little as 2.0GB per disc to my DVD+R media, particularly when backing up some application installation files. As it turns out, the "software compression" option was to blame. Here is how I tested it:

 

(1) Created two new Optical backup sets.

 

(2) Selected 6.4GB of data, consisting of application installation files (mostly downloaded from Microsoft's website).

 

(3) Backed up the same data to each of these two sets, one with "Software Compression" enabled and one without.

 

Result: The first set (with compression OFF) wrote 4.4GB to the first disc, then 2.0GB to the second. The second set (with compression ON) wrote 3.1GB to one disc, then 3.0GB to a second, then 0.3GB to a third!

 

I'm guessing that install files aren't very compressible, but at worst, shouldn't they simply be stored at their original size? Why on earth would they end up requiring MORE space with compression enabled?

 

System specs are as follows:

- Retrospect Professional 7.7.325 (with 7.7.3.102 update)

- Windows XP with Service Pack 3

- Verbatim DVD+R 16x media

- Optiarc AD-5560A 8x DVD +/- R burner (not officially supported, therefore custom configuration used)

- Dell Latitude D630C notebook with 2.4GHz Core 2 Duo processor, 4GB RAM, 200GB 7200rpm HDD

 

Any info would be appreciated. I can live without the compression, but I'm curious to know what causes this issue (and if it's due to any error on my part).

 

Thanks in advance.


Another point that may be of interest:

 

When I go to Configure -> Backup Sets, each member (disc) has 4.4GB in the "Used" column, suggesting that Retrospect is writing a full 4.4GB to each disc, even though that 4.4GB is holding less actual data.

 

Could the software compression process be adding in some kind of 'filler' data?


Just tried the same test with two new Disk backup sets on my Iomega USB external hard drive. Same result!

 

There are 6.31GB of .rdb files in the "compression OFF" backup set, and 9.04GB in the "compression ON" set.
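For a sense of scale, those two figures work out to roughly 43% growth. A quick check of the arithmetic in Python, using only the two sizes quoted above (the variable names are just for illustration):

```python
# Sizes of the .rdb files reported for the two Disk backup sets, in GB.
size_off_gb = 6.31  # set written with compression OFF
size_on_gb = 9.04   # set written with compression ON

# How many times larger the "compressed" set is than the uncompressed one.
ratio = size_on_gb / size_off_gb
growth_pct = (ratio - 1) * 100

print(f"expansion ratio: {ratio:.2f}x")
print(f"growth: {growth_pct:.0f}%")
```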

 

This should eliminate the DVD burner as a possible contributing factor.

 

I restored the files from the "compression ON" set, and (thankfully!) they came back at exactly their original sizes.

 

Any thoughts?


Retrospect's software compression tries to compress every file that is being backed up, regardless of format. If the format of the file is already heavily compressed, you can get negative compression, where compressing makes the file larger so that it takes up more space in the backup. In cases like this it is better to just leave compression off, as the program is not smart enough to skip files that would compress negatively.
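The effect is easy to reproduce outside Retrospect. Here is a minimal sketch using Python's standard `zlib` module (DEFLATE), which is an assumption for illustration only and not the algorithm Retrospect uses. Pseudo-random bytes stand in for an already-compressed installer, and the helper name `compressed_size` is made up for this example.

```python
import random
import zlib


def compressed_size(data: bytes, level: int = 6) -> int:
    """Return the size in bytes of `data` after zlib (DEFLATE) compression."""
    return len(zlib.compress(data, level))


n = 1_000_000
rng = random.Random(0)  # fixed seed so the run is repeatable

# Pseudo-random bytes: no redundancy left to remove, like a compressed installer.
incompressible = rng.getrandbits(8 * n).to_bytes(n, "little")

# Repetitive text: plenty of redundancy, like ordinary documents or logs.
redundant = (b"backup set member 0001\n" * (n // 23 + 1))[:n]

for label, data in (("incompressible", incompressible), ("redundant", redundant)):
    print(f"{label}: {len(data)} -> {compressed_size(data)} bytes")
```

The redundant input shrinks dramatically, while the pseudo-random input comes out slightly larger than it went in, because the compressor still has to add its own framing around data it could not shrink. With zlib the penalty is tiny; a simpler or faster scheme can pay a much bigger one.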

If the format of the file is already heavily compressed, you can get negative compression, where compressing makes the file larger so that it takes up more space in the backup.

Thanks for the info.

 

Is Retrospect unique in this regard? Because when I compress one of these files to ZIP or RAR format, its size decreases (only slightly, but at least it doesn't increase!).

 

Also are your sets encrypted?

No, none of my usual sets or these test sets are encrypted.

If the format of the file is already heavily compressed, you can get negative compression, where compressing makes the file larger so that it takes up more space in the backup.

Thanks for the info.

 

Is Retrospect unique in this regard? Because when I compress one of these files to ZIP or RAR format, its size decreases (only slightly, but at least it doesn't increase!).

No. Lossless compression isn't magic. If it were, then you could "compress" the same file multiple times until the file was only 1 bit long. If the sender and the receiver of the compressed data only had to choose between two known 100 GB files, then one bit would be enough to pass that information. In practice, though, the size of a "best algorithm" compression is the entropy of the data.

 

It's all a matter of the characteristics of the data and the compression algorithm. The goal of compression is simply to eliminate redundancy.
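To make "entropy" a little more concrete, here is a small Python sketch that estimates the Shannon entropy of a byte string from its byte frequencies. This is only a rough, order-0 estimate (it ignores repeated sequences, which real compressors also exploit), and the function name `byte_entropy` is made up for this example.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-frequency distribution, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A single repeated symbol carries no information per byte.
print(byte_entropy(b"a" * 1000))

# Every byte value equally likely: the maximum for 8-bit symbols.
print(byte_entropy(bytes(range(256))))
```

Data that measures close to 8 bits per byte, which is what well-compressed installers tend to look like, leaves a compressor almost nothing to remove.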

 

You might want to research compression algorithms a bit. Wikipedia is a good place to start for basic information.

 

Russ


cygnis:

 

Take a bunch of JPEG files and put them into a .zip file. You will notice that the .zip file is LARGER than the sum of the .jpeg files. If you attempt to compress something that is already compressed, you will net a larger data set.
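If you don't have a folder of photos handy, the experiment can be approximated in Python with the standard `zipfile` module. This is a sketch under an assumption: pseudo-random blobs stand in for JPEG data, and the `photo_N.jpg` names are hypothetical. Real JPEGs may behave a little differently because their headers and metadata can still compress slightly.

```python
import io
import random
import zipfile

rng = random.Random(1)  # fixed seed so the run is repeatable

# Pseudo-random blobs stand in for already-compressed JPEG data.
files = {
    f"photo_{i}.jpg": rng.getrandbits(8 * 50_000).to_bytes(50_000, "little")
    for i in range(5)
}

# Build the archive in memory, using DEFLATE like a typical .zip tool.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    for name, payload in files.items():
        archive.writestr(name, payload)

total_input = sum(len(payload) for payload in files.values())
archive_size = len(buffer.getvalue())
print(f"sum of inputs: {total_input} bytes, archive: {archive_size} bytes")
```

The archive ends up larger than the inputs because each entry gains nothing from DEFLATE and still pays for its local header, its central directory record, and the compressor's own framing.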

No. Lossless compression isn't magic. If it were, then you could "compress" the same file multiple times until the file was only 1 bit long.

Understood. What surprised me wasn't the fact that the data didn't compress, but the fact that it grew (instead of being stored at its original size), and rather substantially (2.0 -> 4.4 GB in one case).

 

Will definitely read up on compression algorithms as per your suggestion. Thanks!

 

Does anyone know the name/type of the algorithm used by Retrospect?

 

Take a bunch of JPEG files and put them into a .zip file. You will notice that the .zip file is LARGER than the sum of the .jpeg files. If you attempt to compress something that is already compressed, you will net a larger data set.

Cheers, will give that a try.


I'm not sure if writing to optical media has anything to do with this. Maybe you could investigate by backing up those substantially growing items to a File backup set?

 

As Russ explained, it's quite possible (and normal) for already compressed data to grow when it is compressed again. While most ZIP compression is quite efficient, real-time compression algorithms are less efficient. This is logical, because otherwise compression would take too big of a performance hit. The compression algorithm used by Retrospect is probably a run-length encoding (RLE) algorithm.
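To show why a scheme in the RLE family can grow data substantially, here is a minimal Python sketch of a naive byte-pair RLE. It is a textbook illustration only: the RLE guess above is unconfirmed, and this is not Retrospect's actual implementation. The function names are made up for this example.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode as (run_length, value) byte pairs; runs are capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        # Extend the run while the next byte matches and the count fits in a byte.
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Invert rle_encode by expanding each (run_length, value) pair."""
    out = bytearray()
    for j in range(0, len(encoded), 2):
        out += bytes([encoded[j + 1]]) * encoded[j]
    return bytes(out)


runs = b"\x00" * 1000           # long runs: the ideal case for RLE
no_runs = bytes(range(256)) * 4  # no two adjacent bytes equal: the worst case

for label, data in (("runs", runs), ("no runs", no_runs)):
    print(f"{label}: {len(data)} -> {len(rle_encode(data))} bytes")
```

With no runs to collapse, every input byte becomes a two-byte pair, so the output is exactly twice the size of the input. Practical RLE variants add an escape for literal stretches to limit this, but the underlying trade-off between speed and worst-case growth is the same.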

 

One question though. Is the original data stored in a compressed folder in the operating system (Windows)?

I'm not sure if writing to optical media has anything to do with this. Maybe you could investigate by backing up those substantially growing items to a File backup set?

As mentioned above, I also tried a 'Disk' (hard disk) backup set, and the same occurred. So yeah, the use of optical media was apparently not a contributing factor.

 

If I get time I'll also try a 'File' backup set as per your suggestion. My guess is that the same will occur again, but there's no harm in trying.

 

As Russ explained, it's quite possible (and normal) for already compressed data to grow when it is compressed again. While most ZIP compression is quite efficient, real-time compression algorithms are less efficient. This is logical, because otherwise compression would take too big of a performance hit.

That definitely makes sense. Thanks for the info (and for the link/mention of RLE encoding).

 

One question though. Is the original data stored in a compressed folder in the operating system (Windows)?

No, but I suspect that the files' contents are highly compressed. They are large application/OS installation files downloaded from Microsoft, which I'd expect to be compressed as much as possible to conserve their server bandwidth (and make the downloads smaller/faster for their customers).


Application/OS installation files downloaded from Microsoft are almost always highly compressed. For example, the Windows 7 ISO image from MSDN is about 3-4 GB but contains at least 10 GB of files.

I know it might be a pain to download the files again if they are lost. However, if they are causing space issues, have you thought about just skipping these files?

I know it might be a pain to download the files again if they are lost. However, if they are causing space issues, have you thought about just skipping these files?

 

Yeah, I'll probably just burn them to disc (outside of Retrospect) at some point, and then either remove them from my PC or exclude them from my Retrospect backups.

 

Unless I get the time/inclination to test a bunch of other file types, I'll probably just leave software compression off in future. This isn't a big inconvenience to me; I just wanted to gain an understanding of what was going on, and share my experience in case others encounter the same thing.

 

Thanks to everyone for your responses.

