Retrospect's optimization -- risky business!


whm

On page 165 of the user's guide it says "[Retrospect] optimizes backups by backing up only one file when it finds multiple files with the same name, size, creation date, and modification time."

This optimization bets the user's data on the chance that there will never be two different files with the same name, size, creation date, and modification time. Such a collision is unlikely in many cases, but its absence can't be guaranteed, and it's not hard to imagine a situation where such collisions are very likely.

Is there a way to disable that optimization?
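To make the concern concrete, here is a minimal sketch of the matching rule as the guide describes it. This is only an illustration of the quoted sentence, not Retrospect's actual implementation; `FileRecord`, `match_key`, and `backup_with_matching` are hypothetical names made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileRecord:
    # Hypothetical record type for illustration; not a Retrospect structure.
    name: str
    size: int
    created: datetime
    modified: datetime
    content: bytes  # carried along only so we can see what was lost


def match_key(f: FileRecord) -> tuple:
    """The four attributes named in the user's guide quote."""
    return (f.name, f.size, f.created, f.modified)


def backup_with_matching(files: list[FileRecord]) -> list[FileRecord]:
    """Keep only the first file seen for each metadata key."""
    kept: dict[tuple, FileRecord] = {}
    for f in files:
        kept.setdefault(match_key(f), f)
    return list(kept.values())


stamp = datetime(2002, 2, 1, 12, 0, 0)
a = FileRecord("report.dat", 4, stamp, stamp, b"AAAA")
b = FileRecord("report.dat", 4, stamp, stamp, b"BBBB")  # same metadata, different data

saved = backup_with_matching([a, b])
saved_contents = {s.content for s in saved}
lost = [f for f in (a, b) if f.content not in saved_contents]
print(len(saved), "file(s) saved;", len(lost), "file(s) with unsaved contents")
```

The rule never looks at the contents, so the second file is silently treated as a copy of the first.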


The odds of two files having identical

Name
Size
Creation Date and Time (including seconds)
Modify Date and Time (including seconds)
Type
Creator
Label

without being the identical file are extremely low.

In any case, you can go to options>More Choices and uncheck all of the "matching" options. This will copy every file, every time, even if it has not changed.


Thanks for that note -- I should have looked at those options more closely!

But I still think that treating files as duplicates based on matching name/size/times is a dangerous optimization to have enabled by default. It's a nice option to have available, but it's a heuristic that may fail.

I think the Retrospect documentation is usually great about warning of potential hazards, but it seems woefully short on this point. My search of the documentation found only two mentions of it, one on page 148 and another on page 165. Neither is flagged as a warning.

I think that this default behavior, combined with the lack of a prominent warning, is a disaster waiting to happen for some unlucky system administrator.


On Friday 2/1/02, because of this optimization, Retrospect restored the wrong data to a file. That was during a test of Retrospect, not an actual emergency data recovery situation, but nonetheless, Retrospect got things wrong.

Imagine a data collection program, running on each of several machines, that writes a fixed amount of data every hour on the hour. The data file has the same name and is in the same place on each system. It seems that by default, Retrospect would incorrectly consider all those data files to be duplicates.


The size and modification time (to the second) would be identical in the contrived but entirely practical situation I envision. Picture N machines, each having a file named c:\logger.dat. Every one of those files has the same size (say 50,000 bytes) and modification date (say 2/3/02 18:00:00). I believe that by default, Retrospect would back up only one of those files, even though each may have different contents.
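The logger.dat scenario can be reproduced on disk with a short script. This is a sketch under stated assumptions: the machine count and the per-machine directories are made up, the files live in a temporary directory rather than at c:\logger.dat, and creation time is left out of the key because it can't be set portably. It says nothing about how Retrospect itself compares files; it only shows that name, size, and modification time can all agree while the contents differ.

```python
import hashlib
import os
import tempfile
from datetime import datetime

N_MACHINES = 5        # hypothetical number of machines
FILE_SIZE = 50_000    # bytes, as in the scenario
MTIME = datetime(2002, 2, 3, 18, 0, 0).timestamp()


def metadata_key(path: str) -> tuple:
    """Name, size, and modification time (to the second)."""
    st = os.stat(path)
    return (os.path.basename(path), st.st_size, int(st.st_mtime))


def content_digest(path: str) -> str:
    """SHA-256 of the file contents."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


with tempfile.TemporaryDirectory() as root:
    paths = []
    for i in range(N_MACHINES):
        machine_dir = os.path.join(root, f"machine{i}")  # stands in for one machine
        os.mkdir(machine_dir)
        path = os.path.join(machine_dir, "logger.dat")
        with open(path, "wb") as fh:
            fh.write(bytes([i]) * FILE_SIZE)   # different contents per machine
        os.utime(path, (MTIME, MTIME))         # same modification time everywhere
        paths.append(path)

    by_metadata = {metadata_key(p) for p in paths}
    by_content = {content_digest(p) for p in paths}

print("distinct by name/size/mtime:", len(by_metadata))
print("distinct by SHA-256 of contents:", len(by_content))
```

Comparing contents costs a full read of every file, which is presumably why a metadata shortcut is attractive, but it is the only one of the two checks that tells these files apart.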


This problem is history for me -- I've written these notes simply in the hope of alerting other users to this potential problem and perhaps influencing Dantz to add a warning to the documentation or change the default behavior.
