
Retrospect keeps backing up files that are linked to


hevanw


I have the following situation. I have an external drive on my server that holds a ton of media files. For organization's sake, there are a few subdirs on the drive that contain hard links to the actual media files. The total number of files linked to is exactly 1698, worth around 1.5TB.

I have a dedicated backup set for this external drive, with its own Selector and its own backup script. The Selector excludes the subdirs that contain the links, since the files are already backed up from their normal location. The script has 'block level incremental backup' enabled. Compression is off since these are media files that don't compress well.

Now the weird thing: every time I run my Retrospect backup via the script, it finds those exact 1698 files to back up. The backup takes many hours, with a total of 1.5TB to be backed up and verified. After the backup has completed, I find that the new backup consumed around 250GB on my NAS. The external disk has not been touched, so none of those 1698 files have changed one bit, and neither were the links modified.

Does anyone have an idea why Retrospect is behaving like this? First, it should not even back up those files, since they are never touched. But even if it does back them up, the resulting snapshot should be minimal given that nothing has been modified.

EDIT: I realize I'm actually not quite sure what sort of links these are, given that these are all Windows systems with NTFS filesystems. I have a Perl script, running in Cygwin, that creates the links with link($to,$from).
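Perl's built-in link(OLDFILE, NEWFILE) creates a hard link, so these are most likely NTFS hard links rather than symbolic links or junctions. One way to confirm this is to look at a file's link count, which is greater than 1 only when the same file is reachable under several names. The following is a minimal Python sketch of that check; the helper names are made up for illustration and are not part of Retrospect or the original Perl script.

```python
import os

def link_count(path):
    """Number of directory entries (hard links) that point at this file."""
    return os.stat(path).st_nlink

def is_hard_linked(path):
    """True if the file is reachable under more than one name."""
    return link_count(path) > 1

def same_underlying_file(path_a, path_b):
    """True if both names refer to the same file on disk."""
    return os.path.samefile(path_a, path_b)
```

Running is_hard_linked() on one of the 1698 media files in its normal location should return True if the links in the subdirs are hard links. A symbolic link would leave the target's link count unchanged.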


Check the dates of the files. If one of the dates is in the future, the file will always be seen as modified since the last backup.

Also check the dates of the containing folder(s).

 

Come to think of it, if the links are (re)created every day or at every boot, that is probably the answer.


Thanks, but it is a negative to all questions.

* All files (and the links which inherit the target file timestamp) have Last Modified dates well in the past.

* The folders themselves also have old Last Modified dates.

* The links have not been touched (recreated) in 2 months.
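The date checks discussed above (files and containing folders with a date in the future) can be scripted rather than done by eye. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the "created" value comes from st_birthtime where Python exposes it and falls back to st_ctime otherwise. On Windows st_ctime has traditionally held the creation time, while on Unix it is the metadata-change time.

```python
import os
import time

def future_dates(path, now=None):
    """Return the timestamps of `path` that are later than `now`."""
    if now is None:
        now = time.time()
    st = os.stat(path)
    # Assumption: prefer the real creation time when the platform provides it.
    created = getattr(st, "st_birthtime", st.st_ctime)
    stamps = {"created": created, "modified": st.st_mtime}
    return {name: ts for name, ts in stamps.items() if ts > now}

def scan_tree(root, now=None):
    """Map every folder or file under `root` that has a future date to those dates."""
    hits = {}
    for dirpath, _dirs, files in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in files]
        for full in candidates:
            found = future_dates(full, now)
            if found:
                hits[full] = found
    return hits
```

An empty result from scan_tree() on the external drive would support the "negative to all questions" answer for the date-in-the-future theory.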


It's not just a question of modified dates. It can be any date.

I have seen files where the creation date is later than the modified date. I know it should not happen, but it did.

The following was written years ago. I expect more file meta-data is being checked nowadays, especially changes in ACL/ACE. https://docs.microsoft.com/en-us/windows/desktop/secauthz/access-control-lists

Quote

 

Retrospect uses several matching criteria to find new or changed files. If one of the criteria has been changed, Retrospect will back up the file again. On Windows, Retrospect looks at creation date and time, modified date and time, size and name. If match only in same location option is set, Retrospect matches on the path, volume name and drive letter also.

By default, the archive attribute is not used as a matching criteria in Windows, allowing for true and reliable backups to multiple backup sets.
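To make the quoted matching rule concrete, here is a small Python sketch of the comparison it describes: name, size, creation date and modified date, plus location when the same-location option is set. It only illustrates the quoted description and is not Retrospect's actual implementation; the FileRecord type and the function name are hypothetical.

```python
from typing import NamedTuple

class FileRecord(NamedTuple):
    name: str
    size: int
    created: float    # creation date and time as a timestamp
    modified: float   # modified date and time as a timestamp
    path: str = ""    # folder path, used only for same-location matching
    volume: str = ""  # volume name or drive letter

def is_already_backed_up(candidate, backed_up, same_location_only=False):
    """True if `candidate` matches `backed_up` on every criterion in the quote."""
    def key(record):
        return (record.name, record.size, record.created, record.modified)

    if key(candidate) != key(backed_up):
        return False
    if same_location_only:
        return (candidate.path, candidate.volume) == (backed_up.path, backed_up.volume)
    return True
```

Under this model, a difference in any single criterion, for example a creation date that changes between runs, is enough for the file to be copied again even when its contents are identical.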

 

 


hevanw,

First, you fail to state either your version of Retrospect Windows or your version of Windows.  You ought to know by now that doing so is in most cases a necessity for our giving you any kind of help on these Forums.  However the cumulative Release Notes for Retrospect Windows 15.6.1.104 do not show any recent fixes to Selectors.

Second, your problem with excluding via a Selector not working sounds similar to the problem seanbreilly has reported starting with this post.  Ignore the fact that he tacked his posts onto an existing thread on a Forum dealing with an obsolete version of Retrospect Mac; he reports using "Retrospect (15.1.2.1) [whatever that is] in my post, but not my Windows, which is Windows 10".  Note the suggestions in the P.S. of my 7 January 2019 post below his.

 


2 hours ago, Lennart_T said:

It's not just a question of modified dates. It can be any date.

I have seen files where the creation date is later than the modified date. I know it should not happen, but it did.

The following was written years ago. I expect more file meta-data is being checked nowadays, especially changes in ACL/ACE. https://docs.microsoft.com/en-us/windows/desktop/secauthz/access-control-lists

 

 

The point is this: I run the backup. Then, after it completes, I immediately run it again, even though nothing was done on that external drive in the meantime. Yet the backup again manages to find the same set of files to be backed up, and when it backs them up, the backup set grows by about 250GB. Since I do block level incremental backup, and I'm 100% certain the contents of the files have not changed, 250GB is a weird size for a 1.5TB set of video files (I have now noticed that it's only video). Clearly it's doing some block level deduplication, but if it's only a matter of some attributes having changed, I should not end up with 250GB from 1700 files.

Version btw is 12.6.1.101.

I'll remove the links and see what it does then.


hevanw,

Maybe there's a bug because you're doing block-level incremental on an external drive.  I noticed in the cumulative Release Notes for the very first release of Retrospect Windows 10 "Fixed: Fix for backing up system file hard links as separate files (#4958)", but maybe it wasn't fixed for external drives.  However, even though I'm a Retrospect Mac administrator, I realize these probably don't qualify as "system files".

The following would probably be a pain in the butt, but have you considered defining everything on your external drive as a set of Retrospect Subvolumes, and then having your dedicated script not back up the Subvolume that contains these sub-directories?


Ok, I'm onto something although I can't quite find yet what is happening. 


I inspected the previous snapshots to see what was in them, and there I can clearly see that each of the 1698 files that are backed up either compresses to 99% or does not compress at all. It turns out that all MP4 files compress to 99%, which obviously means that Retrospect somehow identifies them as already backed up. However, other file formats do not compress at all, and Retrospect basically backs up those files entirely. This explains why I'm 'only' ending up with a 250GB increase for a 1.5TB file set.

It almost looks like some process is touching these files, though I cannot imagine what it would be or why the files would be modified. I'm not even sure they are modified to begin with. One thing I can test is to take an MD5 checksum of a file at several points, before and after running Retrospect, and see whether the MD5 stays the same or not.
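The checksum test described above could look like the following minimal Python sketch. The function names and the chunk size are choices made for illustration; the file is read in chunks so that large video files do not have to fit in memory.

```python
import hashlib

def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file's contents, read chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def changed_files(before, after):
    """Given two {path: checksum} dicts, list the paths whose checksum differs."""
    return sorted(p for p in before if after.get(p) != before[p])
```

Recording md5_of_file() for a few of the affected files before a backup run and again afterwards, then comparing with changed_files(), shows whether the contents really changed. Identical checksums would point toward metadata rather than file data as the trigger.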


More testing done...

When the hardlinks are not there, a 2nd (and subsequent) Retrospect backup no longer does duplicate backups, so it behaves fine. However, as soon as the hardlinks are there, it will back the files up again. Also, if the hardlinks are then removed, it will again do 1 backup, after which it's fine again.

I was thinking of a workaround of removing the links prior to doing the backup, but this would not work: the moment you set the hardlinks, Retrospect will force a backup of these files, even if the hardlinks have already been removed again.


Ok, problem solved (or worked around).

I followed the advice from this old thread, even though this is an NTFS drive under Windows:

But I also removed all the tickboxes under the Windows Security backup options in the script.

Disabling both (which I don't really need anyway) does solve the problem, and a re-run no longer backs up anything that was already backed up before.

 


2 hours ago, hevanw said:

Ok, problem solved (or worked around).

I followed the advice from this old thread, even though this is an NTFS drive under Windows:

But I also removed all the tickboxes under the Windows Security backup options in the script.

Disabling both (which I don't really need anyway) does solve the problem, and a re-run no longer backs up anything that was already backed up before.

 

hevanw,

To help other Retrospect Windows administrators who might have the same problem in the future, please be so kind as to post exactly which Windows Security backup options you checked and un-checked. Those would be, I think, on pages 379-382 of the Retrospect Windows 12 User's Guide.

Thanks!


Sure.
It's the 4 options at the top of page 380. All 4 are on by default, and I turned all of them off as I didn't really need the permissions, etc. to be backed up.

Since I disabled both these 4 options and the Unix Client option "Use status modification date when matching", I'm not sure which of the two is the true fix for my problem. I'll do another quick test tonight to see which one really fixed it.


Ok, typical: after a reboot (Windows 10 update), I figured I'd give it a test. Of course, I now cannot reproduce the problem even with the above settings turned back on. So it must still be some combination of things that leads to the issue, but turning the above settings off definitely solved the problem.
