MrPete Posted November 10, 2008

Background

One of Retrospect's major features is that matching files are not stored twice. The documentation states, "The Windows file matching criteria are name, size, creation date, and modify date."

That's great, and makes sense. [Aside: I would be a bit happier if the meta-information were 100% separate from the content, so that only size and MD5 changes cause contents to be re-saved, but that's immaterial for this bug report.]

Obviously, files on different workstations have different ownership information, so ownership differences should never cause a mismatch.

Because of the security requirements of a variety of Windows software packages, we have found (the hard way) that we need to set "Backup File Security Information from Workstations" for correct recovery of installed software from backups.

Now I'm beginning to suspect that Retrospect isn't as smart as it seems...

The Setup

After upgrading a Windows system, we discovered that thousands of files on two partitions carried both the new owner SID and an old one that no longer exists on the system (so a look at file security shows permissions for an ID of the form S-1-...). In one case, the ownership of an entire tree was S-1-..., which is no problem for access to the files but a big problem for changing anything.

So we ran a tree-walking script using the Microsoft subinacl.exe utility to delete the invalid security information. Nothing else about the files was touched. They still match on all criteria. Not even the access dates changed, let alone the create or modify dates.

The Bug

The result of changing the ACLs: 50k+ files, 85GB of data, were stored in the next backup.

This rather strongly suggests that the Retrospect "matching file" feature does not work as advertised.

Resolution

I don't know how serious a change will be required to fix this.
However, I'd suggest that while this is being looked at, the following improvement be considered: dates should be ignored in file matching, and file names as well, as long as file metadata is stored separately from file contents.

If two people download the exact same file from the web, whether manually or through a software update process, the files should match even though their dates (and possibly names) will differ. If two people install the exact same software package, 90% of the files should match even though their dates may differ.
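To make the suggestion concrete, here is a minimal sketch contrasting the documented matching criteria (name, size, creation date, modify date) with a content-only key built from size and MD5. The `FileRecord` type and both key functions are hypothetical names invented for illustration; nothing here reflects how Retrospect actually stores or compares files.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical in-memory stand-in for a file seen by a backup tool."""
    name: str
    created: int    # creation timestamp (arbitrary units)
    modified: int   # modification timestamp (arbitrary units)
    content: bytes


def documented_key(f: FileRecord) -> tuple:
    # The four criteria quoted from the documentation.
    return (f.name, len(f.content), f.created, f.modified)


def content_key(f: FileRecord) -> tuple:
    # The proposed alternative: only size and a content hash decide a match.
    return (len(f.content), hashlib.md5(f.content).hexdigest())


# Two users download the same installer at different times under different names.
a = FileRecord("setup.exe", created=100, modified=200, content=b"same bytes")
b = FileRecord("setup(1).exe", created=300, modified=400, content=b"same bytes")

names_and_dates_match = documented_key(a) == documented_key(b)
contents_match = content_key(a) == content_key(b)
```

Under the documented criteria the two downloads count as different files, while the content-only key treats them as one. The trade-off is that hashing requires reading every file, which is exactly the kind of cost a name/size/date comparison avoids.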
Mayoff Posted November 10, 2008

Quote: The documentation states, "The Windows file matching criteria are name, size, creation date, and modify date."

In addition, we also look at file permissions and metadata.

Under Windows, Retrospect looks at the archive bit. If the archive bit changes, then we assume the permissions have changed and the file is backed up again. To copy the changed permissions, the entire file is backed up again.

In the past we treated permissions as a separate matching process, but the result was sometimes 10 or 15 hours just to copy the permissions of every file after EVERY single backup. The approach we take now is much faster in 99% of cases. Just ask anyone who used Retrospect 6.5 and 7.0 about the speed of backing up a server.

If this is a problem, uncheck the option to back up NTFS permissions from files and folders from workstations and servers.
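The behavior described in this reply can be modeled roughly as follows. This is only a sketch of the decision as stated in the thread, under the assumption that a changed archive bit is treated as a permissions change whenever security backup is enabled; `Entry`, `needs_backup`, and the `backup_security` flag are hypothetical names, not Retrospect internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical catalog entry for one file."""
    name: str
    size: int
    created: int
    modified: int
    archive_bit: bool


def needs_backup(current: Entry, stored: Optional[Entry],
                 backup_security: bool = True) -> bool:
    """Return True if the whole file would be copied again."""
    if stored is None:
        return True  # never backed up before
    documented = lambda e: (e.name, e.size, e.created, e.modified)
    if documented(current) != documented(stored):
        return True  # fails the documented matching criteria
    # Assumption from the thread: with security backup on, a changed
    # archive bit is taken to mean the permissions changed.
    if backup_security and current.archive_bit != stored.archive_bit:
        return True
    return False


# Same name, size, and dates; only the ACL was edited, flipping the archive bit.
before = Entry("app.dll", 1024, 100, 200, archive_bit=False)
after = Entry("app.dll", 1024, 100, 200, archive_bit=True)

recopied_with_security = needs_backup(after, before, backup_security=True)
recopied_without_security = needs_backup(after, before, backup_security=False)
```

In this model the ACL cleanup from the original report triggers a full re-copy only while the security option is on, which matches the workaround offered above.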
MrPete Posted November 10, 2008

If I'm hearing correctly, this means that turning on the permissions backup (for correct recovery) means that matching cannot work across workstations? I.e., since every file has a different owner on each workstation, a separate copy will be backed up from each computer.

NOT beating up on anybody here ... I honestly think we can get to a best-of-both-worlds solution with regard to speed. The same algorithm that identifies what needs backing up for 7.5+ can identify when metadata needs to be stored, so that we are not always storing all the metadata as before. I know it's not that simple, just giving a high-level perspective.

I'm beginning to think a new strategy is needed for effective backup and recovery:

* Use normal Backup for data files.
* Use Duplicate to maintain an exact copy of every bootable / installed-software partition.

Would this eliminate the need for disaster-recovery CDs, "system" storage and such?
Mayoff Posted November 10, 2008

Quote: means that matching cannot work across workstations?

Actually, this is not true. Retrospect has special processes in place to allow single-instance storage across volumes and clients.
MrPete Posted November 10, 2008

So, worst case, leaving permissions backup on means that multiple copies on a single volume, with different ownership, could/would be backed up multiple times. That's not so bad at all! Typically, spare copies are on a different volume or workstation.

So, as you've said, the key element is that changing the permissions on a set of files will trigger a new backup of those files.

My final (?) question before putting this one to bed: if a set of files is updated as described here, where metadata but not content changed, does that have any impact on or interaction with grooming priorities? I'd love to be able to essentially trash the older "wrong" permission copies in favor of the newer one.

What I'm thinking is this: for files that don't change at all, it seems a waste to maintain a huge collection of dated copies that are really no different, while using up space that could be better used for backups of dynamically changing files.

Any thoughts?

THANKS MUCHLY!
Mayoff Posted November 11, 2008

Grooming will never remove something that existed on the disk on the date of a remaining snapshot.
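One way to read the grooming rule above is as a set difference: a stored file version becomes eligible for removal only when no remaining snapshot references it. The sketch below illustrates that reading; the `groomable` function and the version labels are hypothetical, and the actual grooming policy may involve more than this.

```python
def groomable(stored_versions, remaining_snapshots):
    """Return stored versions that no remaining snapshot references.

    stored_versions: iterable of version labels held in the backup set.
    remaining_snapshots: iterable of sets, one per snapshot still kept,
    each listing the versions that were on disk on that snapshot's date.
    """
    referenced = set().union(*remaining_snapshots)
    return set(stored_versions) - referenced


# The same file was stored twice: once before and once after the ACL cleanup.
stored = {"report.doc@old-acl", "report.doc@new-acl"}
snapshot_before = {"report.doc@old-acl"}
snapshot_after = {"report.doc@new-acl"}

# While both snapshots remain, nothing can be groomed.
kept_both = groomable(stored, [snapshot_before, snapshot_after])

# Once the older snapshot is gone, the old-permission copy becomes eligible.
kept_latest = groomable(stored, [snapshot_after])
```

Read this way, the older "wrong" permission copies cannot be dropped selectively; they go away only when every snapshot that recorded them has itself been groomed.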