Tinsun

Back up OS X /Users/ without scanning the entire HD?


Hi!

This has to some extent been discussed in Topic#27292, but my question is slightly different:

Since I have about 100 OS X machines that I want to back up, I'd rather add them to folders and then set the /Users/ folder to be backed up on all of them. This is, of course, doable either by path or by folder name, but it takes a lot of extra time to scan the entire volume each time.

 

What I want is for it to ONLY scan and back up the /Users folder, skipping the rest of the folders in /. Is that possible, and if so, how?


You create a subvolume that contains just the Users folder.

You have to do that for each client, but it's quite simple. (You do it on the server; see the manual.)

Then you create a script that backs up the subvolumes only.


I know it's quite simple (albeit time consuming), but that's what I wanted to avoid. In this day and age, I would assume it's possible to set a default folder for a set of computers rather than for each one individually, especially since I surmise /Users is the most common folder to back up on OS X Macs. But it seems that isn't possible at all?

Scanning through all the files is wholly unnecessary in my case, and seeing as OS X contains hundreds of thousands more files than OS 9 did, that is something I want to avoid.

On the whole, I feel Retrospect is much the same as when I ran it ten years ago, nothing more, nothing less. What about saving filter rules, for example, so they can be used for several backup sets?

 

But thanks for your reply.

What about saving filter rules, for example, so they can be used for several backup sets?

When using a filter (or "selector", as it's called), the entire hard drive is scanned first, and THEN the selector is applied.
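To illustrate the difference (this is a hypothetical sketch, not Retrospect's actual code): scanning everything and applying a selector afterwards visits every directory on the volume, while pruning the walk to /Users never touches the rest of the disk. The function names and the simplified prefix check are my own.

```python
import os

def scan_then_filter(root, keep_prefix):
    """Walk the WHOLE tree, then keep only paths under keep_prefix.

    This mirrors the selector behavior described above: every directory
    is visited even though most matches are discarded.
    """
    visited = 0
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        visited += 1
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Simplified prefix test for illustration only.
            if path.startswith(keep_prefix):
                matches.append(path)
    return matches, visited

def pruned_scan(root, keep_prefix):
    """Walk only the subtree we actually want to back up."""
    visited = 0
    matches = []
    for dirpath, dirnames, filenames in os.walk(keep_prefix):
        visited += 1
        matches.extend(os.path.join(dirpath, n) for n in filenames)
    return matches, visited
```

Both return the same file list, but the pruned walk visits only the directories under the kept prefix, which is what a subvolume effectively gives you.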


The Retrospect feature of cross-client file de-duplication should result in each OS X system file being backed up only once in total across all 100 machines.

 

Not only is this an incredible space saving, it also means you have a FULL backup of each and every machine, including all its apps.

 

Yes, each machine will be scanned in full, but full backups using the "All Files Except Cache Files" selector are feasible and efficient, and give maximum protection.

it is feasible and efficient to allow full backups using the "All Files Except Cache Files" selector for maximum protection.

Well, I'd check that selector if I were you. As distributed on the Retrospect Mac 6.x CD, it was wrong for apps under Mac OS 10.4.x; tweaking was required for the locations of cache files.

 

Russ

The Retrospect feature of cross-client file de-duplication should result in each OSX system file only being backed up once in total across all 100 machines.

 

Unfortunately, this feature does not work presently and as far as I know has never worked, at least on the Windows server. :(


This is getting OT, but I know for sure this feature works on Windows servers: our backups would be at least twice as large if it didn't. What makes you think it doesn't?

This is getting OT, but I know for sure this feature works on windows servers... our backups would be at least twice as large if it didn't. What makes you think it doesn't?

 

Are you really, really sure it works?

 

Ironically, I'd reckon your backups would have to be even larger than you're supposing: every computer you back up will probably have some common set of identical files in the Windows and Program Files directories. If you had 25 computers in a backup set, that's 24 duplicated sets of files. OTOH, if the user files on these clients were very sizable, you wouldn't notice the size of the common files as much.
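As a back-of-the-envelope check of that "24 duplicated sets" point (the sizes below are made up for illustration, not measured): with 25 clients each carrying the same OS and Program Files data, de-duplication stores the common set once instead of 25 times.

```python
# Hypothetical sizes, chosen only to show the shape of the savings.
clients = 25
common_gb = 4.0   # assumed identical Windows + Program Files data per client
unique_gb = 10.0  # assumed user data unique to each client

# Without de-duplication, the common set is stored once per client.
without_dedup = clients * (common_gb + unique_gb)

# With de-duplication, the common set is stored exactly once.
with_dedup = common_gb + clients * unique_gb

print(without_dedup, with_dedup)  # 350.0 vs 254.0 GB
```

As the poster notes, the more unique user data each client has, the less the duplicated common files dominate the total.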

 

Anyway, at one point a while ago, a user here claimed that Retrospect has this feature: http://forums.dantz.com/showtopic.php?tid/25257/post/97758/hl//fromsearch/1.

 

Since that contradicts my years of experience with Retrospect, I decided to test it out: I rigged up a copy of Windows on my Mac, then cloned this copy. I installed Retrospect on each of them and backed up the first one to a fresh disk set. When I then backed up the second one, according to the believers in this feature I should have seen little increase in the size of the backup, but it in fact doubled in size. Just in case the feature was somehow intentionally constrained to user files, I copied a big file to the desktop on each of them, and when I ran the backups again the size of the backup indicated that the file had been copied twice. I also watched Retrospect during the backup of the second client, and there was no question it didn't care that it had backed up everything less than an hour before.


We've got 500 GB worth of full backup data from different servers, which comes to 690 GB worth of backup sets, with each server in its own set. This trims down to 225 GB when they all get done to the one set for one offsite backup. So cross-client de-dupe does exist and work, but it is possible to get test results that say otherwise.

we've got 500GB worth of different servers full backup data, goes to 690GB worth of backup sets, each server in it's own set. This trims down to 225GB when they all get done to the one set for one offsite backup. So the cross client de-dupe does exist and work, but it is possible to get test results that say otherwise.

How does this math work? If the total of one copy of everything is 225 GB, then the backup of all of them together couldn't be more than 500 GB, unless you're referring to multiple backups to get to 690 GB. And that sounds like an awful lot of duplicated data. Are you by chance compressing the 225 GB set and not the others?

 

Anyway, it's pretty interesting that you can get this behavior, given how trivial it is for me to reproduce that it doesn't work. How exactly does the 225 GB set get made? A fresh backup, or are you doing snapshot transfers?


Snapshot transfers: multiple servers, same OS. The 690 GB is the total space used including history. The total server volume is closer to 400 GB, and we do exclude some stuff (such as WSUS downloads) from making it into the 225 GB set, but by and large cross-client matching does work...

but by and large cross-client matching does work...

 

...under specific circumstances. Those are snapshot transfers, so it might be that it works when merging snapshots, but it clearly does not work on ordinary disk-set backups. When I eventually get some time, I will try to test this.


I ran numerous tests on cloned systems to determine the story behind Retrospect and deduplication.

 

First, I got a different result. :blush: Retrospect did, in fact, de-duplicate the data files. I don't know why it didn't do that the first time I tested it, when it copied everything twice each time. Perhaps there was something funky going on that I just didn't pick up on, or I made a mistake in the test.

 

But this time, I tried repeatedly with different data files to trip it up, and each time it de-duplicated them.

 

And it also de-duplicated data files when merging snapshots, as MRIS has noted.

 

But if you're reading carefully, you know there is a but :tongue2:

 

Each and every time, it gave special treatment to the Windows directory and copied its contents verbatim, with no de-duplication. I had no trouble "tricking" Retrospect into copying "data" files in the Windows directory twice. This arrangement can't be a bug; it has to be intentional. That, incidentally, is how I noticed the lack of de-duplication: many of my client desktops have only Windows and Program Files on C:, with the data on another partition.

 

So the truth is that for Windows clients, we have a feature that works, but only up to a point. I'm not really sure the way it's rigged is as helpful as it could be. After all, every Windows client is going to have a Windows directory, and I can't believe the files there are somehow so special that they can't be subjected to de-duplication. We are, after all, trusting Retrospect to get it right with our own data!

 

At the same time, how many administrators have users with identical, de-duplicatable files distributed among their clients, other than the contents of the Program Files directory?

 

Unfortunately, I didn't test a Mac. The /System directory is fairly large, and there is a similarly large /Library directory. If Retrospect avoids those too, that's a lot of de-duplication we're missing out on.

