
disk media set - best practices regarding members, grooming?



i have a backup set which is used to back up a relatively small (i assume) number of different users' mac laptops (half a dozen). i do backups multiple times per week. my desire is to keep as long a history as my disk space will allow (at least several years).

 

i have media sets that have grown to over 10M files and several TB in size. it is really hard for me to gauge where that falls on the scale of "small" to "large". it seems like a lot to me, but i know retrospect is being used in companies to back up hundreds of users, so i guess (hope) that this is well within the design limits of retrospect.

 

some questions:

 

1. given a desire to store x files and y bytes in a disk media set, is there a point where it is distinctly better to have multiple smaller members vs. a single big member?

 

2. i understand that it is possible to enlarge a disk media set either by adding a new member or by enlarging an existing member. i also believe that it may be possible to reduce the size of a media set by reducing the size of a member when the member has excess capacity. i have a situation where for one of my media sets, i would actually like to reclaim some storage - can i set the size of a member to a lower value than the current "used" space and then have retrospect shrink the member next time it does a groom?

 

i would understand if this were not supported. i observe, however, that retrospect will let me enter a lower value (using the edit function for an existing media set and setting its size to a smaller value). if "shrinking" a media set were not possible, one might expect that a simple check would be done to prevent entering a value smaller than the current size. when i do this, there is a long delay, after which it appears in the media set list view that the media set has been resized so that its "capacity" and its "used" values are the same and "free" is zero. however, if i inspect the member again (by using the edit button), it appears that the member size is unchanged. so in addition to these two values being inconsistent, neither is what i really wanted.

 

3. is grooming really a "workable" feature for me to use in this situation? ideally, i would like to just provide retrospect with a storage budget and then keep backing up user files until i reach that budget, then start thinning or deleting files. based on the manual, i think this is roughly what grooming is supposed to do; however, this usage pattern does not seem to match the example backup strategies that i see discussed, which usually involve periodic recycling of media sets. specifically, i'm wondering if possibly 10M files is above some reasonable threshold for this. one reason i ask this is that with grooming turned on, i find that my catalog files become really large - upwards of 50GB. intuition tells me that doing anything with a file that large is going to take a long time to process or "groom". anyone have any experiences they can relate about whether grooming is workable in this situation?
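
to make the question concrete, here is the behavior i'm hoping a "storage budget" groom would have, written as a little python sketch. this is just my mental model - the Backup type and groom_to_budget function are invented for illustration, not anything in retrospect:

```python
# sketch of a budget-driven groom as i imagine it - invented names,
# not retrospect's actual algorithm
from dataclasses import dataclass

@dataclass
class Backup:
    source: str        # which laptop the backup came from
    timestamp: float   # when it was taken (newer = larger)
    size_bytes: int    # space its files occupy in the media set

def groom_to_budget(backups: list[Backup], budget_bytes: int) -> list[Backup]:
    """Keep the newest backups that fit the budget; groom the rest."""
    kept, total = [], 0
    for b in sorted(backups, key=lambda b: b.timestamp, reverse=True):
        if total + b.size_bytes > budget_bytes:
            break   # this backup and everything older gets groomed away
        kept.append(b)
        total += b.size_bytes
    return kept

# note: real grooming must be harder than this, because files shared by
# several backups can only be reclaimed when the last backup that
# references them goes away
```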

 

4. how does grooming really work on an ongoing basis? if my media set gets full and the grooming option is turned on, what happens? is grooming a big batch operation that gets invoked "once in a while" to free up a big chunk of space? or is grooming an incremental operation that can delete files incrementally to reclaim just the space it needs to add new incoming files to a media set? when i hit the "groom" button, i observe that it takes a very, very long time to finish, but i don't know if that is the same as what happens when grooming is invoked when a media set fills up during execution of a backup script. if i reach my capacity on a media set, should i expect to incur this kind of delay on each backup? or is it only the first grooming operation that takes a long time?
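
to state the question more precisely, the two models i can imagine look something like this (again just a sketch - the MediaSet class is invented to frame the question, not a retrospect api):

```python
# two possible grooming models, sketched with an invented stand-in

class MediaSet:
    def __init__(self, capacity_bytes: int, backup_sizes: list[int]):
        self.capacity_bytes = capacity_bytes
        self.backup_sizes = backup_sizes   # oldest first

    def free_bytes(self) -> int:
        return self.capacity_bytes - sum(self.backup_sizes)

    def delete_oldest_backup(self) -> None:
        self.backup_sizes.pop(0)

def batch_groom(ms: MediaSet) -> None:
    """'once in a while' model: free a big chunk (say 20% of capacity)
    so that many future backups fit without grooming again."""
    while ms.backup_sizes and ms.free_bytes() < ms.capacity_bytes // 5:
        ms.delete_oldest_backup()

def incremental_groom(ms: MediaSet, incoming_bytes: int) -> None:
    """'as needed' model: reclaim only what this backup needs right now."""
    while ms.backup_sizes and ms.free_bytes() < incoming_bytes:
        ms.delete_oldest_backup()
```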


Here is how we do it:

Back up every night to five different Disk media sets. The media sets are groomed every weekend.

 

Once a week, we transfer the latest snapshot of each source to tape.

These tapes are stored in another part of the building in a fireproof safe designed for magnetic media. (A fireproof safe for documents is not good enough.)

The tapes are reused after two months. However:

Once a month, one set of tapes is moved off-site and stored for at least one year before being reused. The exception is the April and October tapes, which are stored for ten years.
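
For clarity, the retention rule works out to something like the following sketch (the function is just an illustration of our schedule, not anything we actually run):

```python
# Our tape retention rule, written out for clarity - illustration only
def retention_months(tape_month: int, goes_offsite: bool) -> int:
    """Months a tape set is held before reuse.

    tape_month   -- calendar month (1-12) the set was written
    goes_offsite -- True for the one set per month moved off-site
    """
    if not goes_offsite:
        return 2               # normal rotation: reused after two months
    if tape_month in (4, 10):  # the April and October sets
        return 120             # kept for ten years
    return 12                  # other off-site sets: kept at least a year
```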


This is what I've found:

1) Make sure you have at least 4 backups of each system in the set - we had it set to groom at 2, but there are issues with that, especially with constant grooming and possible data loss.

2) We have over 100 machines spread over 9 backup sets (we back up every two weeks via proactive backup). The sets range from 5M+ to 10M files and 1.2 to 2TB in size.

3) Catalog file rebuilds take a long time (18-24 hours on 1.4+ TB backup sets)...yeah, it's a pain.
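
To put those rebuild times in perspective, here's a quick back-of-the-envelope calculation of the implied throughput:

```python
# rough throughput implied by an 18-24 hour rebuild of a 1.4 TB set
set_bytes = 1.4e12                      # a 1.4 TB backup set
for hours in (18, 24):
    mb_per_s = set_bytes / (hours * 3600) / 1e6
    print(f"{hours}h rebuild = {mb_per_s:.0f} MB/s")
# ~22 MB/s at 18h, ~16 MB/s at 24h - the rebuild is effectively reading
# the whole set at modest disk speed, so the time scales with set size
```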

 

So...to your questions...

When we want to reduce the size of a backup set, we groom off data, then transfer the backup set to another location, delete and recreate the original, and transfer back.

Also - be careful when adding another location to a backup set.

We were told that doing so locks the first location out of being adjusted...that was a big issue for us, so we went to single locations for backup sets.

 

I'm sure folks with more knowledge will be able to clearly answer the questions. I just wanted to throw a couple of things out there as we had issues with grooming and set size.

 

Note - now that we reset things...the backup sets appear more stable.


follow-up/clarification from the original poster.

 

first, thanks for the ideas shared so far.

 

clarification: i don't currently own a tape drive and would really prefer not to go that route. as further explanation on this point, i have a large, relatively slow storage unit (a drobopro actually) that i can throw disks into as needed - essentially pooled storage with raid-like redundancy to protect against drive failure. somewhat easier to manage, though generally much slower than "conventional" raid. i also back up a smaller set of "critical" data offsite.

 

clarification: i'm trying to use retrospect's "groom to retrospect defined policy" option, which sounded like it would accomplish what i wanted in terms of keeping around files for as long as possible, subject to a constraint on the total size of the media set. from reading only the short description in the user guide, i perceive a significant difference between this option and the "save last n backups" option. "save last n backups" states clearly what it will do: you have a large storage set, you run groom, and at the end you are left with at most the n most recent backups of each source - pretty straightforward. the wording in the docs about "retrospect defined policy", on the other hand, is more suggestive of a priority ordering for deleting files (somewhat like what Time Machine does) rather than a crisp description of the post-groom state.
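
to illustrate the difference as i read it (both the code and the comments below are my interpretation of the short descriptions in the user guide, not retrospect's actual implementation):

```python
# contrast between the two groom options as i read the docs - invented
# code, purely illustrative
from collections import defaultdict

# each backup is a (source, timestamp) pair; larger timestamp = newer
def save_last_n(backups: list[tuple[str, float]], n: int) -> list[tuple[str, float]]:
    """'save last n backups': the post-groom state is crisply defined -
    at most the n most recent backups of each source survive."""
    by_source = defaultdict(list)
    for src, ts in sorted(backups, key=lambda b: b[1], reverse=True):
        by_source[src].append((src, ts))
    return [b for blist in by_source.values() for b in blist[:n]]

# "groom to retrospect defined policy", by contrast, reads like a
# deletion *priority* (oldest / most redundant data goes first) applied
# until enough space is free - so the post-groom state depends on how
# much space was needed, much like time machine's thinning
```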

 

a side observation: because of the policy i described above, it should not be surprising that these media sets i'm using are rather old - they date back several years (created under retrospect 8.x). i wonder if some of the behavior/problems i'm observing are caused by them not being in the latest/greatest format. might things improve if i created new media sets and did copy backup operations to carry the data over?

 

some additional details: i currently keep two separate backup sets, in case corruption or something else befalls one. they both go back several years. they are around 2.5TB in size each, and between 10M and 15M files each. i gather from comments that those numbers are pretty high.

 

also, i'm not running on a high-powered machine currently (core2duo processor and 6GB of RAM, although it is entirely dedicated to retrospect).

 

i realize that there are other rotation-type strategies that a) will work and b) with appropriate choices of parameters would probably be deemed "good enough" for almost any reasonable purpose. i also realize there are good reasons why many organizations would not opt for a strategy like the one i describe. however, for my needs, this seems like a very simple and straightforward approach. rather than arbitrarily throwing away backup data older than a certain age and continually creating "fresh" backups (incurring the extra work of re-copying previously backed-up data over and over), i simply want to keep incrementally adding to a set until i've used up available space, then start throwing away the oldest stuff. for what it's worth, this model is similar to the model apple provides with time machine.

 

that said, i know that every real-world application has its practical limits, whether explicitly documented or merely shared through best practices discussions like this one. prior to this discussion i honestly had no idea whatsoever whether 10M files in one backup set would be considered a small, medium, or astronomically large value.

 

what i do observe, in summary:

with grooming enabled, catalog files are very large.

incremental backups that used to take a few minutes now take a few hours, with most of the time spent not in the copying stage but in preparing/matching.

grooming is "questionable" at best

 

i haven't yet observed whether grooming will actually work properly or perform "reasonably". up to now, i've mostly avoided the problem by enlarging media sets as they started to approach capacity. the few tests i've done by manually invoking the groom function on media sets have not been conclusive but do cast some doubt.

 

in a nutshell, i am wondering if the above observations are "warning signs" that i've gone outside the practical operating limits of the program.


a side observation: because of the policy i described above, it should not be surprising that these media sets i'm using are rather old - they date back several years (created under retrospect 8.x). i wonder if some of the behavior/problems i'm observing are caused by them not being in the latest/greatest format. might things improve if i created new media sets and did copy backup operations to carry the data over?

 

>the above may be the case. I recall reading where older sets in newer software had issues. You may want to research that on the forum.<

 

also, i'm not running on a high-powered machine currently (core2duo processor and 6GB of RAM, although it is entirely dedicated to retrospect).

>I know that the catalog file can grow to 8+ times its size when it's expanded for the backup to occur (if you are set up as such). We had to load a bunch of RAM to assist with this. Once done, it's much more stable<

 

>as for the groom by prescribed setting - we do that on servers and such (with another product), but not on desktops<

 

Again, I'm mainly sharing what we've found in our system.

 

Hmmmm, did you ever think of archiving the whole Drobo, getting another, and starting new backup sets?

