cmcfarling

Grooming fails to start in some cases


I've been posting to another thread some issues I've had with Retrospect Multiserver 7.5.387.

 

http://forums.dantz.com/ubbthreads/showflat.php/Cat/0/Number/104567/an/0/page/

 

While troubleshooting those problems I've come across a couple of other ones, so I started this thread.

 

On page 257 of the UG the following is stated under the Snapshots Tab section:

"When you forget a snapshot from a disk backup set with grooming enabled, Retrospect deletes the selected snapshot and it's associated files."

 

page 258:

"NOTE: Snapshots with the lock icon are protected from grooming and cannot be groomed until they are unlocked. They can be forgotten."

 

"WARNING: Once you click forget, Retrospect *will* groom the snapshot and it's associated files"

 

So let's say I open the Snapshots tab of a backup set. I then highlight a single locked snapshot and click Forget. A confirmation dialog box pops up and I confirm that I want to forget the snapshot. I then close the backup set window and the following warning message is displayed:

 

"The snapshot(s) you forgot will be groomed from the backup set to reclaim disk space, when do you want to groom?" [Now] [Later]

 

Wait a minute, didn't the UG say that locked snapshots won't be groomed? Well, I guess it also said that once you click forget, Retrospect *will* groom the snapshot and its associated files. It would seem that if a locked snapshot was forgotten there should be no message coming up claiming that it's going to be groomed. I guess that message just comes up no matter what. I think Retrospect should be smart enough to know if it's really supposed to groom or if it's supposed to just forget the snapshot. So back to the message. Since my only options are Now or Later, I click Now. At that point nothing happens, which is really what I wanted anyway; I didn't want it to actually groom anything.

 

So now let's say I do want to groom a snapshot using this method. Back in the snapshot tab I highlight a snapshot, this time one that is not locked. I click Forget, the confirmation box comes up and I confirm. I close the backup set window and again see

 

"The snapshot(s) you forgot will be groomed from the backup set to reclaim disk space, when do you want to groom?" [Now] [Later]

 

I click Now but this time nothing happens either. No grooming operation is initiated, although this time I did want a groom operation to occur. No Event or History entry either, btw.

 

So basically this whole mechanism for kicking off a grooming operation from the snapshots tab has both some UI problems and some functional problems. On top of that the documentation could probably stand to be a little clearer.

 

On a similar note, one is supposed to be able to initiate a groom operation from the Options tab by changing the grooming policy parameters. For example, say the grooming policy is set to Retrospect defined policy. If I change that to a user defined number of snapshots and then close the backup set window, the following message is displayed:

"Grooming preference has changed. Retrospect will retrieve older snapshots to make sure they do not get groomed out"

 

By hitting OK you would think that something would show up in the Executing tab showing the progress of the snapshot retrieval, and then maybe an entry in the History and/or the Events tab reporting failure or success. However, none of that happens. Retrospect never does what it said it was going to do. I've seen the same behavior if the number of saved snapshots is changed on the Options tab. For example, say I change the config from storing 10 snapshots to 12 snapshots. Again the "Grooming preference has changed..." window comes up and then nothing happens. It simply doesn't work.


And yet another instance of this behavior...

 

On a backup set I changed the "Groom to remove backups older than" setting to 70 from 35. As noted above the "Grooming preference has changed. Retrospect will retrieve older snapshots to make sure they do not get groomed out" dialog box came up. After clicking OK nothing happened.

 

Hmmm, maybe forcing a groom at this point will cause those older snapshots to be retrieved. From the Options tab I clicked the Action button. From the list of potential actions I selected Groom and clicked OK. Now in the past I have done this very thing and it worked. What's supposed to happen is the backup set properties window automatically closes at this point and the groom action shows up in the Executing tab. However, that is not working now. Retrospect just ignores my request like it never happened.

 

Ok, let's go to plan C. I'll create a groom script and force it to groom that way. So I go to Manage Scripts and edit my Groom script that has already been set up for this type of thing. I set it so that there is just a single source, the backup set that I'm attempting to groom, and click OK. From the Run menu I select Groom and click Execute. Finally I'm going to force Retrospect to groom this backup set. Oh wait a minute, not so fast. The groom operation goes to the Waiting queue with a status of "Waiting for Production1" (my backup set name in this case). Why is it waiting???? Production1 is not in use by any other operation. Unfortunately I have seen this behavior before too. This will sit in the Waiting queue indefinitely. Once before when this happened I let it go to see if it would eventually execute. It did not. It happily sat there for a couple of days before I cancelled it.

 

In fact this has happened enough for me to develop a workaround. There is a Proactive backup set up in Retrospect too. Note that the Production1 backup set has nothing to do with the Proactive backup. For whatever reason, stopping the Proactive backup has, in the past, triggered the waiting groom operation to start. (That makes perfectly good sense, doesn't it?) I bet that if I stop the Proactive backup, the waiting groom operation will execute. Let's try...

 

Hmmm, I guess I was wrong; the Production1 groom operation is still waiting. However, once I stopped the Proactive backup, a grooming operation for a backup set named Operations1 started. Where in the hel* did that come from????

 

This is completely messed up. Instead of moving toward a "set it and forget it" approach to backing up, Retrospect seems to be going toward a "set it, hope it works, reset it, hope some more, then keep resetting it every so often" approach.

 

Maybe EMC will give me a free upgrade to v8 for beta testing this software?


Update...

 

Grooming of Operations1 completed successfully after 40 minutes. The waiting groom operation of Production1 didn't budge though, it's still in the waiting queue. I thought I'd stop the Proactive backup again to see if that would trigger it. Well it did trigger a groom of Production1 in fact. However it wasn't the groom operation that was sitting in the waiting queue, that's still there. In the meantime Production1 is being groomed though. The only thing I can think of is that the operation that's executing now stems from one of the times that I tried to initiate a groom from the Action window and nothing happened. As if Retrospect remembered that I wanted to do that and just decided that now was the time to start it because the proactive backup had been stopped. To be honest I can't remember if I had actually tried to manually groom this backup set though. It's hard to keep track of all of the things that have gone wrong.


Update...

 

The Production1 groom operation finished after 2h 23m at 1/7/2008 8:18pm. As of 1/8/2008 7:43am the Production1 groom operation that was stuck in the waiting queue is still there. Let's try stopping the proactive backup again and see what happens...

 

Amazing, after stopping the proactive backup, a groom operation for yet another backup set, Production3, has started executing. Again, where did this mysteriously come from? And of course the Production1 groom operation in the waiting queue is still there.

 

It just gets better and better (or actually worse and worse).


Update...

 

The Production3 groom operation finished after 16m at 1/8/2008 8:00am. As of 1/8/2008 9:47am the Production1 groom operation that was stuck in the waiting queue is still there. Let's try stopping the proactive backup yet again and see what happens...

 

Finally!! The Production1 groom operation moved from the waiting queue to the executing queue.

 

I don't know what else to say that hasn't been said. There are some serious issues with this software to the point where managing groom enabled disk backup sets is almost undoable.


If you recall from post #105225 - 01/07/08 01:37 PM above, I had changed the snapshot count from 35 to 70 for the Production1 backup set. That's when Retrospect told me it was going to retrieve older snapshots so they wouldn't get groomed out, which of course it did not do. Then this whole progression of trying to force Production1 to groom started with the intent that maybe Retrospect would retrieve those older snapshots as part of the groom process.

 

Well... that didn't work out so good. The backup set went from having 83 snapshots total to just 40 now. In other words, after changing the snapshot count from 35 to 70, the backup set contained 35 active snapshots and 48 inactive snapshots. Retrospect was supposed to retrieve 35 of the 48 inactive snapshots to make the total number of active snapshots 70. After doing so a groom operation would have groomed out the remaining 13 snapshots. Instead those 35 (+ the 13) were never retrieved and were groomed out leaving just the original 35 active snapshots. At this point 5 new snapshots have been created for a total of 40.
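The snapshot counts above can be checked with a short Python sketch. It only does bookkeeping on the numbers from this post; the function and variable names are made up for illustration and say nothing about how Retrospect works internally.

```python
def expected_after_policy_change(active, inactive, new_limit):
    """Return (retrieved, groomed, active_after) for the behavior the dialog promised.

    Hypothetical helper: models "retrieve inactive snapshots up to the new
    limit, then groom whatever is still inactive".
    """
    retrieved = min(inactive, max(new_limit - active, 0))
    groomed = inactive - retrieved
    return retrieved, groomed, active + retrieved


total_before = 83
active = 35
inactive = total_before - active  # 48 inactive snapshots

retrieved, groomed, active_after = expected_after_policy_change(
    active, inactive, new_limit=70
)
print(retrieved, groomed, active_after)  # 35 13 70

# What actually happened: nothing was retrieved, all 48 inactive snapshots
# were groomed, and 5 new snapshots have been created since.
actual_total = active + 5
print(actual_total)  # 40
```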

 

So not only does grooming fail to start sometimes, you will potentially lose data unintentionally if you can get it to work.

 

Sorry EMC but it may be time to look at other options.


It is very unusual to have anyone saving 70 snapshots for EVERY hard disk you backup.

 

As an example, if you back up a C drive and a D drive:

 

The backup set will contain 70 Snapshots for drive C and 70 Snapshots for drive D. If your backup disk only has enough free space for 69 snapshots of drive C and 69 snapshots of drive D, then the grooming operation will report "Groomed Zero MB".

 

If you backup Drive C 71 times and Drive D 71 times, then the unique data that exists in the oldest snapshot will be groomed out. Grooming will only happen when the disk actually fills or if you schedule a grooming operation.
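A toy Python model of the "keep the last N snapshots per source volume" rule described above may make the example concrete. This is only a sketch under that stated assumption: `run_backups` is a hypothetical helper, not Retrospect code, and it only shows which snapshots fall outside the limit, not when a groom would actually run or how much disk space it would free.

```python
from collections import defaultdict, deque


def run_backups(volumes, runs, keep):
    """Simulate `runs` backups of each volume, keeping the newest `keep` snapshots.

    Returns (kept, dropped): per-volume snapshot numbers still inside the
    limit, and per-volume snapshot numbers that fell outside it.
    """
    kept = defaultdict(lambda: deque(maxlen=keep))
    dropped = defaultdict(list)
    for run in range(1, runs + 1):
        for vol in volumes:
            snapshots = kept[vol]
            if len(snapshots) == keep:
                # The oldest snapshot is about to be pushed out of the window.
                dropped[vol].append(snapshots[0])
            snapshots.append(run)
    return kept, dropped


kept, dropped = run_backups(["C", "D"], runs=71, keep=70)
print(dropped["C"])    # [1]
print(len(kept["D"]))  # 70
```

With 70 or fewer runs per drive, nothing falls outside the limit in this model, so there would be nothing for the policy to remove.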

 

You mention that Retrospect may not have retrieved all of the snapshots after you made a setting change. Are the "inactive" snapshots all IDENTICALLY named from the same source volume?

 

In my experience, keeping even 35 snapshots is a big number, and most users find that grooming works best when you keep fewer snapshots, so a reasonable amount of data can get groomed at the point your disk gets full.


Quote:

It is very unusual to have anyone saving 70 snapshots for EVERY hard disk you backup

 


Why? I've had a similar setup for years when backing up to tape. The source volumes are publicly shared volumes. Users on production desktops work directly from these network volumes. Since files are constantly changing, being deleted, added, etc. throughout the day, backing up the sources once a day is not adequate for an effective recovery policy. For these production volumes, I'm backing them up 5 times a day at approx. 5 hour intervals. Keeping 70 snapshots per volume allows at least 2 weeks of backed up data. On the other hand, I also have backup sets which back up just once per day and are set to keep 21 snapshots (3 weeks).

 

Here's an overview of my setup

 

BACKUP SET     SCHEDULE                          SNAPSHOTS TO RETAIN   SIZE LIMIT
----------------------------------------------------------------------------------
Production1    Sun-Sat 1AM,6AM,11AM,4PM,9PM      70                    1800G
Production2    Sun-Sat 3AM,8AM,1PM,6PM,11PM      70                    3000G
Production3    Sun-Sat 5AM,10AM,3PM,8PM,2AM      70                    1500G
Operations1    Sun-Sat 8PM                       21                    100G
Operations2    Sun-Sat 8PM                       21                    200G
Databases1     Sun-Sat 12AM,12PM                 28                    100G
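For reference, the retention window each row works out to can be computed with a few lines of Python. This is just arithmetic on the table, assuming one snapshot per source volume per scheduled run and that every scheduled run completes; the names are illustrative only.

```python
# (backups per day, snapshots to retain), copied from the table above
backup_sets = {
    "Production1": (5, 70),
    "Production2": (5, 70),
    "Production3": (5, 70),
    "Operations1": (1, 21),
    "Operations2": (1, 21),
    "Databases1": (2, 28),
}


def retention_days(backups_per_day, snapshots_to_retain):
    """Days of history covered if every scheduled backup yields one snapshot."""
    return snapshots_to_retain / backups_per_day


for name, (per_day, keep) in backup_sets.items():
    print(f"{name}: {retention_days(per_day, keep):.0f} days")
```

So the Production sets cover 14 days, the Operations sets 21 days, and Databases1 14 days.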

 

Quote:

If you backup Drive C 71 times and Drive D 71 times, then the unique data that exists in the oldest snapshot will be groomed out. Grooming will only happen when the disk actually fills or if you schedule a grooming operation.

 


 

Yes I realize that

 

Quote:

You mention that Retrospect may not have retrieved all of the snapshots after you made a setting change. Are the "inactive" snapshots all IDENTICALLY named from the same source volume?

 

 


 

Yes, they are identically named. And, I didn't mention that Retrospect may not have retrieved all of the snapshots, I mentioned that it didn't retrieve any snapshots.

 

What about all of the other issues with Retrospect NOT doing something it is supposed to do, as I've outlined in this thread? I can't help but get the feeling that no one at EMC is going to take 2 seconds to try to reproduce any of these issues. Am I wrong?

 

As I've been beta testing, er um, troubleshooting over the last month or so, the recurring theme from EMC tech support is that all of the problems are with my configuration, my hardware, my backup strategy, etc, etc, etc. If I am pushing Retrospect beyond its limits then how would I know? Where is the documentation stating what the limits are???


Grooming does not have a hard limit I can document. Every environment is going to be different. If I could provide you with hard limits, I would do so.

 

I see you had questions about locking snapshots. Locking a snapshot means that this locked snapshot will not be automatically groomed while it is in the snapshot list. If you delete the snapshot (even if it was locked), Retrospect assumes you don't need the snapshot anymore and you would like it to be groomed out.


If you read through this thread I'm not asking how things are supposed to work. I know how things are supposed to work. I'm asking why things don't work as they are supposed to.

 

It seems to me that for anyone trying to implement a backup system beyond something fairly basic, Retrospect tends to fall on its face, or at least require a lot of babysitting. There are several threads on this forum that support that statement, I think. That's not the kind of backup software most admins want to deal with. By posting my findings here, calling tech support on the phone, and emailing you directly, I was hoping to spark some interest from the support and/or development departments to try to make this software better. I don't know, maybe some of this has filtered through. From my perspective though, I get the impression that support doesn't want to be bothered with these "annoying inquiries by that guy who owns one copy of MultiServer." After all they have better things to do like get v8 out the door, right?

 

Nothing personal, it's just frustrating when you know something should be working better and there is no real recourse to make that happen, especially after investing a great deal of time and money.


I have read your different threads and all of your concerns. As the Senior Manager of technical support, I can say that your concerns and feedback about grooming are being heard and given to the appropriate engineers at EMC. I have updated and written bugs based on the grooming behavior you have reported. I do not have an easy solution outside of the best practices KB article you have already read.

 

Do we want to make grooming better? Yes, but it isn't something that will happen overnight. We have a development roadmap and bugs get logged and reviewed during specific segments of the development schedule. We do plan on tweaking and improving grooming for our next major Windows release.

 

It sounds like grooming is turning into an admin problem for you based on the specific issues you have encountered, and you may want to try to implement a strategy that does not involve grooming until we are able to make code changes. Specifically, you can schedule a Backup Set Transfer of specific snapshots into a new backup set, and then Recycle the prior backup set to free up disk space. I know this is not ideal, but it may be the best solution in the short term.

