Fulco Posted January 11, 2010

Grooming misery

For some years now, grooming "Disk Backup Sets" has been damaging the Backup Set. First a new "Disk Backup Set" is created (on a newly formatted NTFS hard disk, NOT compressed). After the Backup Set reaches its limit, errors start to appear. These range from ".. you must recreate the Backup Set's Catalog" to Assert errors. Sometimes a "Catalog repair" solves the issue. One Backup Set keeps on failing:

+ Executing Server Groom (HD2) at 1/9/2010 9:50 PM (Execution unit 1)
Grooming Backup Set Server (HD2)...
Backup Set format inconsistency (10 at 1105958796)
Grooming Backup Set Server (HD2) failed, error -2242 (Catalog File duplicated or ambiguous)
You must recreate the Backup Set's Catalog File.
See the Retrospect User's Guide or online help for details on recreating Catalog Files.
Can't compress Catalog File for Backup Set Server (HD2), error -1 (unknown)
1/9/2010 10:12:13 PM: 3 execution errors
Duration: 00:20:51 (00:00:35 idle/loading/preparing)

The line "Can't compress Catalog File for Backup Set Server (HD2), error -1 (unknown)" is really 'stupid', because the catalog isn't compressed at all. Every Recreate Catalog, followed by a Groom, gives the same result. There is also a 'Server (HD2).rbc.log' file.

I followed the advice in "Grooming Tips and Troubleshooting (9629)":
(1) >30 GB free on C: (system)
(2) Catalog files are stored on a separate disk (D:, not the disks containing .rdb files)
(3) C: (system) and D: (Catalog files) are continually defragmented. I personally (??) suspect the defragmenting of the disks containing the .rdb files as a cause of the damage to "Disk Backup Sets".
(4) ??"Don't groom too often"?? Guys, we are talking about computers, not about eating ice cream.
(5) 3, 4 or "Defined Policy"
(7) Done often
(8) 32 GB RAM
(9) 10% free disk space on disks containing .rdb files; C and D have more than 50% free space
(10, 11) All disks are in the server

These errors make Retrospect's Disk2Disk backup unreliable. Are these errors corrected in Retrospect 7.7? Any similar findings? Any tips?

Fulco
rcohen Posted January 11, 2010

I have found grooming to be very intolerant of read/write failures, more so than backups. In the past, I have had problems due to simultaneous FTPs or copies of backup results to another drive, and iSCSI on a non-segregated network. Defragging and anti-virus could also potentially interfere with read/write operations. Also, before the 64-bit version, running out of memory could corrupt a groom file.

Once I segregated iSCSI and scheduled the copy and FTP jobs so that they do not run during grooming, grooms have been very reliable. I run them once a week. When a groom does fail, you need to rebuild the catalog file.

Also, starting with 7.7, I have had occasional errors when running simultaneous jobs, so I had to drop to one execution unit.

If things are still acting up, try recycling the backup set and starting from scratch. Of course, you lose your backed-up data that way, unless you have another copy.
rhwalker Posted January 11, 2010

Fulco wrote: "These errors make Retrospect's Disk2Disk backup unreliable. Are these errors corrected in Retrospect 7.7?"

Here is the list of bug fixes for Retrospect 7.7: Retrospect 7.7 Release Notes
Fulco Posted January 11, 2010

Retrospect 7.7:
19582: Retrospect Defined Grooming Policy does not keep most recent backup for each week or month

So no other Grooming bugs were fixed.

Fulco
Ramon88 Posted January 12, 2010

Fulco,

'Strangely' we do groom our sets, and it almost always works correctly. Nowadays most of our backup sets are located on iSCSI storage, but some are on local storage. The only problem that sometimes surfaces is that Retrospect can't always groom out everything it needs to. In effect the storage seems to grow until it can't groom out enough to make a difference. In such a case we recycle the backup. We always have an A and a B set for this kind of backup, so recycling is not a problem.

Your problem seems to be something different, though. Is it possible for you to test with another server altogether? Actually I'm leaning towards a problem with your hardware. In the past we have had systems that operated correctly on their own but were less stable with Retrospect, due to the massive I/O it generates. Swapping memory solved that problem.

So far we haven't switched to 7.7, due to reliability issues with that version, so I can't really say whether EMC improved grooming. But I agree they probably didn't do much with it.

Ultimately you shouldn't see those grooming errors, so it must be something else: either hardware or indeed things like antivirus. To troubleshoot that effectively, you will need extra hardware, I'm afraid...
Fulco Posted January 12, 2010

One 'hardware' related problem I found is the interaction between Grooming and Adaptec's Power Management (APM). APM is a feature of Adaptec's RAID controllers that powers down hard disks (Volumes) that have not been accessed for a set amount of time. Normally this doesn't interfere with the Backup. But I found that Grooming can spend a long time (Matching) without accessing the disk containing the .rdb files. Possibly the following error is due to this:

+ Executing Hamlet Groom (HD1) at 11/18/2009 10:00 PM (Execution unit 1)
Grooming Backup Set Server (HD1)...
Groomed zero KB from Backup Set Server (HD1).
Grooming Backup Set Server (HD1) failed, error -1101 (file/directory not found)
You must recreate the Backup Set's Catalog File.
See the Retrospect User's Guide or online help for details on recreating Catalog Files.
Can't compress Catalog File for Backup Set Server (HD1), error -1 (unknown)
11/18/2009 10:01:13 PM: 2 execution errors
Duration: 00:01:06

Fulco
Ramon88 Posted January 14, 2010

Fulco, I presume you can switch APM off and try again?
Fulco Posted January 14, 2010

Yes, I tested Grooming with APM switched off. The errors (-1101) I mentioned in my last post disappeared. These still appear:

Backup Set format inconsistency (10 at 1105958796)
Grooming Backup Set Server (HD2) failed, error -2242 (Catalog File duplicated or ambiguous)
Can't compress Catalog File for Backup Set Server (HD2), error -1 (unknown)

Fulco
Ramon88 Posted January 14, 2010

Okay, maybe you have/had two simultaneous problems... Or your backup set was corrupted due to the 'APM problem'. Did you rebuild the catalog beforehand?
Fulco Posted January 14, 2010

I rebuilt the catalog several times. Recycling the backup set will be the only solution (I think).

Fulco
robvil Posted January 14, 2010

I believe this is a bug in Retrospect. I have backup set / catalog issues periodically... Sometimes a catalog rebuild solves it, and other times I have to recycle the backup set. Over the years I have replaced hardware, software and drivers, and nothing solves the problem. Even if the disk subsystem is under heavy I/O, Retrospect should not trash the data. And why blame other stuff when it's only Retrospect that has issues on our hardware?

Regards
Robert
Ramon88 Posted January 14, 2010

Robert, are you sure this is a bug? We do not see this error on our Retrospect servers. We very rarely have backup set/catalog issues. I'm not saying you are wrong, but on the other hand you might not be right either. Besides that, I believe some extra testing might resolve this problem for Fulco. If you have the same problem as Fulco, it might be interesting to check what you have in common (hardware- and software-wise). In the end it is a fact that Retrospect can tax your system's I/O pretty hard, so you might even see hardware-related errors that you otherwise wouldn't notice.
robvil Posted January 14, 2010

I am 100% sure it's not a hardware issue. On the same hardware I previously had a SQL DB running that did far more I/O and used far more memory than Retrospect does. I never had issues there. And again, even if the underlying disk system is doing heavy I/O, it should not trash data. To me it looks like a timing issue when Retrospect grooms data, one that Retrospect does not handle correctly.

Regards
Robert
Ramon88 Posted January 14, 2010

It is perfectly possible that MS SQL isn't taxing I/O as much as Retrospect does. Not all data is written or read in the same fashion. In my experience Retrospect grooming can use more system resources than SQL does. I agree this should not happen, but it can, and does. Remember that in the end it is the OS that does the writing and reading, not Retrospect. Taxing a system to the max for a sustained interval can lead to errors, regardless of whether Retrospect is involved.

It's also perfectly possible that this problem is Retrospect related. However, at this time I'm not convinced that Fulco's particular problem is 100% a Retrospect problem, or that it is the same problem you have. But I'm not shooting you down; after all, you might be right. I still think you and Fulco should compare notes. What do your setups have in common?
robvil Posted January 14, 2010

You might be right that it's not a Retrospect problem, but I suspect it is a Retrospect issue... I have had "unexpected end of data" during backup, where the only solution is to recycle the backup set, and I cannot figure out why this happens. It happens approx. 2 times a year, and I'm not running out of disk space or backup set space.

By the way, my SQL does generate far more I/O than Retrospect does. I am running multiple Firebird SQL databases, with 200 concurrent users on the largest DB doing heavy inserts, updates, reads, joins etc., 12 hours straight each day. I have approx. 400 users using 40 DBs each day, 365 days a year, and have never had issues like that on the same hardware.

Regards
Robert
Ramon88 Posted January 15, 2010

The only problem we have with grooming is that some backup sets fill up and can't be groomed out. There are probably limits to what can be groomed out, and thus the data in the storage set grows until it reaches the set limit.

We have at least seven MS SQL servers running (not counting development machines). Some have thousands of concurrent users. I'm not familiar with Firebird, though. I'm sure SQL can tax machines a lot, but there is also a lot of caching involved. Retrospect, while grooming, generates a tremendous amount of disk I/O, and the way it does that is probably different from the way SQL works. In other words, not all is equal. I've had 16-core machines with 48 GB RAM become temporarily unresponsive during a groom. MS SQL doesn't behave like that. It's quite 'intelligent' compared to Retrospect, I think.

I do remember that a couple of years back, when grooming was introduced, there were many grooming errors resulting in backup set corruption. However, this has been pretty much solved by patches and updates. But indeed it illustrates that the problem can be Retrospect related.

Because there are so many variables involved (kind of hardware, drivers, installed software, etc.), it is not easy to troubleshoot this kind of problem. But if you are seeing the same errors, you and Fulco might have something in common, which might be very useful information for EMC.
Fulco Posted January 28, 2010

Here it is again, with another Disk Backup Set:

+ Executing Server Groom (HD3) at 1/28/2010 4:14 PM (Execution unit 1)
Grooming Backup Set Server (HD3)...
Backup Set format inconsistency (10 at 690087248)
Grooming Backup Set Server (HD3) failed, error -2242 (Catalog File duplicated or ambiguous)
You must recreate the Backup Set's Catalog File.
See the Retrospect User's Guide or online help for details on recreating Catalog Files.
Can't compress Catalog File for Backup Set Server (HD3), error -1 (unknown)
1/28/2010 4:16:38 PM: 3 execution errors
Duration: 00:01:25 (00:00:31 idle/loading/preparing)

A few weeks ago Disk Backup Set HD2 gave the same errors. How do I get rid of these errors? A Recatalog followed by a Groom gave the same result (just like the last time with HD2). Could the Disk Defragmentation software have something to do with this?

Fulco
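When the same failure shows up on several backup sets (HD1, HD2 and HD3 in this thread), it can help to tally the failures per set and error code before comparing notes with other users. The following is a minimal Python sketch, assuming the operations log has been exported as plain text with one entry per line and that failure lines are worded exactly like the ones quoted in this thread. The function name `tally_groom_failures` is hypothetical and is not part of Retrospect.

```python
import re
from collections import Counter

# Pattern modeled on the failure lines quoted in this thread.
# Assumption: other Retrospect versions may word these lines differently.
FAILURE = re.compile(
    r"Grooming Backup Set (?P<set>.+?) failed, "
    r"error (?P<code>-?\d+) \((?P<reason>[^)]*)\)"
)


def tally_groom_failures(log_text):
    """Count groom failures per (backup set name, error code) pair."""
    counts = Counter()
    # Work line by line so that a progress line such as
    # "Grooming Backup Set Server (HD2)..." cannot merge with a later line.
    for line in log_text.splitlines():
        match = FAILURE.search(line)
        if match:
            counts[(match.group("set"), int(match.group("code")))] += 1
    return counts
```

A result dominated by one error code across different sets, such as -2242 here, points towards a common cause rather than one damaged disk.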
Ramon88 Posted January 28, 2010

Fulco wrote: "Could the Disk Defragmentation software have something to do with this?"

Hmm, nasty... I presume, from your remark, that you have some disk defrag tool running? It might be the culprit. Can't you switch it off and try again? Maybe start with a new backup set? For Retrospect storage I personally think defragmentation is not really needed.
Fulco Posted January 28, 2010

No, the Defragment program is not running during Groom and ReCatalog. The disk containing the Backup Set is defragmented once a week. This normally does NOT happen during a Backup. However, I can't rule out defragmenting taking place during a Groom action that is started automatically when a Backup Set reaches its capacity. But defragmenting stops automatically when the disk has a lot of I/O.

No, I suspect that moving files around (defragmenting) breaks the Backup Set. Our 3 Disk Backup Sets started showing these errors after a Defragmentation program was installed. This could be just a coincidence.

Fulco
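One way to check whether an automatically started groom ever coincided with the weekly defragmentation is to compare the groom's time span, taken from the log, with the defrag window. The Python sketch below assumes the start time and duration are formatted as in the log excerpts in this thread ("1/28/2010 4:14 PM" and "00:01:25"). The defrag window itself is not given anywhere in the thread, so any window passed in is purely hypothetical. The function names are illustrative only.

```python
from datetime import datetime, timedelta


def groom_interval(start_text, duration_text):
    """Return (start, end) of a groom from its logged start time and duration."""
    # Format assumed from the "Executing ... at 1/28/2010 4:14 PM" log lines.
    start = datetime.strptime(start_text, "%m/%d/%Y %I:%M %p")
    hours, minutes, seconds = (int(part) for part in duration_text.split(":"))
    return start, start + timedelta(hours=hours, minutes=minutes, seconds=seconds)


def overlaps(first, second):
    """True if two (start, end) intervals share any time."""
    return first[0] < second[1] and second[0] < first[1]
```

For example, `overlaps(groom_interval("1/28/2010 4:14 PM", "00:01:25"), defrag_window)` would flag a groom that ran while the defragmenter was scheduled, where `defrag_window` is a `(start, end)` pair of `datetime` values taken from the defrag tool's own schedule.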
Ramon88 Posted January 28, 2010

It depends a bit. Some defrag tools can dig deep into the system, and it's not always known whether they are 100% reliable in every working condition. Grooming didn't always work when it was first introduced. But nowadays I find it very reliable (7.6.123). I can't really imagine not having it anymore! However, to remove disk defragmentation from the equation, would it be a real problem to remove that software from your setup? It's a really nasty problem to troubleshoot...
Fulco Posted February 10, 2011

Grooming is still failing. Every time a disk backup set reaches its capacity, Retrospect stops or crashes. When will Roxio fix this?
Lennart_T Posted February 10, 2011

Fulco wrote: "Grooming is still failing. Every time a disk backup set reaches its capacity, Retrospect stops or crashes. When will Roxio fix this?"

Do you have the catalog file on the same volume? That is not recommended. We run groom scripts every weekend, to make sure the backup set never reaches maximum capacity.

How would we, your fellow users, know what Roxio will do?
Fulco Posted February 10, 2011

I follow KB Article # 9629 to the letter:
(1) C drive has 50% free space (>100 GB)
(2) Catalog files are stored on D, which also has >50% free space (100 GB)
(3) Disk-based backup sets (.rdb files) are stored on a separate hard disk, which always has 10% free disk space. All disks are regularly defragmented, but only when Retrospect is NOT running
(5) 2 snapshots used
(7) Rebuild takes hours! This must be done manually. When grooming fails, Retrospect hangs: waiting for new space (backup set)
(8) Server has 32 GB memory
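The free-space figures in this checklist can be verified automatically before a groom runs. Below is a minimal Python sketch using the standard library's `shutil.disk_usage`. The thresholds are the figures Fulco lists above (50% free on C and D, 10% free on the backup disk), not values confirmed against the KB article, and the drive letter `E:\` for the backup disk is hypothetical because the thread never names it.

```python
import shutil


def free_fraction(path):
    """Fraction of the volume holding `path` that is currently free (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def volumes_below_threshold(min_free_by_path, usage_fn=free_fraction):
    """Return the paths whose free fraction is below the required minimum."""
    return [
        path
        for path, min_free in min_free_by_path.items()
        if usage_fn(path) < min_free
    ]


# Thresholds as stated in the post above; "E:\\" is a hypothetical backup disk.
THRESHOLDS = {"C:\\": 0.50, "D:\\": 0.50, "E:\\": 0.10}
```

Calling `volumes_below_threshold(THRESHOLDS)` on the backup server would return the volumes that currently fall short. The `usage_fn` parameter exists only so the check can be exercised without touching real disks.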
Lennart_T Posted February 11, 2011

Fulco wrote: "I follow KB Article # 9629 to the letter:"

Well, tip 4 says: "If you want to make sure the disk never fills, create a grooming script (Automate>Manage Scripts>New) to run once a week."

Since you are having problems when the disk fills, I would schedule a groom script once a week.
Fulco Posted February 11, 2011

I will give it a try. However, some of the Backup Sets reside on disks that have 50% free space (900 GB). These sets also suffer from the same issue! And the article also says: don't groom too often.