robvil

Grooming and catalog files - there's a bug


Hi,

 

I am 100% sure there's a bug in Retrospect, because each and every time I groom a backup set I get corrupted catalog files. I have tried every possible combination: grooming only one backup set, grooming many backup sets, grooming small backup sets, grooming large backup sets, grooming when nothing else is running, placing the catalog files on different volumes than the backup sets, etc.

 

Of course a catalog file recreate fixes the problem, but then you have to check all your scripted backup jobs to make sure the recreated backup set is added again.

 

This has been going on through the last "many" releases. I like the idea of paying for one year of software updates and support, which I do, but it seems there have only been a few updates, mostly containing added support for new hardware. Do we ever get a real software update besides new hardware support?

 

Furthermore, support is very slow to respond, and I have never gotten an answer to the above problem.

 

Regards

Robert Vilhelmsen


Can you provide us with some more specific details?

 

Have you followed the recommendations and guidelines outlined in this KB document:

 

http://kb.dantz.com/article.asp?article=9629&p=2

 

How much free space is on your C drive?

 

What operating system is this?

 

How big is the backup set? How many sessions? How many files?

 

What grooming option have you specifically selected?

 

Do you groom when the volume fills or do you groom on a schedule?

 

How often do you groom?

 

Is your backup set contained on one disk or does it span to multiple members?

 

What errors are reported when you try to groom?

 

Any errors during the catalog rebuild?

 

Is grooming perfect? No it isn't. We know this and we will make changes to improve stability, but for the vast majority of users grooming works well.


I have been over this with Andrew Anderson some time back and he could not find anything wrong with my setup.

 

 

Can you provide us with some more specific details?

Grooming Backup Set Backup01-Server failed, error -2241 (Catalog File invalid/damaged)

 

Have you followed the recommendations and guidelines outlined in this KB document:

I will read it later today.

 

http://kb.dantz.com/article.asp?article=9629&p=2

 

How much free space is on your C drive?

All volumes have between 60 and 500 GB free. Plenty.

 

What operating system is this?

Windows 2003 SP2 32bit std.

 

How big is the backup set? How many sessions? How many files?

Here's one backup set: 53 GB, 146 sessions, 27,745 files. Another: 549 GB, 190 sessions, 151,270 files.

 

What grooming option have you specifically selected?

Groom to remove backups older than X (all between 5 and 10 days)

 

Do you groom when the volume fills or do you groom on a schedule?

I groom on a schedule. The volume has never filled up.

 

How often do you groom?

I have tried different values. Now it's every 14 days.

 

Is your backup set contained on one disk or does it span to multiple members?

Only one disk.

 

What errors are reported when you try to groom?

mostly Grooming Backup Set Backup01-Server failed, error -2241 (Catalog File invalid/damaged)

 

Any errors during the catalog rebuild?

A rebuild never works. Only a recreate works.

 

Is grooming perfect? No it isn't. We know this and we will make changes to improve stability, but for the vast majority of users grooming works well.

 

But if grooming does not work 100% stably, forever-incremental backups are worth nothing. And how can we trust that the data is really available if the backup set's catalog file gets corrupted?

 

The last time I got this, I was not able to recreate the catalog file, as it said: Device trouble: "1-DMZ-Databaservere", error -105 (unexpected end of data) during the recreate. I had to delete the backup set and catalog file and create them from scratch.


Quote:


 

The last time I got this, I was not able to recreate the catalog file, as it said: Device trouble: "1-DMZ-Databaservere", error -105 (unexpected end of data) during the recreate. I had to delete the backup set and catalog file and create them from scratch.

 


 

This is the key to all your problems.

 

Grooming itself cannot cause a -105 error. That is caused by a problem with the backup storage device.

 

Also, a failed groom cannot cause a catalog rebuild to fail.

 

If a catalog rebuild is failing (Tools > Repair > Recreate from disks > All Disks), it isn't because grooming failed; it is usually because of a disk issue or a corrupt RDB file.

 

Grooming uses temp files and does not modify RDB data files until after that individual data file has been groomed successfully.
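For readers unfamiliar with the pattern being described here: writing the groomed copy to a temporary file and only replacing the original once the write succeeds is a standard safe-update technique. A minimal sketch in Python; the function and file names are purely illustrative, not Retrospect internals:

```python
import os
import tempfile

def groom_data_file(path, groom):
    """Rewrite `path` via a temp file. The original is only
    replaced after the groomed version is written successfully."""
    dir_name = os.path.dirname(os.path.abspath(path))
    with open(path, "rb") as src:
        data = src.read()
    groomed = groom(data)  # may raise; the original stays untouched
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(groomed)
        os.replace(tmp_path, path)  # atomic swap on the same volume
    except BaseException:
        os.remove(tmp_path)  # clean up the partial temp file
        raise
```

If grooming fails at any point, the original data file is left exactly as it was, which is why a failed groom by itself should not corrupt anything on disk.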

 

If you can't do a successful catalog rebuild, then grooming also won't be successful.

 

If you can solve the -105 errors/catalog rebuild failures then you can probably fix your grooming issues.

 

I would suggest a long format of the backup storage device. Try a different cable connecting the storage device to your computer. Maybe even try a different backup drive. If you have already changed the backup drive, you may want to try a different backup computer with your storage device.


Hmm, and what if I say my backup device is a RAID 5 volume? That's where the problem is.

 

It's always when I transfer the backup sets to tape that Retrospect complains about error -2241 (Catalog File invalid/damaged). That's where I noticed it.

 

Robert


A backup set transfer will not usually report the -2241 error. We usually see that on a groom failure. Can you copy and paste the log entry for the failed backup set transfer?

 

How is the RAID connected to the computer? USB/iSCSI/Fiber channel/NAS/Firewire?


A backup set transfer will not usually report the -2241 error. We usually see that on a groom failure. Can you copy and paste the log entry for the failed backup set transfer?

 

Sorry, my mistake. It is of course after grooming a backup set that the above error appears.

 

How is the RAID connected to the computer? USB/iSCSI/Fiber channel/NAS/Firewire?

Internal in the server: four WD 5000YS SATA disks connected to a Promise FastTrak SX4.


For the Promise FastTrak SX4, ensure that the disk caching feature is turned OFF (it's faster with it off too, funnily enough); otherwise data corruption does occur, which would explain your symptoms. This corruption is unrelated to Retrospect. We ordered 5 identical HP servers, all with this RAID controller, and kept getting filesystem corruption, missing registry hives, and all manner of grief until we turned the disk cache OFF using the Promise management GUI and drilling down to the appropriate option. No problems since, for many years; 3 of these servers are still running fine (unfortunately just using ntbackup). :rollie:


Hi

 

Disk caching is OFF and has been for a very long time.

 

I have found the reason for the problem. My catalog files are placed on a separate volume, and this is how I avoid the problem:

 

1. On the same day, before the grooming runs, I run a defragmentation on the catalog file volume.

2. After grooming is done, I defragment the volume again.

 

If I follow this procedure I do not get corrupted catalog files.
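This pre- and post-groom defragmentation could be automated with a small wrapper; a hedged sketch, assuming Windows' built-in defrag.exe and a hypothetical callable that triggers the scripted groom (the volume letter is also just an example):

```python
import subprocess

CATALOG_VOLUME = "E:"  # hypothetical: the volume holding the catalog files

def defrag_around_groom(run_groom, volume=CATALOG_VOLUME, runner=subprocess.run):
    """Defragment the catalog volume before and after grooming,
    mirroring the manual procedure above. `runner` is injectable
    so the commands can be inspected without touching a real disk."""
    cmd = ["defrag", volume]   # Windows built-in defrag.exe
    runner(cmd, check=True)    # 1. defrag before grooming runs
    run_groom()                # 2. run the (scripted) groom
    runner(cmd, check=True)    # 3. defrag the volume again afterwards
```

Injecting the runner also makes the order of operations easy to verify: defrag, then groom, then defrag again.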

 

Then my question is:

Is this related to a poor RAID controller not handling fragmented volumes well, or to Retrospect handling fragmented volumes badly?

 

Robert

 


Hi Robert,

 

If the disk is severely fragmented, it's possible that it can cause problems; however, Retrospect is not causing the fragmentation.

 

 


Despite the fact that this thread is 5 years old, the Multi Server v8.1 for Windows that I have still gets the catalog corruption after grooming in exactly the same manner as described above. My catalog files are not on the backup set volume, and a Windows defrag is scheduled weekly on the C drive where the catalogs sit, so I do not think fragmentation is the issue. Still, catalog files are hosed for backup sets with amazing regularity, especially when the backup sets get to the ~300 GB size range or larger.

In fact, it is not so much the size of the backup set that matters as the size of the catalog file itself. The sets I have the most trouble with are also the ones with the largest catalog files; the largest one is 1 GB+ in size. Even small amounts of backed-up data (e.g. 45 GB) give trouble if the file count is so high (I have a bazillion 500-byte data files) that the catalog file easily reaches the 400 MB or larger range.

This behaviour almost suggests that Retrospect is running out of memory, or just not allocating itself enough of it, to write large catalog files that are valid after grooming. The catalog files often still point to RDB files that have been removed in the grooming, so subsequent attempts to groom or even verify fail miserably, and then the catalog file has to be rebuilt yet again. This is an issue not only with the Windows version but with the Mac version as well.


We had this same problem and finally, in frustration, we stopped grooming altogether. Instead I created a dummy backup set for temporary storage, and then created a script to copy the last backup to the temp storage, erase the original backup set, copy the data back to the original backup set, and then erase the temp storage. This has worked flawlessly now for over 4 months.
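The rotation described above can be sketched as plain directory operations in Python. This only illustrates the order of steps, not how Retrospect transfer scripts actually work (in Retrospect, the first copy transfers only the most recent backup, which is what discards the old sessions), and all paths here are hypothetical:

```python
import shutil
from pathlib import Path

def rotate_backup_set(original: Path, temp: Path) -> None:
    """Copy the backup set to temp storage, erase the original,
    copy the data back, then erase the temp copy."""
    shutil.copytree(original, temp)   # 1. copy last backup to the temp set
    shutil.rmtree(original)           # 2. erase the original backup set
    shutil.copytree(temp, original)   # 3. copy the data back to the original
    shutil.rmtree(temp)               # 4. erase the temp storage
```

The net effect is a freshly written backup set, which sidesteps grooming entirely at the cost of copying the data twice.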


I see groom failures constantly on my backup server, and my catalog files are on a single, non-RAIDed SSD. My backup sets are on a ReadyNAS PRO 6TB device, which has zero issues sharing files to a fair number of staff over the same SMB protocol.


Quote:

Despite the fact that this thread is 5 years old, the multiserver v 8.1 for Windows that I have still gets the catalog corruption after grooming in exactly the same manner as described above. [...]

 

You are not alone with this.

It's amazingly frustrating.

...and you seem to be on to something with the memory issue.

We can only hope, I guess, that they update the code to reflect the latest build software....but hey, they're still making money with the old shtuff! ;)

 


I have to say that I don't think grooming works either. I've tried the various voodoo that has been recommended (new disk drives, controllers, cables, magic pixie dust) and nothing worked consistently. I've switched my backup policy to recycle backups every three weeks -- which means that we have to back up our entire server every week (three backup sets across three weeks), but it has worked reliably.

