RP6 recovery from "Bad Backup Set" catalog problem


awnews

Will Retrospect Pro 6 automatically recover from a bad/corrupted catalog (at the next backup run) or do I have to repair it manually?

 

I recently had a WinXP Pro application popup report a problem writing to a Retrospect backup file (PC --> other PC --> USB 2.0 drive). I've reported this error before, and it seems to be mentioned on the MS website as a known issue with XP under heavy loads.

 

The OS Event Viewer reported the errors:

 

{Delayed Write Failed} Windows was unable to save all the data for the file \Device\LanmanRedirector. The data has been lost. This error may be caused by a failure of your computer hardware or network connection. Please try to save this file elsewhere.

 

Application popup: Windows - Delayed Write Failed : Windows was unable to save all the data for the file \\AWPC8\u$\Backups\AWPC15\AW15_to_UDrive.rbf. The data has been lost. This error may be caused by a failure of your computer hardware or network connection. Please try to save this file elsewhere.

 

while RP6 was blissfully unaware that anything had gone wrong, since it had already finished writing out the catalog.

 

+ Normal backup using AW15_to_UDrive at 4/22/2003 2:30 AM

To Backup Set AW15_to_UDrive...

 

- 4/22/2003 2:30:10 AM: Copying DRIVE_C (C:)

File "C:\Documents and Settings\andrew\Local Settings\Temp\Perflib_Perfdata_844.dat": can't read, error -1020 (sharing violation)

File "C:\Documents and Settings\andrew\Local Settings\Temp\ClipMate6Temp\18560.DAT": can't read, error -1020 (sharing violation)

File "C:\Documents and Settings\andrew\Local Settings\Temp\ClipMate6Temp\18560.IDX": can't read, error -1020 (sharing violation)

4/22/2003 2:41:55 AM: Snapshot stored, 57.2 MB

4/22/2003 2:42:06 AM: Comparing DRIVE_C (C:)

4/22/2003 2:45:10 AM: 3 execution errors

Completed: 2808 files, 96.4 MB, with 71% compression

Performance: 26.7 MB/minute (23.3 copy, 31.6 compare)

Duration: 00:14:59 (00:07:47 idle/loading/preparing)

 

So I ran a check (Recatalog) on the catalog and, sure enough, a "Bad Backup Set header" was reported.

 

+ Executing Recatalog at 4/22/2003 6:32 AM

To Backup Set AW15_to_UDrive...

Bad Backup Set header found (0x64685654 at 3,482,961)

4/22/2003 6:35:52 AM: 1 execution errors

Duration: 00:03:02

 

So I ran a repair on the catalog. BTW, the repair message is very unclear, since it again reports the "Bad Backup Set header" but doesn't say anywhere that it fixed it.

 

+ Executing Recatalog at 4/22/2003 6:36 AM

To Backup Set AW15_to_UDrive...

Bad Backup Set header found (0x64685654 at 3,482,961)

4/22/2003 6:50:04 AM: 1 execution errors

Completed: 64653 files, 3.4 GB

Performance: 258.7 MB/minute

Duration: 00:13:12

 

After this, I again ran a check, and RP6 didn't report any errors with the set.

 

+ Executing Recatalog at 4/22/2003 7:06 AM

To Backup Set AW15_to_UDrive...

4/22/2003 7:06:15 AM: Execution completed successfully

Duration: 00:00:05

Quit at 4/22/2003 7:06 AM

 

 

But I was wondering: if I had done nothing, would the next backup have run without errors (assuming the LanMan situation didn't happen again)? That is, would the next backup + catalog have been OK, or once a "Bad Backup Set header" occurs, does it damage the backup set to the point that it's unusable for all future backups until a Repair is run?


A bad backup set header indicates Retrospect encountered a missing or damaged file header, which contains information such as the file’s name and size.

 

If you get these errors during a catalog rebuild, you may also see an error: resync or resynchronizing (slow).

 

1) If you have an encrypted backup set and fail to give Retrospect the correct password prior to a catalog rebuild, you may get these errors immediately. Provide the correct password before the catalog rebuild.

 

2) If the error happens at the same file positions (see the numbers after the errors in the log) every time you rebuild, that is an indication of bad data on the media. If the data was written incorrectly to begin with, a catalog rebuild will never fix the problem. If you backed up with verification turned off, then you may see these errors during the catalog rebuild.

 

3) If the file positions change with each rebuild attempt, then follow the troubleshooting steps outlined below and attempt the catalog rebuild again.

 

4) Try the catalog rebuild with the media in a different drive. If the error still happens when rebuilding the media in different drives, then you may have a damaged tape.
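
The position check described in items 2 and 3 can be sketched in code. The Python below is only an illustration and is not part of Retrospect: it assumes each rebuild log is available as plain text, and that error lines look like the "Bad Backup Set header found (0x64685654 at 3,482,961)" line quoted earlier in this thread. The names `error_positions` and `classify` are hypothetical.

```python
import re
from typing import List, Set

# Assumed log format, taken from the log excerpts in this thread:
#   Bad Backup Set header found (0x64685654 at 3,482,961)
HEADER_ERROR = re.compile(
    r"Bad Backup Set header found \((0x[0-9A-Fa-f]+) at ([\d,]+)\)"
)


def error_positions(log_text: str) -> Set[int]:
    """Return the set of media positions reported in one rebuild log."""
    return {
        int(match.group(2).replace(",", ""))
        for match in HEADER_ERROR.finditer(log_text)
    }


def classify(runs: List[str]) -> str:
    """Compare error positions across several rebuild logs (one string per run)."""
    failing = [pos for pos in (error_positions(run) for run in runs) if pos]
    if not failing:
        return "no header errors"
    if len(failing) < 2:
        return "need at least two failing runs to compare"
    if all(pos == failing[0] for pos in failing):
        return "same positions: suspect bad data on the media"
    return "positions differ: suspect device, cable, or driver"
```

Passing the text of two or more rebuild logs to `classify` gives a rough hint about which of the two cases applies. It only compares the numbers in the log and does not inspect the media itself.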

 

Troubleshooting:

 

1) Try a new tape or disk

2) Try a different brand of media

3) Try isolating the device when possible. If it is SCSI, try making it the only external SCSI device.

4) If it is a SCSI device under Mac OS X, make sure it is using a supported SCSI card.

5) Look for updated adapter drivers and firmware: USB, FireWire, SCSI or ATAPI vendors may offer newer firmware or drivers which may make for better device communication.

6) Try a new cable, especially for a SCSI device. Internal ATAPI cables could become crimped or damaged. Inspect them if necessary.

7) Try making your ATAPI device "primary" rather than a "slave".

8) If you are using USB, try connecting the device directly to the computer rather than to a USB hub.

9) Make sure the files reporting the error are not in use. In rare cases, an open file will generate this error at the same media position number each time.

10) Try the device on another computer.

 

If you do all of the above, and the error follows the drive to another computer, that could be an indication of a problem with the device. Check with the hardware vendor to see if they can offer any additional suggestions.

 

This error usually happens during copy/compare because the data on the backup media is not identical to the data on the source hard disk. Basically, the data on the backup media is corrupted and cannot be restored. Try the troubleshooting steps listed above to resolve it.

 

You do not need to rebuild the backup set catalog when you see these errors. You should, however, attempt to back up the data again. If the problems persist, follow the troubleshooting steps above.

 


Amy,

 

You're heading off into the weeds. The problem I encountered ("Application popup: Windows - Delayed Write Failed" and LanmanRedirector) is a known problem with Windows XP SP1. Several notes in the Microsoft knowledge base mention it, e.g.:

 

http://support.microsoft.com/default.aspx?scid=kb;en-us;812937

 

This problem occurs when the redirector flushes the contents of the file, and writes to a file handle with read-only access instead of to a file handle with write access. When the redirector received an opportunistic lock break to none, it purged the cache for the file, but did not uninitialize the cache for the file. The redirector also needed to purge and uninitialize when the set end of file occurs because the opportunistic lock break is asynchronous. Because it did not uninitialize the cache for the file, it wrote to the incorrect file handle.

 

and something related is a known issue in Windows 2000

 

http://support.microsoft.com/default.aspx?scid=kb;EN-US;296264

 

 

So Retrospect is not the cause of the problem, but neither is bad media, a bad source file, a bad network, need for new drivers, etc. Perhaps Retrospect could handle something a bit better (e.g. double checking that the file is properly closed and not corrupt instead of believing the last "got it" from the OS), but it sounds like MS has a possible fix and is working on it for a future service pack. Note that all "Write cache enabled" options (under Windows Device Manager/Disk Drives) have been disabled on all drives on all PCs.
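
To illustrate the kind of "double checking" meant here, below is a minimal Python sketch of a write-then-verify step. It is not how Retrospect works internally, and the function names `sha256_of_file` and `write_and_verify` are made up for the example. It writes the data, asks the OS to flush it, then reopens the file and compares a checksum against the original bytes.

```python
import hashlib
import os


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file's contents by reading it back in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_and_verify(path: str, data: bytes) -> bool:
    """Write data, flush it, then re-read the file and compare checksums."""
    expected = hashlib.sha256(data).hexdigest()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to commit the data to storage
    # Reopen the file rather than trusting the successful write call.
    return sha256_of_file(path) == expected
```

One caveat: the read-back may be served from a cache rather than from the remote disk, so a matching checksum makes silent loss less likely but is not a guarantee, especially over a network share.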

 

*Today's* question is whether, after the "Bad Backup Set header" error occurs, Retrospect will be able to produce a good incremental backup to that same backup set (e.g. if the scheduled backup is run nightly), or whether the error is permanent in a way that keeps the backup set from being used. Your final statement "You do not need to rebuild the backup set catalog when you see these errors. You should, however, attempt to back up the data again" is vague enough that it doesn't answer this specific question. Suppose I leave the backup set alone. It's a given that that backup session is hosed and can't be used for restore. But:

 

1) are all the previous backups in the set OK for restore?

 

2) and will the next night's incremental backup be OK for backup and restore (assuming the same "Delayed Write Failed" doesn't occur)?


A bad backup set header indicates Retrospect encountered a missing or damaged file header, which contains information such as the file’s name and size. So no, you should not rely on that data being restorable.

 

If the next night's backup does not produce bad backup set headers, you should be in good shape. However, to be on the safe side, verify the media through Tools > Verify. If any members have bad backup set headers, you may want to consider marking them as missing and allowing Retrospect to back up that data again.


  • 3 weeks later...

I've noticed that RP6.5 is much better at reporting the catalog errors (i.e. the "Delayed Write Failed" and "Bad Backup Set header found" messages) I described with 6.0. They still sometimes occur with my USB drive (I need to fiddle with the MS Registry settings, and hopefully MS will address this in a future SP), but at least RP6.5 reports the error; 6.0 didn't flag it, and the catalog could be corrupted with no warning from RP. With RP6.5 I know right away that something is wrong and can address it immediately.

 


Archived

This topic is now archived and is closed to further replies.
