twickland

Serious bug causes data to be written to the wrong tape

With Retrospect 8.2, I noted a serious bug where, when a tape was ejected from an HP LTO tape drive using the drive's eject button, Retrospect Engine would sometimes not register that the drive was empty, and would continue to believe that the ejected tape member was still in the drive. Then, when a different tape member was inserted, Retrospect Engine would not register the header information for the new tape and would continue to believe that the previously-ejected tape member was still in the drive.

 

I'm sorry to report that the problem still exists in Retro 10.2, even with a different tape drive and connection type (the current drive is an HP StorageWorks Ultrium 3000 SAS connected via an ATTO ExpressSAS H644 HBA).

 

When this bug occurs, Retrospect will write the actual data to one media set while recording the catalog metadata to a different media set's catalog. Not good.

 

Since I first experienced this bug, I have been careful to always check that Retrospect is displaying the correct member in the drive whenever I swap tapes. In Retrospect 9 and 10 I had not noticed a recurrence of this bug until yesterday when, wouldn't you know it, I failed to confirm the tape member and Retrospect performed several proactive backups to the wrong tape. Luckily, I noticed the error this morning: the regularly scheduled backup had failed to run last night because Retrospect thought that the previous tape was still in the drive. Because I caught it early, the repair and rebuild of the two catalogs won't be too onerous.

 

 

Very sorry that this bug is still around. We've tried repeatedly to reproduce it ourselves but no luck. I know we've suggested in the past that it could be a device miscommunication issue, which would explain why it's so rare, but Retrospect should still recognize the second tape. We'll continue to look into it.

We tried to reproduce this issue again, using different hardware and dozens of tests, but we still don't see it. If you have the time, it would be great if you could increase the devices logging level to include SCSI logging using "SetDevicesLogging=6" in the retro.ini file at /Library/Application Support/Retrospect. You'll need to restart the engine for the log change to take effect. We'll be able to see how the device is responding and why Retrospect misses the eject information.

The full error, where a tape is ejected, Retrospect fails to register that the drive is empty, and then fails to register the header of the new tape, has been quite rare in my experience too. It's probably happened only 3 or 4 times in the past three years, beginning with Retro 8.2 (the first post-6.1 version I used). This would be roughly once per 75–100 tape ejections. 

 

There have been a few more times where Retrospect didn't register that the drive was empty but did register the new tape correctly. At the time, I attributed this effect to the typical delays in communication between the engine and console.

 

I've never been able to reproduce either the partial or full error by simply inserting and ejecting a tape (which, of course, never results in the tape's advancing beyond the beginning). I can't say whether this null result has been due to the rarity of the error, or because the error requires that the tape be rewound from a fairly advanced position for it to occur.

 

I've increased the logging level, but will need to see whether the log becomes so cluttered with device command information that it prevents me from viewing the data I really want to see. By the way, is the sense data below typical of what's to be expected?

 

  SCSI 0:0:0: 9:29:38 AM Command <00 00 00 00 00 00>

  SCSI 0:0:0: (<x> 0 passed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 07 00 80 00 00

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 07 00 80 1b 29

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 07 00 80 1c b9

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 07 00 80 1e 49

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 07 00 80 1f d9

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 07 00 80 21 69

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 07 00 80 22 f9

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 07 00 80 24 89

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 07 00 80 26 19

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 07 00 80 27 a9

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 07 00 80 c1 8f

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 07 00 80 c3 1f

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 07 00 80 c4 af

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 3a 00 00 00 94 50

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 01 00 00 94 46

  SCSI 0:0:0: (<x> 0 failed more times)

  Sense > 70 00 06 00 00 00 00 10 00 00 00 00 28 00 00 00 94 09

  SCSI 0:0:0: (<x> 0 failed more times)

  (but this time it passed)
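
For anyone trying to read a log like the one above, here is a minimal Python sketch of how the sense lines can be decoded. It assumes they are standard SCSI fixed-format sense data (response code 70h), where the low four bits of byte 2 are the sense key, bytes 12 and 13 are the additional sense code and qualifier (ASC/ASCQ), and, for a NOT READY key with the SKSV bit set in byte 15, bytes 16 and 17 are a progress indication out of 65536. The name `decode_sense` and the lookup tables are illustrative only and are not part of Retrospect; the tables list just the codes that appear in this log.

```python
# Sense keys and ASC/ASCQ pairs that appear in the log above (not a full table).
SENSE_KEYS = {0x02: "NOT READY", 0x06: "UNIT ATTENTION"}
ASC_ASCQ = {
    (0x04, 0x01): "Logical unit is in process of becoming ready",
    (0x04, 0x07): "Logical unit not ready, operation in progress",
    (0x28, 0x00): "Not ready to ready change, medium may have changed",
    (0x3A, 0x00): "Medium not present",
}


def decode_sense(line):
    """Decode one 'Sense > ...' log line, assuming fixed-format sense data."""
    hex_part = line.split(">", 1)[-1]           # drop the "Sense >" prefix if present
    data = bytes.fromhex("".join(hex_part.split()))
    if len(data) < 14 or (data[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")

    key = data[2] & 0x0F
    asc, ascq = data[12], data[13]

    # Progress indication: only meaningful here for NOT READY with SKSV set.
    progress = None
    if key == 0x02 and len(data) >= 18 and data[15] & 0x80:
        progress = int.from_bytes(data[16:18], "big") / 65536

    return {
        "sense_key": SENSE_KEYS.get(key, "key 0x%02x" % key),
        "asc": asc,
        "ascq": ascq,
        "meaning": ASC_ASCQ.get((asc, ascq), "unknown (not in this sketch's table)"),
        "progress": progress,
    }


if __name__ == "__main__":
    samples = [
        "Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 07 00 80 1b 29",
        "Sense > 70 00 02 00 00 00 00 10 00 00 00 00 3a 00 00 00 94 50",
        "Sense > 70 00 02 00 00 00 00 10 00 00 00 00 04 01 00 00 94 46",
        "Sense > 70 00 06 00 00 00 00 10 00 00 00 00 28 00 00 00 94 09",
    ]
    for s in samples:
        print(decode_sense(s))
```

Read this way, the command `00 00 00 00 00 00` is TEST UNIT READY, and the sequence in the log would be: the drive reports an operation in progress (with the progress value climbing), then no medium present, then a unit that is becoming ready, then a unit attention indicating the medium may have changed, after which the command passes. That ordering is consistent with a tape being unloaded and a new one loaded, though this sketch says nothing about how Retrospect itself interprets these responses.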
