
Suddenly getting tons of "miscompare at data offset" errors



Last night I used my October 2005 PowerBook to back up about 60 GB of data on a Windows XP machine over an Ethernet connection. I placed the backup on a 500 GB LaCie Big Disk external drive over a FireWire 800 connection. I got 176 "miscompare at data offset" errors. I've never encountered this problem before. In fact, I backed up this same 60 GB of data a few days ago to a 320 GB Hammer external drive using a USB 2.0 connection and had no errors at all. Can someone tell me what's going on here and whether these errors mean my backup is useless for those files where there was a miscompare error?

 

I also reviewed my Retrospect log files and saw that I've experienced an inordinate number of these "miscompare" errors when using the LaCie drive to back up my PowerBook as well. So the problem doesn't seem to be limited to backups of the Windows machine.

 

Thanks for the assistance.

 

Robert


Miscompare at data offset... errors are always a concern, because they mean that the source volume and destination backup set are not identical. While they can be benign (such as when a file on the source volume is modified by a running application between the time of backup and compare), given your symptoms, it sounds like trouble with the backup set.

 

I'd suspect a problem with the LaCie drive, or possibly with your FireWire adapter or cable. Try running a disk utility such as DiskWarrior, though a clean bill of health doesn't necessarily exonerate the drive. (Retrospect stresses the drive more than a lot of other software and will therefore sometimes flag drive problems earlier.) We had problems with Retrospect and our LaCie 500 drive that LaCie blamed on Retrospect, until a couple of weeks later when the drive spontaneously began reporting itself as having two volumes, one 250 GB and the other 2 TB.

 

As for FireWire, the cable is easy to check by swapping, though it's probably the least likely source of trouble. If you have access to another computer, you could try backing up using its FW port as a point of comparison. Also, some users were reporting FW issues with Mac OS X 10.4.7. If you recently updated, you might want to try reverting the system to an earlier version to see if that makes a difference.


Quote:

Miscompare at data offset... errors are always a concern, ...

 


 

Twickland, I respectfully and vigorously disagree. If, for example, you are backing up a live server, it can simply indicate that data is changing while the backup is going on. The same can happen if you are backing up a networked client while it is being used. And, because Unix (Mac OS X) is a modern operating system, there are always things going on in the background (syslogd, etc.) that cause log files, etc., to change. This is normal and does not necessarily indicate any hardware problems at all. The only thing that the error message means is that the file backed up on the destination is not the same as the source file on the disk at the time of the subsequent verify pass.
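One rough way to judge whether a given miscompare is of this benign kind is to check whether the source file was modified after the backup started. The sketch below assumes you touched a marker file just before starting the backup; the function name changed_since and the marker file are made up for illustration, and Retrospect does not provide them. Modification times are only a hint, since some programs reset them.

```shell
#!/bin/sh
# Hypothetical helper, not part of Retrospect.
# changed_since MARKER FILE...
# Prints each FILE whose modification time is newer than MARKER's.
changed_since() {
    marker=$1
    shift
    for f in "$@"; do
        # find prints the path only if it is newer than the marker;
        # -prune keeps find from descending if a directory is passed.
        if [ -n "$(find "$f" -prune -newer "$marker" 2>/dev/null)" ]; then
            echo "$f"
        fi
    done
}
```

For example, after "touch /tmp/backup-started" and a backup run, calling changed_since /tmp/backup-started on the files named in the log would list the ones that changed during the run. A static file that miscompares and is not listed deserves a closer look.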

 

With the present design of Retrospect Mac (backup pass followed by verify comparison pass), it takes judgment to look at each of the log error entries to decide whether there is a problem. An alternative would be to compute a checksum (MD5, whatever) for each backed-up file and then, rather than doing a verify comparison with the file on disk, recompute the checksum for the backed-up file as read from the backup medium and compare it with the checksum recorded when the backup was made. This is how, for example, BRU (Retrospect's competitor, from Tolis Group) does it, and I wish Retrospect handled things the same way, because it would give you confidence that the files were backed up correctly even if the source file on disk has changed in the interim. It even allows you to re-validate the integrity of the backup at any time in the future, even if the original source file is unavailable. I made this feature request a few times, about a year or so ago.
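The record-then-recheck idea can be sketched in shell. This is only an illustration under stated assumptions, not how BRU or Retrospect actually implements it: the function names record_checksums and verify_checksums and the manifest layout are invented here, and POSIX cksum (a CRC) stands in for MD5 because the MD5 command is named differently on different systems.

```shell
#!/bin/sh
# Hypothetical helpers; not the BRU or Retrospect implementation.

# record_checksums DIR MANIFEST
# Writes one "CRC SIZE PATH" line per regular file under DIR,
# with paths relative to DIR, sorted by path.
record_checksums() {
    ( cd "$1" && find . -type f -exec cksum {} + | sort -k 3 ) > "$2"
}

# verify_checksums DIR MANIFEST
# Re-reads every file under DIR, recomputes the checksums, and
# compares them with MANIFEST. Prints the differing lines and
# returns non-zero if anything does not match.
verify_checksums() {
    tmp=$(mktemp) || return 2
    record_checksums "$1" "$tmp"
    diff "$2" "$tmp"
    status=$?
    rm -f "$tmp"
    return $status
}
```

The manifest would be recorded from the source while it is being backed up and kept outside the directory being checked. Running verify_checksums against the backup copy later needs only the manifest, so it still works if the source has changed or is gone.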

 

Note that this does NOT solve the problem of inconsistent database backups on a live server (discussed extensively elsewhere in these forums), and still makes it necessary to shut some services down temporarily during the backup. That's a completely different issue.

 

Regards,

 

Russ


Quote:

With the present design of Retrospect Mac (backup pass followed by verify comparison pass), it takes judgment to look at each of the log error entries to decide whether there is a problem.

 


Russ:

 

Actually, by virtue of what you say above, I'm not so sure that we're in substantial disagreement. By "concern," I meant to say that one needs to pay attention to these error messages to determine whether they are significant, and not just blow them off.

 

Admittedly, I was speaking in shorthand in an attempt to get to what I perceived as macwino's issue, and so did not get into the kind of detail that you have helpfully provided here.


As far as my issue is concerned, the files for which there were miscompare errors are permanent files that do not ever change. So I guess I need to be concerned about the errors and I need to redo the backups and check out the variables affecting the FW 800 connection.

 

Thanks for the assistance.

 

Robert


OK guys, I did a lot of testing and the results aren't very encouraging.

 

First, I redid the 60 GB backup I first posted about to the external LaCie Big Disk using a USB 2.0 connection instead of FW 800. I got about 1/3 as many "miscompare" errors as before (62 v. 195), but this is still an unacceptable number. And again, these are not files that change; they are static. Also, the files that miscompare are different from one backup to the other. So this test ought to eliminate my FW 800 port and cable as the source of the problem.

 

Second, I confirmed that the backup of these same files to an external Hammer drive using USB 2.0 had no "miscompare" errors at all. So this suggests that perhaps the problem is with the LaCie Big Disk.

 

Third, to check this out I did a backup of a different set of files from the Windows machine to both the LaCie and Hammer drives, using USB 2.0 with both drives. With this different set of files, I got about 50-60 "miscompare" errors in both cases, all involving files that are static and do not change. So it seems that with this set of files it makes no difference which drive is being used. I still get what I consider to be an inordinate number of "miscompare" errors.

 

Needless to say, in these circumstances, Retrospect isn't providing me with a backup that I can rely upon. The only variables that I can think of that I've not checked are my Ethernet cables and my router. Might these be the source of the problem, or could it be something else?

 

Any thoughts?

 

Thanks, Robert


Hmmm. Have you tried swapping the power supply brick on the LaCie Big Disk? Is the brick hot?

A quick Google search shows that a number of people are having problems with the LaCie Big Disk.

web page

 

Now I (and perhaps you) know that lots of people like to get on the internet and rant, and not everything on the internet is factual (sorry to let you in on that secret).

 

But it just might be that the drive is marginal, and that it's not Retrospect's fault. Retrospect does tend to stress hardware pretty well. What happens if you do a big dd copy/compare to the drive?

 

i.e., if you are willing to trash the drive, put 10 GB of random data on it, then replicate, compare:

 

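# Write 10 GB of random data to the LaCie (1 MB blocks x 10k blocks)
dd if=/dev/random of=/LaCie/junk bs=1m count=10k

# Copy that file to a second file on the same drive
dd if=/LaCie/junk of=/LaCie/junk2 bs=1m

# Compare byte by byte; -l lists every differing byte, no output means a match
cmp -l /LaCie/junk /LaCie/junk2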

 

You can even get an insight into whether it's write errors or read errors by putting the scratch (junk) file on a known good drive, writing a single copy of it to the LaCie, and then doing repeated compares of the known good file against that one copy. If it passes for a while and then starts giving errors on successive read passes, well, it's probably a read problem.
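A minimal sketch of that repeated-compare loop follows. The function name repeat_compare and its arguments are made up for illustration. One caveat: the operating system may serve repeat reads from its file cache instead of the drive, so the file should be larger than RAM, or the drive should be unmounted and remounted between passes.

```shell
#!/bin/sh
# Hypothetical helper for the repeated-compare test.
# repeat_compare GOOD SUSPECT [PASSES]
# Compares the known good file with the copy on the suspect drive
# PASSES times (default 5) and reports each result.
# Returns non-zero if any pass failed.
repeat_compare() {
    good=$1
    suspect=$2
    passes=${3:-5}
    failures=0
    pass=1
    while [ "$pass" -le "$passes" ]; do
        # cmp -s is silent; its exit status says whether the files match
        if cmp -s "$good" "$suspect"; then
            echo "pass $pass: identical"
        else
            echo "pass $pass: MISCOMPARE"
            failures=$((failures + 1))
        fi
        pass=$((pass + 1))
    done
    echo "$failures of $passes passes failed"
    [ "$failures" -eq 0 ]
}
```

For example, repeat_compare /good/junk /LaCie/junk 20 runs twenty passes against the same written copy. Failures that start only after some clean passes point toward reads; a copy that fails from the very first pass more likely went bad when it was written.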

 

Let us know. If success, increase the count, lather, repeat, rinse. Sounds like there are issues and that twickland was right on the money. Sigh, sometimes I am too verbose. Been told that before.

 

Russ


I guess I wasn't as clear as I might have been in my last post. What I found was that for one set of backup data there was a substantial difference in the number of miscompare errors between the LaCie Big Disk and the Hammer drive. (About 200 v. 0.) This suggested to me that the LaCie drive might be at fault.

 

However, for another set of backup data it made no difference whether I used the LaCie Big Disk or the Hammer drive; both backups were riddled with a substantial and unacceptable number of miscompare errors. This suggested that the LaCie drive was not the problem, to me at least, and that something else was at fault. Dare I say Retrospect?

 

Thanks for the input.

 

Robert


Well, there's always the possibility of marginal memory. You can't really tell with anything other than an Xserve, because only the Xserve has ECC and can spot single-bit (correctable) and double-bit (uncorrectable) errors.

 

I've never seen miscompare errors by Retrospect on our Xserve (or, before its retirement, our old ASIP server) that weren't caused by files changing between backup and compare. I trust Retrospect's messages on this point. Note, however, that we've never done file/disk backup, only backup to tape, so it's not testing the same code that you are using.

 

Russ
