Backup fails during media change with error 515


I'm running Retrospect Desktop version 6.0.193 on a G4 tower, attempting to back up an iBook client running version 6.0.108. Both machines are running MacOS version 10.3.4. The backup set is being written to DVD+RW discs using a Pioneer DVR-107D burner. The computers are connected by Ethernet through a switch.

 

The backup runs fine until the first disc fills up and Retrospect prompts for the second. It successfully erases the new disc, and as soon as it tries to talk to the client again, the backup terminates with the following error:

 

Trouble reading files, error 515 (Piton protocol violation)

 

I have attempted this backup 3 times, and every time it has failed exactly at the media change. It therefore seems very unlikely that this is triggered by a real network problem. More likely this is a Retrospect bug, possibly involving the server failing to correctly save or resume the state of the piton protocol when a media change is necessary.

 

Looking back in the archives, it looks like this same problem has been reported many times before, with every Mac OS X version of Retrospect and several versions of the operating system. It looks as if a solution has never been reached. Any chance of a resolution this time around?


 

Something I meant to add the first time: I have completed many, many backups to a file on a hard drive using the same network configuration. In fact, a recycle backup to a file has just completed without any trouble. The piton protocol violation only occurs with the DVD+RW media change.


  • 3 weeks later...

We've been receiving the dreaded 515 Piton error as well, using a Quantum DLT autoloader. Is anyone at Dantz (or anyone at all) willing to shed any light on this thing for us? We tried eliminating any network complexities; it sure seems like a Retrospect bug to us. Some help on this would certainly be appreciated.


Hi

 

If you are backing up PC clients, make sure they are running the 6.0 version of the Retrospect client. There is a known issue where this will happen with the 6.5 client.

 

Are you backing up Mac clients or PC? Do you have the latest Driver update for Mac installed?

 

Thanks

Nate


We have no PC clients. Just one Mac OS X server and a couple of Linux clients, using the latest drivers for each of these. BTW, we haven't seen the error for over a week, but I would like to know the cause in case it happens again (and I'm sure it will).


For my part, as the originator of this thread, I have now had a chance to install the latest Macintosh RDU, version 5.6.102. The exact same problem is still occurring: a piton protocol violation that coincides exactly with a media change, every time. Client and server are both Macs in my case, as before. Both server and client are now running MacOS 10.3.5.


  • 2 weeks later...

The Piton error shows up only for a Mac OS X (10.2.8) server client volume. The Linux clients back up fine. This also coincides with a new media backup. Rebooting the client machine and re-running the backup returns the same error. All of the clients are in the same network segment, running off of the same switch. From everything I've seen, the problem is more likely due to something client-side than anything having to do with the network. I'm finding this issue _very_ frustrating. Suggestions anyone? Dantz? This is a recently installed retro 6 client and application. New media backups seem to trigger this error. -pw


I'm not sure whether you were responding to me or not, but we have uninstalled and reinstalled the client. It's pretty clear to us that, no matter what Dantz says, this piton 515 error is not due to any network issues here. It is almost certainly client-related. And last night the client (Mac OS X Server 10.2.8) just seemed to die on its own - it wasn't visible to Retrospect at backup time. The only solution is to kill the process and restart it. We're also fairly certain that there is nothing we can do or change that will provide a permanent fix for this; we're going to have to wait for Dantz to step forward to acknowledge the problem and do something about it. In the meantime, we're going to see whether we can get at least one decent backup from this client this week! Needless to say, our confidence in Dantz and Retrospect is really taking a nosedive...


I have also tried all of the standard reinstallation and preference-purging tricks. And I agree with pweil as far as frustration level. This is very clearly a bug, and searching through past discussions on these boards makes it clear that this bug has been there for a very, very long time. Is there any point at which persistent problems like this are brought to the attention of the developers of Retrospect for OS X? The conditions under which this bug is triggered are crystal-clear: it ought not be very hard to track down.


We had another backup failure last night. The OS X Server client in question actually got about 3/4 of the way through the backup before the dreaded 515 piton error killed the session. It completed about 48 of 57 GB, but then the session ended incomplete with the message:

 

"trouble reading files: error 515 (Piton protcol violation)"

 

We have been unable to back up this file server since last week! No config or system-related changes have been made to the client since then.

 

I've just turned pitond logging on to level 9. I may try another backup this morning and will post the logging results. In the meantime, I am open to more suggestions on how to solve this. I'll bet my bottom dollar that this is a client-related issue, and not a networking issue per se. Wisdom, anyone?

 

-pw


Quote:

I'll bet my bottom dollar that this is a client-related issue, and not a networking issue per se

 


 

Are you a gambler?

 

It's impossible to test whether Duke will win the NCAA, but it _is_ possible to run at least some tests against the possibility of this being network hardware related.

 

- Did this failure happen across the media change?

 

You say that you "tried eliminating any network complexities," which immediately makes me suspect that a network complexity is causing the problem (user reports are like that for me).

 

If you can take the client machine off your greater LAN, I'd connect the two machines directly with a different ethernet cable, set them each to addresses on the same subnet, forget and re-add the client, and test a backup that spans across the DVD.

 

If time is critical, you can reduce the testing time by first filling up most of the DVD with individual sessions of known size before going ahead with the larger test (which doesn't have to be the whole drive, but can be a sub-volume large enough to span; that will reduce the initial scan time).

 

I know you're hoping for Dantz to step in here and tell you that it's all their fault. But it's possible that it's not a Retrospect/Client bug (so how could they say that it is?). It's also possible that it _is_ a bug, but the Forum is not how they're going to hear about it. You should open a support incident with tech support. If it turns out to be a software defect they're real fair about refunding any fee you spend.

 

The cost of a tech support call must be less than your bottom dollar...

 

Dave


I'd rather not discuss the paid tech support issue here. You're entitled to your opinion.

 

As for our problem, this client is on the same segment as our other clients as well as the backup machine, all connected to the same 10/100 switch. There are no 'network complexities' here. The other clients exhibit no problems at all. Everything points to some kind of client issue.

 

What is happening is that this OS X Server client gets 50-55GB of data backed up each night before Retrospect bails with the 515 piton error. There is a total of approx. 200GB to back up to a DLT autoloader. It doesn't appear to stop on any particular file; it's the consistent amount of around 50GB that seems notable and puzzling.

 

An email from my assistant also adds the following notes:

 

"...current ruleset- (format is chain position, allow/deny ipd/udp, from

ip/mask to ip/mask).

65535 allow ip from any to any. So... the firewall isn't in the way at all....

What is happening is that the connection -is refused- by the client. I've already verified that by ssh'ing into .16 and telneting into retrospect's ports, and failing. ..."
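The manual telnet check described in that email can be scripted so the result is easy to log alongside the backup times. What follows is only a minimal sketch: the email says "retrospect's ports" without naming them, so the port number 497 is an assumption on my part, and `probe_tcp` is a hypothetical helper name, not anything shipped with Retrospect.

```python
import socket

# Assumption: 497 is the TCP port the Retrospect client listens on.
# The thread itself only says "retrospect's ports".
ASSUMED_RETROSPECT_PORT = 497


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try one TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all: packets dropped or host unreachable.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
```

Calling `probe_tcp(client_address, ASSUMED_RETROSPECT_PORT)` from another machine on the segment distinguishes "refused" (the client daemon is not listening, which matches what the email reports) from "timeout" (something in between is dropping traffic).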

 

-pw


pw,

 

This thread was started with the title "Backup fails during media change..." but your reports do not include that.

 

- Can you confirm that you're not seeing the connection fail during a Media Request?

 

- You earlier noted that it was coinciding with a new media backup; is the current failure to connect happening on a New Media execution?

 

There are two unix processes running during client backups: pitond (which runs all the time) and retropds.22 (which launches when the client begins to provide files to the application machine and quits when it's finished).

 

- If you watch the client's processes during a backup, does retropds.22 die at the time of failure?
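Watching the process list by hand for hours is tedious, so here is a rough sketch of a watcher that could be left running on the client during a backup. It assumes that `ps -axo pid,comm` is available on the client and that the process names begin with `pitond` and `retropds` as described above; the function names are made up for illustration.

```python
import subprocess
import time


def matching_processes(ps_output, prefixes=("pitond", "retropds")):
    """Return {pid: name} for ps lines whose command name starts with a prefix."""
    found = {}
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip the header and any malformed lines
        name = parts[1].strip().rsplit("/", 1)[-1]  # drop a leading path
        if name.startswith(tuple(prefixes)):
            found[int(parts[0])] = name
    return found


def watch(interval=5.0):
    """Print a timestamped line whenever a watched process starts or exits."""
    previous = {}
    while True:
        # Assumption: this ps invocation works on the client in question.
        out = subprocess.run(["ps", "-axo", "pid,comm"],
                             capture_output=True, text=True, check=True).stdout
        current = matching_processes(out)
        for pid in current.keys() - previous.keys():
            print(time.strftime("%H:%M:%S"), "started", pid, current[pid])
        for pid in previous.keys() - current.keys():
            print(time.strftime("%H:%M:%S"), "exited", pid, previous[pid])
        previous = current
        time.sleep(interval)
```

Run `watch()` in a terminal on the client before the backup starts. An "exited" line for retropds whose timestamp matches the 515 error in the Retrospect log would answer the question above.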


1. No - initially it seemed to coincide with a new media backup (not quite the same as 'media change' as described above). But it has occurred since in the ways I've already explained.

 

2. We don't get media requests because we use an autoloader. It (normally) already has access to all of the DLT tapes it needs.

 

3. It occurred during our last two New Media backups - late-July and late-August. But also during normal backups since the last New Media backup.

 

4. I see a retropds.23 running during the backup right now. I'll have to wait for it to fail before I can answer your question. But it just might make it all the way through tonight since it only has 10GB left to back up, as opposed to 200GB. I'll update this thread when the backup either gets completed or dies - shouldn't be long.

 



 

Okay, let me explain once more why I think my problem cannot possibly be a network problem. I back up this client regularly to a hard disk on my server using exactly the same network configuration and exactly the same selectors. The machine has never failed to back up to the hard disk file. The machine only fails to back up when I attempt to back up to DVD+RWs instead. That is, I am attempting to back up exactly the same files using the same network. The solitary difference between the two backup configurations is the target media. I have backed up this client to the hard drive file hundreds of times, and have never seen this problem. I only see the piton protocol violation when the backup needs to write to a second DVD+RW disc. If you have an explanation for how my network could work flawlessly backing up precisely the same data to a hard drive, but somehow manages to fail consistently if and only if the same data set spans more than one DVD, I'd really like to hear it.

 

Now, if it is really the case that Dantz no longer has a way to report bugs that does not require paying extra for technical support, they're about to lose a customer who's been using their products since version 1.0.


The backup completed tonight without any failures, so I couldn't check on how retropds.23 behaves during a failure. We'll probably have to wait until the next time we have 50+ gigs to back up on this client -- most likely the next New Media backup a month from now... unless we have time or a chance to do a test.


Quote:

If you have an explanation for how my network could work flawlessly backing up precisely the same data to a hard drive, but somehow manages to fail consistently if and only if the same data set spans more than one DVD, I'd really like to hear it.

 


 

If I had even the slightest understanding of the networking issues involved in this sort of communications I might venture a theory. But I don't.

 

But what I do have is a grasp of basic testing techniques. And there are some easy tests that could be done that _might_ allow you to learn more about the cause.

 

If you were to connect the Retrospect machine to the client machine using only a single ethernet cable, and the backup spanned to the second DVD with no error, you would have proven that there _is_ a network issue.

 

Now, if the test does cause the error you won't have proven that it's _not_ a network issue, as there are other network hardware components involved (such as the physical interface on each machine). You would have to test each of these, one at a time, before you could rule them out as being involved. Remember, you're the one who prefers to do this alone and on the cheap.

 

You can't prove the existence of a software defect without doing thorough testing; and you can't expect Dantz to take your online report as proof of a bug without documenting your tests, or by coming up with steps to reproduce that fail on a clean test bed.

 

Remember, unless Dantz already knows that this is a bug, the only way they're going to know whether it is or not is if they can reproduce it.

 

>if it is really the case that Dantz no longer has a way to report bugs that does

>not require paying extra for technical support, they're about to lose a customer

>whose been using their products since version 1.0.

 

Well, you could use the web form at:

http://www.dantz.com/index.php3?SCREEN=product_feedback

and report the "unexpected behavior." But don't expect a reply or acknowledgement. And you won't get a chance to work directly with a tech support person to try and solve the problem.


 

Well I am a systems engineer, and I do have a pretty thorough grasp of the networking issues involved. But you don't need to know anything about the underlying network traffic in order to grasp my argument. It seems to me that you are showing that you do not, in fact, have a grasp of basic testing techniques.

 

What we have here is a simple logical formula, with many variables involved. If I run my backup with a hard disk file as the target, this backup succeeds every time. If I change only one variable, so that the backup set is now written to a DVD rather than a file, leaving every single other variable involved unchanged, the backup fails every time. Moreover, it fails at precisely the same point in the backup process. My test gives you more information than the test you are suggesting. In fact, this is the strongest possible evidence that the failure must be due to the variable I changed. There is really no other plausible explanation.


Absent other information, the tests and observations you've made do lead to your current working theory.

 

But you _do_ have other information, provided by the software developer, regarding the known causes of the error you're seeing. Dantz suggests that 'piton protocol violations' can be caused by network hardware issues.

 

Put simply, if you are able to construct a configuration where the problem is not present you can learn a great deal.

 

I have no idea if this problem will show for you if you test with a cross-over cable; it very well may be a defect in any of the multiple software programs involved. But without attempting to prove a null-hypothesis, you're clinging to plausibility as your proof.

 

Dave


 

Okay, one more try before I give up here entirely. Here's an analogy for you: I have a car that won't start. I try swapping the battery out, and the car starts. I swap the old battery back in, and the car doesn't start. Repeat as many times as you want. Conclusion: a bad battery was causing the problem. I don't need to try changing out the battery cables, too, just because bad battery cables can also sometimes stop a car from starting. I already have a cause, and I'm confident that my car will continue to run as long as I use the new battery, not the old one. Any further testing is a waste of my time.


Just to add some more info about our situation: after reviewing the logs, we discovered that what has been happening to us during and after our new media backup this past week is nearly identical to what occurred during our previous new media backup one month ago. The pattern is this: each night, the Mac OS X Server client backs up approx. 50 of 200 GB to our Quantum AutoLoader using 40/80 DLT tapes, then the backup dies because of a 515 piton error. Our two other Linux clients back up with no problem. When less than 50 GB remains to be backed up, there is no failure or piton error. I have no idea why the number 50 GB would have significance, but this is the pattern. We had only one exception to this pattern last month, when one night the client backed up 90 GB with no piton error.

 

We're going to try a test of doing a special >50GB backup with at least two tapes and see what happens. -pw


 

Given the raw capacity of your autoloader, if you're getting about 25% compression, that 50GB wall you're running into could be due to your autoloader changing tapes, which would make your problem identical to mine. Does your autoloader have the ability to log media changes, or else can someone supervise the autoloader when your backup is about to hit your 50GB limit, and see whether the piton error happens right after a tape swap?
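To spell out the arithmetic behind that guess, here is a quick back-of-the-envelope sketch. It takes the 40 GB native figure from the "40/80 DLT tapes" mentioned earlier in the thread. Since "25% compression" can be read two ways, both readings are shown, and the function names are just illustrative.

```python
NATIVE_TAPE_GB = 40.0  # native capacity of a "40/80" DLT tape, per the thread


def capacity_if_gain(native_gb, gain):
    """Reading 1: '25% compression' means the tape holds 25% more data."""
    return native_gb * (1.0 + gain)


def capacity_if_shrink(native_gb, shrink):
    """Reading 2: '25% compression' means data shrinks to 75% of its size."""
    return native_gb / (1.0 - shrink)


print(capacity_if_gain(NATIVE_TAPE_GB, 0.25))              # 50.0
print(round(capacity_if_shrink(NATIVE_TAPE_GB, 0.25), 1))  # 53.3
```

Either way, one tape would fill up somewhere in the 50-55GB range, which is right where the nightly backups have been dying.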

