
Any solution to "Net Retry" problems?



So we recently upgraded one of our Retrospect servers from OS 9 to OS 10.2.8. Upon doing so, we immediately noticed the server hanging in a "net retry" state indefinitely when a client was abruptly removed from the network mid-backup. This happened on a number of different clients running a number of different operating systems. We would often find the server in the morning in this catatonic state, having failed to do the previous night's backups.

 

I was hoping this problem could be solved by upgrading to version 6 of the server software, so I got an evaluation version and began to run it. It did the exact same thing as version 5. I've read through the forums, and most people suggest setting the client speed threshold. I tried this and it had zero effect on the situation. So my questions are these: first, is Dantz aware of this problem, and second, is there a solution?

 

As it stands right now I'm going to have to downgrade back to OS 9. We are trying to remove all OS 9 machines from our computing infrastructure, so if I cannot find a solution to this problem I'm going to have to explore other backup software packages, something I'd prefer not to do as we have been using Retrospect for many years.

 

Any help in this matter is greatly appreciated.

 

Thanks,

 

Mike


In normal operation, Retrospect does not behave this way. So it'll help if you do some trouble-shooting on your end.

 

First, can you describe your hardware configurations?

- What flavor of Macintosh?

- What is the network topology shared by that machine and the affected clients?

 

That the problem is showing on "... a number of different clients running a number of different operating systems" certainly suggests an issue other than with the client software. But it would be helpful for you to list the specific versions of Retrospect Client software as well as the corresponding client OS on at least some of those machines.

 

- Does the problem happen on all clients, each and every time?

- Have you attempted to connect a problem client to the Retrospect application machine in a different physical manner, such as using a cross-over cable or through a testing LAN?

 

The setting that Dantz added to version 5.1 of Retrospect is not the client speed threshold script option (which has been in the product since version 4 or earlier), but the Execution performance threshold preference setting. This was introduced to manage an issue where the clients themselves were misbehaving, causing Retrospect to go into a lost-connection/retry/re-connect/lost-connection loop.

 

Note that this preference setting (found in the "secret" preferences window when you Option+click on the Preferences button) will stop a script if a client falls below the set threshold; so it works best when it's used in a Backup Server configuration (where each new connection acts as a new script).
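
To make the idea of a performance threshold concrete, here is a minimal sketch in Python. It is not Retrospect's implementation; the function names, the use of an average rate, and the treatment of a zero threshold as "disabled" are all assumptions made for illustration. It only shows the general check: measure the copy rate in MB/min and stop when it drops below the configured value.

```python
def throughput_mb_per_min(bytes_copied: int, elapsed_seconds: float) -> float:
    """Average copy rate in megabytes per minute (1 MB = 1024 * 1024 bytes)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    megabytes = bytes_copied / (1024 * 1024)
    minutes = elapsed_seconds / 60.0
    return megabytes / minutes


def should_stop(bytes_copied: int, elapsed_seconds: float,
                threshold_mb_per_min: float) -> bool:
    """Return True when the measured rate has fallen below the threshold."""
    if threshold_mb_per_min <= 0:
        # Assumption for this sketch: zero or negative means "check disabled".
        return False
    return throughput_mb_per_min(bytes_copied, elapsed_seconds) < threshold_mb_per_min
```

With a check like this, a client that has stopped sending data measures close to 0 MB/min and trips any positive threshold, which is why the setting is expected to end a stalled backup.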

 

Dave


I've done some troubleshooting, but first, to answer some of your questions:

The server is a G4 867 running 10.2.8. It is on the network backbone, as are all of our backup servers, 15 in total. We have 3 Retrospect servers with the same network configuration. Only one is running OS X, and it is the only one experiencing these problems. The clients all come in over wired Ethernet connections, no differently than they do for any of the other servers.

 

Specifically, I've noticed this problem on XP, Win 2k, and OS X machines. I am able to replicate the error on my local workstation running XP. My workstation is also on the backbone, same as the backup servers. If I unplug my workstation in the middle of a backup on the troubled server, the server will sit there until a user intervenes, leaving it unable to perform any other backups. It will stay that way for the duration of a weekend if you allow it to. All of the machines that have had troubles have been running the latest client software and have all necessary security patches installed. These are the exact same clients that backed up flawlessly under OS 9.

 

The problem does happen on all clients and all the time.

 

I have set the client speed threshold script option as well as the execution performance threshold preference in the secret settings. Neither of these settings had any effect on the problem.

 

In my testing I've noticed that if you shut down or disable the network connection on the client machine, the backup stops properly and moves on as it should. However, if you simply unplug the cable from the back of the machine, you enter the "net retry" cycle and the server becomes catatonic. This is a large problem since most laptop users simply unplug their Ethernet cable without shutting down their system, causing the server to look for the machine endlessly.
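
As an illustration of the difference between a peer that closes its connection and one that simply goes silent, here is a small Python sketch. It says nothing about how Retrospect works internally; `read_with_bounded_retries`, the 4096-byte read size, and the retry counts are hypothetical choices. A local socket pair stands in for server and client, and the "client" end stays open but sends nothing, which is roughly what an unplugged cable looks like from the other side.

```python
import socket


def read_with_bounded_retries(sock: socket.socket, timeout_s: float,
                              max_retries: int) -> bytes:
    """Read once from sock, giving up after max_retries silent timeouts."""
    sock.settimeout(timeout_s)
    for _ in range(max_retries):
        try:
            data = sock.recv(4096)
        except socket.timeout:
            # Nothing arrived in time: the peer may have vanished without closing.
            continue
        if data == b"":
            # An empty read means the peer closed its end of the connection.
            raise ConnectionError("peer closed the connection")
        return data
    raise TimeoutError(f"no data after {max_retries} attempts; giving up on this client")


if __name__ == "__main__":
    server_end, client_end = socket.socketpair()
    try:
        # The client end never sends and never closes, so every read times out.
        read_with_bounded_retries(server_end, timeout_s=0.1, max_retries=3)
    except TimeoutError as exc:
        print(exc)
    finally:
        server_end.close()
        client_end.close()
```

The point of the sketch is the cap on retries: a closed connection is reported at once, while a silent one is abandoned after a fixed number of attempts instead of being retried forever.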

 

Once again, this problem is identical on both server version 5 and 6 under OS X.

 

Thanks for your help,

 

Mike


Quote:

The problem does happen on all clients and all the time.

 


 

This is good news, if only because you'll be able to know when you've identified the problem.

 

Have you bypassed the network backbone for any of your tests? Can you bring a PowerBook to the server room and connect it to this Retrospect backup machine directly with a single cable?

 

In all my tests and experiences, disconnecting a Retrospect Client while a backup is in progress doesn't hang the server. There was a version that did have a problem if you attempted to stop a backup in the middle of a large file being copied, but that got fixed in a Client update a while ago.

 

As for the execution performance threshold preference in the secret settings, if you enable File System Logging (in the same secret preferences window) you can watch as Retrospect tests the response of the Client system. If you disconnect a Client in progress, what does the log show the program doing during this time?

 

If you can reproduce this issue with two machines and an ethernet cable you've probably identified a bug that Dantz should be able to reproduce, too. And if that's the case, a call to Tech Support (and an upfront charge) should get you a case number (and a refund for your charge if/when you request it).


  • 3 weeks later...

I came to this forum looking for an answer to the same exact problem that Mike is having. I have many laptops in our group and when someone disconnects the laptop (not performing a proper shutdown) from the network while in the process of a backup...it kills the server by putting it in an infinite "Net Retry" loop. Of course this usually happens on a Friday afternoon and "hangs" the server all weekend instead of allowing it to perform its time-intensive tape backup scripts.

 

I may call Dantz to report it directly to them (since we just upgraded to Server 6)...this is a big problem for us and possibly a deal-breaker for any future use of the software.

 

Frank


  • 3 weeks later...

I was finally able to find some time to do some more robust troubleshooting. I had called Dantz and reported the problem. They told me to set the execution speed threshold setting. I explained to them that I had done that and it wasn't working, and that was why I was calling. After about two hours on the phone they told me to set the threshold higher. I tried explaining that it doesn't matter what the threshold is set to because it's being ignored. The person talked with his manager and a few other people, and here is the solution I received: "Well if the threshold setting isn't working then there's nothing we can do for you." This wasn't the solution I had hoped for. Even after I explained that this would probably mean the end of my University's use of Retrospect, they still didn't care. I was simply stunned at their indifference at losing our business.

 

Having given up on Tech Support, I decided to do some experimenting of my own. I did a fresh install of 10.3 and Retrospect 6. The threshold was working fine. Then I introduced the Retrospect Event Handler and it stopped working once again, going into a "Net Retry" loop as it had in the previous installation. I removed the Event Handler and it worked again. I came to find out that the threshold is ignored whenever Retrospect is not the foreground application: if the Event Handler, or even Safari, or any app for that matter, is running in the foreground, the threshold is ignored. This is most certainly a software problem and a big problem for anyone who needs to run the Event Handler as we do here.

 

I am able to reproduce this bug on various hardware, so I'm almost completely convinced that this is a bug in the Retrospect software and that it will need to be addressed by Dantz. I'm going to try to contact them again via email, but judging by their indifference to my last request I'm not expecting this to be fixed anytime soon. This saddens me, as we will have to explore other backup software. There is some comfort in at least tracking the problem down, though. Good luck to all the others being affected by this bug.

 

Thanks for everyone's help.

 

Mike


Quote:

I came to find out that the threshold is ignored whenever Retrospect is not the foreground application: if the Event Handler, or even Safari, or any app for that matter, is running in the foreground, the threshold is ignored. This is most certainly a software problem and a big problem for anyone who needs to run the Event Handler as we do here.

 


 

The first step in isolating a software defect is creating an environment where it can be reproduced at will. You appear to have done that on your setup.

 

The next step is documenting the steps involved so another person can test it on a different setup. I tried, and was unable to duplicate your observations.

 

Here's what I did:

 

- Powerbook G3 Series/10.3.4/Retrospect 6.0.193

- Powerbook G3 Series/10.3.4/Retrospect OS X Client 6.0.108

- Configure "Secret Preferences" Client Threshold to 9000 Megabytes/Minute (unrealistically high)

- Immediate Backup of defined subvolume on client (5,000+ files/400 MB)

- Watch backup run for about a minute before error in log:

"The backup client execution performance is too slow (measured 115.5 MB/min, threshold 9000.0 MB/min)"

 

- Repeat above steps, this time putting Safari window in front of Retrospect

- Results the same

 

- Repeat above steps using a scripted backup (from the Run menu)

- Results the same

 

Notes:

Net Retry was observed on some of the tests

Client timeouts took between 50 seconds and 120 seconds (according to the log)
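
For anyone repeating this test and wanting to check a saved log for the threshold error, here is a small Python sketch that pulls the two numbers out of a line like the one quoted above. The pattern is based only on that single quoted message; whether other versions word it the same way is an assumption, and `parse_threshold_error` is a hypothetical helper name.

```python
import re
from typing import Optional, Tuple

# Matches "measured <number> MB/min, threshold <number> MB/min" anywhere in a line.
PATTERN = re.compile(
    r"measured\s+(?P<measured>\d+(?:\.\d+)?)\s+MB/min,\s+"
    r"threshold\s+(?P<threshold>\d+(?:\.\d+)?)\s+MB/min"
)


def parse_threshold_error(line: str) -> Optional[Tuple[float, float]]:
    """Return (measured, threshold) in MB/min, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return float(match.group("measured")), float(match.group("threshold"))
```

A log with no matching line after a client was disconnected would suggest the threshold check never fired on that run.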

