
error 519 (network communication failed)



I've been using Retrospect for almost seven years now, and most of the time I have no problems. In fact it's saved my butt on several occasions. Here's my problem:

 

The backup server is a PowerMac G4/500 - 2GB RAM, running Mac OS X 10.3.9

Retrospect version 6.1.126

Retrospect Driver Update, version 6.1.7.101

Connected to a Sony AIT-3 autoloader via ATTO UL4S SCSI card.

 

Backup script A contains four PowerMac G5's running Mac OS X 10.4.9 with Retrospect Client 6.1.130. Each client contains between about 60GB and about 130GB of data. These machines are used primarily for graphic design and video editing.

Backup script B contains just a Mac Pro file server running Mac OS X Server 10.4.8. It's carrying about 270GB of data on two volumes.

Our scripts do a recycle backup every weekend to both sets, and incremental backups each night, Monday through Thursday. The incrementals are trouble free. The recycle backup on the Mac Pro is fine. The recycle backup on the G5's fails on all four machines, every week, with error 519 (network communication failed). I typically have to log in from home using Apple Remote Desktop several times over the course of the weekend and initiate normal backups on script A to get all the data on tape. This is typical of the errors we get:

 

Trouble reading files, error 519 (network communication failed).

3/25/2007 8:34:07 PM: Execution incomplete.

Remaining: 46830 files, 114.8 GB

Completed: 63970 files, 14.6 GB

Performance: 363.1 MB/minute

Duration: 00:46:54 (00:05:46 idle/loading/preparing)

 

And:

 

Trouble reading files, error 519 (network communication failed).

2/25/2007 9:20:45 AM: Execution incomplete.

Remaining: 5588 files, 12.2 GB

Completed: 49040 files, 76.9 GB

Performance: 396.9 MB/minute

Duration: 03:43:51 (00:25:35 idle/loading/preparing)

 

By way of comparison, the log file shows this for the Mac Pro:

 

3/24/2007 12:32:55 AM: Execution completed successfully.

Completed: 74423 files, 236.6 GB

Performance: 537.9 MB/minute

Duration: 07:32:40 (00:02:20 idle/loading/preparing)

 

I should also mention that before the Mac Pro, their server was a PowerMac G4, which also backed up without incident.

 

The client machines are in a different building from the backup server.

The network between them looks like this:

10/100 Linksys switches at each end connected by fiber-optic cable and 100Mbps media converters.

All ethernet cabling is CAT 5.
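
As a rough sanity check on the figures above, the logged rates can be converted to line rates and compared with the 100Mbps link. This is only a sketch: it assumes that Retrospect's "MB" means 2^20 bytes (not confirmed anywhere in this thread), and the function name is made up for illustration.

```python
def mb_per_min_to_mbps(mb_per_min, bytes_per_mb=1024 * 1024):
    """Convert a logged MB/minute rate to megabits per second (10^6 bits).

    bytes_per_mb is an assumption about how the log defines "MB".
    """
    bits_per_second = mb_per_min * bytes_per_mb * 8 / 60
    return bits_per_second / 1_000_000


LINK_MBPS = 100  # switches and media converters are 100Mbps

# Rates taken from the log excerpts quoted in this post.
logged_rates = [
    ("G5 client, failed run", 363.1),
    ("Mac Pro, successful run", 537.9),
]

for label, rate in logged_rates:
    mbps = mb_per_min_to_mbps(rate)
    print(f"{label}: {mbps:.1f} Mbit/s ({mbps / LINK_MBPS:.0%} of link rate)")
```

Under that assumption both rates sit below the 100Mbps ceiling, so the logged throughput by itself does not suggest a saturated link.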

 

What I've tried:

Make sure clients are sitting at the login screen after restarting, to make the systems as quiescent as possible.

Restart the backup server just before the script begins.

Change ethernet patch cables.

Change ports on the switches.

Change the sync rate on the SCSI card from the default (currently set to 160DT IU).

Move one of the clients to this building, to rule out a problem with the fiber-optic link.

Change network setting on one client, forcing it to 100Mbps instead of auto-negotiate.

Uninstall and reinstall Retrospect Client.

Back up using a different server machine (also a G4) and a different autoloader.

 

One last thing. We have several other G5's that give me very little trouble, but they're not graphic design stations.

I hope I've included all the pertinent info without being overly verbose.

 

Any tips or suggestions will be greatly appreciated.


Are you by any chance using link encryption for those clients? There used to be (maybe still is) a bug that caused backups to fail with a 519 error at the point the backup switched to a new tape member if link encryption was enabled.

 

You don't have the latest Retrospect Driver Update; that may be worth a try. EMC does not regularly detail what bug fixes are included in each RDU.


Thanks for the reply. I just checked and am not using link encryption on the clients. I actually do have driver update 6.1.9.102 installed, sorry about that. I started trying to write my original post last week and neglected to update that bit of info after installing the RDU update.


> The recycle backup on the G5's fails on all four machines, every week, with error 519 (network communication failed).

Retrospect doesn't treat Clients any differently when the Destination Backup Set has recently been recycled than it does when that Backup Set has many sessions. So the fact that your communication failures happen only during your once-a-week Recycle script is only a hint at whatever might be the real reason.

 

> This is typical of the errors we get:

 

But you haven't included any information from the log _before_ the error. We know what the error is; what we don't know is what's going on before that.

 

Two things come to mind that might account for the regularity of the failures: your Recycle backup probably takes longer, and the failures always happen on the same day.

 

Either of these could be explained by a machine falling asleep. What are the Energy Saving settings on these clients? As surprising as this might sound, the Retrospect OS X Client will not keep a client awake when Retrospect moves from Copy to Compare.

 

What is Retrospect doing when the failure happens? Is it during the Copy phase? The Compare phase? Scanning?

 

Does it happen during the same phase for each problem client?
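
The sleep settings can also be read on the client from the command line with `pmset -g`, which is less error-prone than checking System Preferences on every machine by hand. The following is a minimal sketch that assumes a modern Python 3 is available on the machine where it runs. The helper names are hypothetical, and the sample text only illustrates the general shape of the output; it was not captured from the machines in this thread.

```python
import subprocess

SLEEP_KEYS = ("sleep", "disksleep", "displaysleep")


def parse_pmset(text):
    """Return {setting: minutes} for the sleep-related lines of `pmset -g` output."""
    settings = {}
    for line in text.splitlines():
        parts = line.split()
        # Expect "<key> <minutes>", possibly followed by extra annotations.
        if len(parts) >= 2 and parts[0] in SLEEP_KEYS and parts[1].isdigit():
            settings[parts[0]] = int(parts[1])
    return settings


def may_sleep(settings):
    """True if system sleep is enabled; a value of 0 means the system never sleeps."""
    return settings.get("sleep", 0) != 0


def read_pmset():
    """Run `pmset -g` on the local machine and return its text output."""
    result = subprocess.run(
        ["pmset", "-g"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Illustrative sample only, not real output from the clients discussed here.
SAMPLE = """\
 sleep          0
 disksleep      10
 displaysleep   15
"""

settings = parse_pmset(SAMPLE)
print(settings, "system may sleep:", may_sleep(settings))
```

On a client you would call `parse_pmset(read_pmset())` instead of using the sample. The disk sleep value is worth a look as well, since it is set separately from system sleep.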


Always during the copy phase; verification is turned off in Retrospect. I checked the Energy Saver settings, and they are all set to never put the system to sleep. This script begins 14 hours after the close of business on Friday, so I hope it's safe to assume they're not going to sleep. Here is a more complete clip from the log:

 

3/31/2007 7:00:20 AM: Connected to emercer

* Resolved container emercer to 2 volumes:

ArtPart on emercer

Thelma on emercer

3/31/2007 7:00:19 AM: Recycle backup: The backup set was reset

 

- 3/31/2007 7:00:19 AM: Copying ArtPart on emercer…

Trouble reading files, error 519 (network communication failed).

3/31/2007 7:53:25 AM: Execution incomplete.

Remaining: 2132 files, 34.9 GB

Completed: 1832 files, 16.2 GB

Performance: 351.2 MB/minute

Duration: 00:53:06 (00:06:02 idle/loading/preparing)

 

- 3/31/2007 7:53:25 AM: Copying Thelma on emercer…

Trouble reading files, error 519 (network communication failed).

3/31/2007 9:39:15 AM: Execution incomplete.

Remaining: 242512 files, 123.4 GB

Completed: 249614 files, 24.8 GB

Performance: 291.4 MB/minute

Duration: 01:45:50 (00:18:59 idle/loading/preparing)

 

Thanks for taking the time to help.


Of the two examples given, the shortest communication time is about 53 minutes. Looking back through your log of the failed executions, do the lengths of time get much shorter than this?

 

Of the backups that complete successfully, do any of them take longer than the shortest time that fails?
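
Rather than reading through months of the Operations Log by eye, the comparison can be scripted. The sketch below assumes the log has been exported as plain text in the same layout as the excerpts quoted in this thread, with an "Execution ..." status line followed by a "Duration:" line. The function name is hypothetical, and the sample uses only lines already posted above.

```python
import re
from datetime import timedelta

STATUS_RE = re.compile(r"Execution (incomplete|completed successfully)\.")
# Anchored so that "Total duration:" summary lines are not matched.
DURATION_RE = re.compile(r"^\s*Duration: (\d+):(\d{2}):(\d{2})")


def run_durations(log_text):
    """Yield (succeeded, duration) for each per-volume execution summary."""
    succeeded = None
    for line in log_text.splitlines():
        status = STATUS_RE.search(line)
        if status:
            succeeded = status.group(1) == "completed successfully"
            continue
        dur = DURATION_RE.match(line)
        if dur and succeeded is not None:
            hours, minutes, seconds = map(int, dur.groups())
            yield succeeded, timedelta(hours=hours, minutes=minutes, seconds=seconds)
            succeeded = None


# Lines copied from the log excerpts earlier in this thread.
SAMPLE = """\
3/25/2007 8:34:07 PM: Execution incomplete.
Duration: 00:46:54 (00:05:46 idle/loading/preparing)
3/31/2007 7:53:25 AM: Execution incomplete.
Duration: 00:53:06 (00:06:02 idle/loading/preparing)
3/24/2007 12:32:55 AM: Execution completed successfully.
Duration: 07:32:40 (00:02:20 idle/loading/preparing)
"""

runs = list(run_durations(SAMPLE))
failed = [d for ok, d in runs if not ok]
passed = [d for ok, d in runs if ok]
print("shortest failure:", min(failed))
print("longest success:", max(passed))
```

If the longest successful run is well beyond the shortest failed one, a fixed timeout or a sleep timer becomes a less likely explanation.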


In looking at the logs over the last couple of months, this is what I found.

The shortest time for a failed attempt seems to be about 22 minutes:

 

2/10/2007 7:42:08 AM: Copying Volume2 on client1…

Trouble reading files, error 519 (network communication failed).

2/10/2007 8:04:37 AM: Execution incomplete.

Remaining: 2377 files, 37.0 GB

Completed: 1187 files, 9.4 GB

Performance: 518.5 MB/minute

Duration: 00:22:29 (00:03:58 idle/loading/preparing)

 

The longest time I could find for a successful backup was an immediate backup (following a failed scripted backup) at about 5 hours:

 

Executing Immediate Backup at 3/25/2007 12:09 PM

To backup set R-Designers - 1…

 

3/25/2007 12:09:46 PM: Copying Macintosh HD on client3…

3/25/2007 12:09:46 PM: Connected to client3

3/25/2007 5:08:54 PM: Execution completed successfully.

Completed: 138849 files, 105.7 GB

Performance: 364.2 MB/minute

Duration: 04:59:08 (00:02:00 idle/loading/preparing)

 

Thanks


Quote:

- 3/31/2007 7:00:19 AM: Copying ArtPart on emercer…

Trouble reading files, error 519 (network communication failed).

3/31/2007 7:53:25 AM: Execution incomplete.

Remaining: 2132 files, 34.9 GB

Completed: 1832 files, 16.2 GB

Performance: 351.2 MB/minute

Duration: 00:53:06 (00:06:02 idle/loading/preparing)

 

- 3/31/2007 7:53:25 AM: Copying Thelma on emercer…

Trouble reading files, error 519 (network communication failed).

3/31/2007 9:39:15 AM: Execution incomplete.

Remaining: 242512 files, 123.4 GB

Completed: 249614 files, 24.8 GB

 

 


I find it interesting that you had a network communication failure on the computer "emercer," but that Retrospect was then immediately able to reconnect and spend a considerable amount of time successfully backing up another volume on the same computer.

 

This would seem to indicate that the network interruption was of a short duration; yet, in our experience, short duration glitches usually result in a series of "Net Retry" messages (these are pop-up windows that are not noted in the log), with the backup valiantly trying to plod through, perhaps eventually giving up when the access interruptions get to be too frequent or too long.

 

If you review your Operations Log, are the backups always able to immediately access a second volume on a client when the backup of the first volume has failed due to a 519 error, or do you sometimes get the -1028 Client not visible error message?

 

Could the G5s all be running some process having periodic bursts of activity that would hog the processor? Have you tried running backups with nobody logged in (i.e., at the login window) so that no user processes are running?
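
One way to tell a brief interruption from a client that has dropped off the network is to probe the client from the backup server while the script runs, then compare the timestamps of any probe failures with the 519 entries in the Operations Log. The following is a minimal sketch under stated assumptions: that the Retrospect client listens on TCP port 497 (check this for your version), that the client name resolves from the backup server, and that a plain TCP connect is a fair proxy for reachability. `can_connect` and `watch` are hypothetical helper names, not part of Retrospect.

```python
import socket
import time
from datetime import datetime

RETROSPECT_PORT = 497  # assumed client port; verify for your version


def can_connect(host, port=RETROSPECT_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(host, port=RETROSPECT_PORT, interval=30.0, checks=None):
    """Probe host every `interval` seconds; print and collect failure times.

    Runs until interrupted when `checks` is None, otherwise stops after
    `checks` probes and returns the list of failure timestamps.
    """
    failures = []
    done = 0
    while checks is None or done < checks:
        if not can_connect(host, port):
            stamp = datetime.now()
            failures.append(stamp)
            print(f"{stamp:%m/%d/%Y %I:%M:%S %p}: {host}:{port} unreachable")
        done += 1
        if checks is None or done < checks:
            time.sleep(interval)
    return failures
```

For example, `watch("emercer", interval=30)` would run until interrupted. A clean probe record across a 519 failure might suggest looking at the client or server software rather than the network path, although the probe cannot see what happens inside an established backup connection.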

Link to comment
Share on other sites

Reviewing the logs back to the beginning of the year, I did come across this sequence several times, all of the -1028 errors coming on the same client:

 

2/3/2007 8:41:36 PM: Copying Thelma on emercer…

Trouble reading files, error 519 (network communication failed).

2/3/2007 9:21:56 PM: Execution incomplete.

Remaining: 35730 files, 14.5 GB

Completed: 170 files, 5.5 GB

Performance: 270.4 MB/minute

Duration: 00:40:20 (00:19:44 idle/loading/preparing)

 

- 2/3/2007 9:21:57 PM: Copying ArtPart on emercer…

Trouble reading files, error 519 (network communication failed).

2/3/2007 9:47:57 PM: Execution incomplete.

Remaining: 1948 files, 28.7 GB

Completed: 1039 files, 8.3 GB

Performance: 384.2 MB/minute

Duration: 00:26:00 (00:04:02 idle/loading/preparing)

 

2/3/2007 9:47:58 PM: Connected to jbolton

 

- 2/3/2007 9:47:59 PM: Copying Macintosh HD on jbolton…

Trouble reading files, error 519 (network communication failed).

2/4/2007 12:53:02 AM: Execution incomplete.

Remaining: 28808 files, 57.3 GB

Completed: 103102 files, 60.6 GB

Performance: 376.2 MB/minute

Duration: 03:05:03 (00:20:16 idle/loading/preparing)

 

2/4/2007 12:53:03 AM: Connected to jpfund

 

- 2/4/2007 12:53:04 AM: Copying Macintosh HD on jpfund…

Trouble reading files, error 519 (network communication failed).

2/4/2007 5:11:10 AM: Execution incomplete.

Remaining: 40867 files, 31.3 GB

Completed: 194656 files, 81.7 GB

Performance: 374.1 MB/minute

Duration: 04:18:06 (00:34:40 idle/loading/preparing)

 

Can't access backup client Russell's G4, error -1028 (client is not visible on network).

2/4/2007 5:11:16 AM: Execution incomplete.

Total performance: 370.4 MB/minute

Total duration: 08:29:34 (01:18:42 idle/loading/preparing)

 

IIRC, we reinstalled the Retrospect client and it hasn't happened in the last eight weeks.

 

For the last four weeks I have been instructing these users to restart at the end of the day on Friday and leave the computer at the login screen. Compliance has been good, but not perfect. I'd say about 90%.

 

Thank you

