derek500

519 errors on XServe RAID client


Hi,

I frequently get -519 errors on one particular client. I'm running Retrospect Server 6.1.126 with RDU 6.1.2.102 on a PowerMac G4 (OS X 10.3.9 client), writing to AIT-2 tape via SCSI. The client is a dual G4 Xserve running OS X Server 10.4.5 with Retrospect client 6.1.107. The Xserve runs the Kerio email server and a few minor services, nothing else. The mail store is on an Xserve RAID.

Every night the mail server creates its own backup of the mail store as .zip files: a full backup weekly and differential backups the rest of the week. Those backup files are also on the RAID, and Retrospect backs up only the static zip files, not the live mail store. The zip files total about 60 GB for the full backup and 1-2 GB for each differential. The mail server is configured to limit each zip file to 2 GB, so the full mail backup consists of about 30 files of 2 GB each. The -519 errors usually occur when backing up the weekly full backup files. I never have any problems backing up the Xserve's local HD.

The Xserve has a gigabit connection to our LAN; the Retrospect Server only has a 100 Mb/s connection. Both are connected to the same switch, an Asante GX5-424W gigabit switch. I don't see any packet collisions. Performance is reasonable for the zip files, around 350 MB/min to the tape drive. If I restart a failed backup, sometimes it completes and sometimes it fails with -519 again. If I restart it again, same thing. Eventually it will complete. The speeds stay relatively good right up until the failure.

Where do I start troubleshooting? This is the only volume I have problems backing up out of 80 clients, so I don't see this being a SCSI problem, although the hardware compression stinks. I know copy speeds from the RAID are good, and it performs well as a mail server. Finder copies of those same zip files to the backup server run at reasonable speeds and don't fail. I can reproduce this error on most (but not all) full backups from the Retrospect server, and after most (but not all) full backups that Kerio makes of its mail store. The failure point is usually somewhere between 20 and 30 GB into the data.

Thanks

 

-Derek Cunningham


Hi.

-519 happens when the network connection is lost (for whatever reason). You write that you don't see any collisions, but I would start by looking for lost packets.

Hope this helps

Lennart


Hi Lennart,

I did see some lost packets in the switch for those ports, but not very many, and the counters hadn't been reset in ages. I reset the counters and I'll look again after the next failed backup. Thanks for the advice.

If there are lost packets happening during the backup, what can I do about it?

 

Does it only take 1 lost packet to bring up the -519?

 

Thanks,

-Derek


>Does it only take 1 lost packet to bring up the -519?

 

No, it takes more than one lost packet. I don't know how many; my guess is at least 10-15 in a row.

Sorry, but I don't think I can be of more help. Seems to be a network problem to me.


You won't see collisions when connecting to a switch. But I would check the switch port and the client/server network card settings to make sure they match. For example, a switch port set to auto-negotiate and a client set to fixed full duplex will result in lots of errors and lost packets.
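
The switch counters are only one side of the link; each host keeps its own per-interface error counters too. The following is a minimal sketch for comparing two snapshots of `netstat -i`-style output taken before and after a backup run. It assumes the last five columns are Ipkts, Ierrs, Opkts, Oerrs and Coll (the usual BSD / Mac OS X layout, but check the header on your own system). The function names and the sample text are made up for illustration and are not output from the machines in this thread.

```python
def parse_netstat_i(text):
    """Parse `netstat -i`-style text into {interface: counters}.

    Assumes the last five columns are Ipkts, Ierrs, Opkts, Oerrs, Coll.
    """
    fields = ("ipkts", "ierrs", "opkts", "oerrs", "coll")
    counters = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        parts = line.split()
        tail = parts[-len(fields):]
        if len(parts) <= len(fields) or not all(p.isdigit() for p in tail):
            continue  # not a counter row (e.g. "-" placeholders)
        # netstat repeats the counters once per address; keep the first row.
        counters.setdefault(parts[0], dict(zip(fields, map(int, tail))))
    return counters


def counter_deltas(before, after):
    """Return per-interface counter increases between two snapshots."""
    return {
        name: {key: after[name][key] - before[name][key] for key in before[name]}
        for name in before
        if name in after
    }


# Made-up sample snapshots, standing in for "before" and "after" a backup.
BEFORE = """\
Name  Mtu   Network     Address            Ipkts Ierrs    Opkts Oerrs  Coll
lo0   16384 <Link#1>                        1200     0     1200     0     0
en0   1500  <Link#4>    00:11:22:33:44:55 500000    12   400000     3     0
en0   1500  192.168.1   192.168.1.10      500000    12   400000     3     0
"""
AFTER = BEFORE.replace("500000    12", "900000    40")

deltas = counter_deltas(parse_netstat_i(BEFORE), parse_netstat_i(AFTER))
print(deltas["en0"])
```

If Ierrs or Oerrs climb during a backup window while the switch reports clean ports, the NIC, cable or duplex settings on that host are worth a closer look. If they stay flat, as they apparently did here, the network becomes a less likely culprit.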


Thanks for the help so far. I reset the switch counters yesterday, and last night the same thing happened: a -519 error 21.9 GB into a 40 GB backup. No packet loss on any port of the switch. Link speeds are negotiated correctly (1 Gb full duplex on the Xserve, 100 Mb full duplex on the Retrospect Server's G4). What's the next troubleshooting step?

Thanks,

 

-Derek


Just a shot in the dark: does the XServe client by any chance have link encryption enabled? If so, you might try unchecking that option. We experienced some flaky network behavior (515 rather than 519 errors) that went away after we deselected that option.


Thanks for the suggestion. Unfortunately, no. Definitely no link encryption. No problems last night and it backed up 40 GB from that client. I'm sure next Monday I'll see the same problem again though. Any other suggestions?

 

Thanks

 

-Derek Cunningham


I'm still having the same problem. I watched the packet count in the switch and there are no lost packets or collisions reported. Performance is decent until it quits. What do I do next? Here's what it looks like in the log:

 

- 5/8/2006 7:22:09 PM: Copying Kerio Backups on Mailserver
Trouble reading files, error 519 (network communication failed).
5/8/2006 9:28:56 PM: Execution incomplete.
Remaining: 15 files, 27.0 GB
Completed: 18 files, 35.9 GB
Performance: 306.2 MB/minute
Duration: 02:06:47 (00:06:50 idle/loading/preparing)
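
As a sanity check on the figures in that log, the reported performance can be roughly reproduced from the completed size and the duration. This is only a sketch under two assumptions that the log does not state: that 1 GB is counted as 1024 MB, and that the performance figure excludes the idle/loading/preparing time. The function names are invented for the example.

```python
def hms_to_seconds(text):
    """Convert an HH:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def mb_per_minute(completed_gb, duration, idle):
    """Throughput in MB/minute over the non-idle part of the run.

    Assumes 1 GB = 1024 MB and that idle time is excluded.
    """
    active_minutes = (hms_to_seconds(duration) - hms_to_seconds(idle)) / 60
    return completed_gb * 1024 / active_minutes


def mb_per_minute_to_mbit_per_second(rate):
    """Convert MB/minute to Mbit/s (8 bits per byte, 60 seconds per minute)."""
    return rate * 8 / 60


rate = mb_per_minute(35.9, "02:06:47", "00:06:50")
link_load = mb_per_minute_to_mbit_per_second(306.2)
print(round(rate, 1), "MB/minute computed from size and duration")
print(round(link_load, 1), "Mbit/s equivalent of the logged 306.2 MB/minute")
```

Under those assumptions the computed rate lands within a fraction of a percent of the logged 306.2 MB/minute, and the equivalent network load is roughly 41 Mbit/s, comfortably below the 100 Mbit/s link on the Retrospect server. So the link was not saturated when the job failed.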

 

Thanks for any assistance!

-Derek


"The Client is OS X Server 10.4.5 on a Dual G4 XServe with client 6.1.107"

 

These were the latest updates we had available when we started watching this problem closely. I've seen a few minor issues reported after Server 10.4.6, so I'm waiting a bit on that update. I see that they recently released some client updates as well. I can certainly try those, but I don't see our issue mentioned in the new client documentation, so I'm not very hopeful. Can you update the onboard NIC drivers separately?


>Can you update the onboard NIC drivers separately?

No, I don't think so.

Sorry, but I'm stumped.


Well, thanks Lennart and all those who tried to help me. The problem wasn't any of the things we were discussing, but an early warning of a disk failure. A few days after my last post in this thread, we started seeing index corruption errors in our mail server logs. We ran a 'background conditioning' on the RAID array and sure enough one of the disks turned up bad. We replaced the disk and found that the directory was hosed, so we had to temporarily relocate our mail store to a different drive, replace the bad drive, rebuild the RAID array and move things back home. It took a while to complete, but since then we haven't had any -519 errors from that client. Turns out Retrospect was warning us of imminent failure.

I'm surprised the SMART hard drive controllers in the Xserve RAID didn't alert us to the problem until we ran the conditioning. If you have an Xserve RAID and are having any issues, I highly recommend updating to the latest firmware and running the background conditioning, especially if the RAID is seeing severe duty like a mail server (many, many reads/writes, etc.). It can take a few days to complete, but we've decided to run it every 6 months from now on just to stay on top of it. The conditioning can mark blocks as bad and avoid them, even if it doesn't register enough bad blocks to fail the disk.
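
For anyone who wants a cheap host-side check between conditioning runs, here is a rough sketch of a read pass over the backup files that records read errors and unusually slow reads. It is not a substitute for the Xserve RAID's own conditioning, and it can miss problems: reads may be served from cache, and the RAID controller may retry a marginal block without the host noticing. The function names, the 1 MB chunk size and the 5-second "slow read" threshold are arbitrary choices for the example.

```python
import os
import time


def scan_file(path, chunk_size=1 << 20, slow_seconds=5.0):
    """Read one file end to end, noting slow reads and any I/O error."""
    result = {"path": path, "bytes_read": 0, "slow_reads": 0, "error": None}
    try:
        with open(path, "rb") as handle:
            while True:
                start = time.monotonic()
                chunk = handle.read(chunk_size)
                if time.monotonic() - start > slow_seconds:
                    result["slow_reads"] += 1  # possible retry or stall
                if not chunk:
                    break  # end of file
                result["bytes_read"] += len(chunk)
    except OSError as exc:
        # Covers missing files as well as read errors from the device.
        result["error"] = str(exc)
    return result


def scan_tree(root, **options):
    """Scan every file under root and return one result dict per file."""
    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            results.append(scan_file(os.path.join(dirpath, name), **options))
    return results
```

A typical use would be to call `scan_tree()` on the folder holding the mail server's zip backups (the path depends on your setup) and print any entry whose `error` is set or whose `slow_reads` count is above zero. Repeated stalls at the same point in the same files would be a hint to look at the disks rather than the network.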

 

Hope all this helps someone else with a similar problem.

 

-Derek
