
client rebooting during backup


emilio


Hello all,

 

One of our Retrospect client machines has been rebooting in the middle of being backed up. This only happens sometimes, but it seems to be highly correlated with times when the machine has a lot (hundreds or low thousands) of jobs waiting in a paused printer queue. It has happened during both normal and recycle backups.

 

The setup is:

Server: Retrospect backup single server 6.5.319

Client: Retrospect client 6.5.132, Windows XP SP2, P4 3.4 GHz, 1 GB RAM

 

Several machines are backed up by this server, but only this one has this intermittent problem; the rest are backed up flawlessly. Incidentally, this client is also the only one that runs these big jobs, where a great many print jobs sit in the (paused) printer queue. These big jobs run about twice a month, so the reboot doesn't happen very often, but it's still disconcerting. In the Retrospect server's log, it appears as a network error - Trouble reading files, error -519 (network communication failed) - that occurs about 4 hrs after that client's backup started, or roughly halfway through that client's backup session. The client's log shows no trace of the activity.

 

Has anyone seen or heard of this?

 

Thanks,

Emilio


Hi Emilio,

 

I haven't seen anything similar to this previously, but I'm wondering what kind of memory saturation you have when the backup fails. Are there any errors in the Windows event viewer? Is there anything that directly indicates that the print queue is responsible for the backup failure?

 

It seems a bit odd that the backup could run for four hours before failing. How much data is copied during that time?

 

As a starter, you could try updating to the latest version of 6.5:

http://emcinsignia.com/supportupdates/updates/retrospect/archive/#UPDATETYPE14


Thanks Foster.

 

What the resources look like when the reboot happens is a good question. There isn't anything in the application or system event logs that corresponds to the error. The closest event to the reboot occurs several minutes before the backup starts, per the system event log, but that's simply a message indicating that the MS Software Shadow Copy Provider service has started. I don't think this is related.

 

Any suggestion that the printer queue is involved comes only from my own search for clues. Is there another log on either the client or the server that might give more detail?

 

Actually, I just thought about trying to analyze the memory dump. I'll give that a shot.

 

Emilio


Emilio,

 

There won't be anything more useful in the operations log at default settings. Call this a hunch - but if this error occurs only when the print spool is filled up (and assuming the print queue objects qualify as open files), then perhaps there is some difficulty snapping the open print spool when it exceeds a certain size. This will be easy to test in theory - just disable open file backup on the script responsible for that machine. However, it may be hard to prove in practice, since, as you said, this error cannot be reproduced reliably.
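Since the error can't be reproduced on demand, one way to gather evidence for the hunch above is to record how large the print spool is while the backup runs, so the last sample before a reboot can be compared with samples from backups that completed. The following is only a rough Python sketch, not anything Retrospect provides. It assumes Python is installed on the client, that the spooler uses the default Windows spool folder (`C:\Windows\System32\spool\PRINTERS`), and that the account running it can read that folder. The function names and the log file name are made up for illustration.

```python
import csv
import os
import time
from datetime import datetime

# Assumption: default Windows spool folder. Adjust if the spooler was relocated.
DEFAULT_SPOOL_DIR = r"C:\Windows\System32\spool\PRINTERS"


def spool_stats(spool_dir):
    """Return (file_count, total_bytes) for files directly inside spool_dir."""
    count = 0
    total = 0
    for name in os.listdir(spool_dir):
        path = os.path.join(spool_dir, name)
        try:
            if os.path.isfile(path):
                total += os.path.getsize(path)
                count += 1
        except OSError:
            # A spool file can disappear between listing and sizing; skip it.
            continue
    return count, total


def log_sample(spool_dir, log_path):
    """Append one timestamped row (time, file count, total bytes) to a CSV log."""
    count, total = spool_stats(spool_dir)
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a", newline="") as handle:
        csv.writer(handle).writerow([stamp, count, total])
    return count, total


def monitor(spool_dir, log_path, interval_seconds=60, samples=10):
    """Take a fixed number of samples, sleeping interval_seconds after each."""
    for _ in range(samples):
        log_sample(spool_dir, log_path)
        time.sleep(interval_seconds)
```

For a backup that fails about four hours in, something like `monitor(DEFAULT_SPOOL_DIR, "spool_log.csv", interval_seconds=60, samples=480)` would cover eight hours. Because each row is written and the file closed immediately, the log should survive an unexpected reboot up to the last completed sample.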

 

Please let us know your findings regarding the memory dump.


