
Retrospect hangs when backing up a Windows client



I'm attempting to back up a 4 TB volume full of sound effects. For historical reasons, this volume is served from a Windows PC. I've installed Retro Client on it and I can log into it just fine. This is the *only* Windows PC that I back up. Approximately 100 to 130 GB into the backup, it just hangs. The Mac is still alive and the Retrospect operation window is still up, but the current file and the amount of data copied do not change. This happens regularly around the 100 to 130 GB mark. I then must force quit Retrospect, which dies easily. I can then immediately relaunch it and start the script again, which scans, starts backing up, then hangs again shortly after 100 GB. Rinse, repeat.

 

Any ideas?

 

FYI - OS X 10.3.9 (7W98), 2x500MHz G4, 1GB RAM, Retrospect 6.0.204, driver 6.6.101, SDX700C, ATTO UL3D (driver 3.2.0, flash 1.6.6f0), Windows Server 2003 (latest updates/patches, says the Windows admin; *this can't change*), Retro Client 6.5.136.

Link to comment
Share on other sites

Bus speed already set to 20 DT.

 

 

 

Command queuing enabled, switching to disabled.

 

 

 

I've configured six "source groups" containing portions of the 4 TB volume. Each source group is backed up to a unique backup set, so there are six backup sets at work here. In addition, six scripts automate it all (one script per source group and backup set). Each execution of each script fails in the same manner.

 

 

 

We shall see what disabling command queuing has to offer. However, Retro clients running on other Macintosh OS X Servers do not cause Retrospect to hang, and are capable of sustaining a multi-TB backup from start to finish, including the compare.

Link to comment
Share on other sites

Have you noticed when in the backup the hang occurs? Does it happen at the time that the backup is switching to a new tape member?

 

We found in our network configuration, where we are backing up across switches with a lot of security features enabled, that Windows clients would cause the Retrospect app to hang when switching to a new tape, while OS X clients would register a 515 Piton Protocol violation.

 

We solved the Windows hang problem by downgrading clients from v6.5 to v6.0.110.

 

We are still living with the 515 error for our Mac clients.

Link to comment
Share on other sites

Quote:

Hi

 

Try using the ATTO configuration utility to disable command queuing and slow the bus speed to 20 DT. This will slow the tape transfer a bit but it should make everything more stable.

 

 


 

If the bus speed is set to 20 DT, does the Fallback Sync Rate (default 40 MB/sec.) have to be similarly reduced?

Link to comment
Share on other sites

I have been having "hanging" problems for several months. Looking back, it seems that it is the Windows clients that are causing the hang, but not always the same client. The hangs are intermittent - the backup scripts might run fine for a couple of days, and then suddenly hang up during the "Copying" phase of the backup. Tapes were not being changed - it has been in the middle of the script.

 

I suspected a communication problem. My ATTO card has been configured as Nate suggested, and my ATTO card has even gone back to ATTO for a checkup. I replaced SCSI cables and terminators, but still every couple of days Retrospect hangs during the copying phase.

 

In other words, Retrospect does not run reliably, and we all need reliable backups.

 

 

System: G4/400, 512 MB RAM, OS 10.3.9, Retrospect 6.1.126 (and all previous 6.0 versions), Quantum Valueloader SDLT320, ATTO UL4S.

Link to comment
Share on other sites

>We solved the Windows hang problem by downgrading clients from v6.5 to v6.0.110

 

ok, tried it, still hangs.

 

> We are still living with the 515 error for our Mac clients.

 

I don't have this error. My Mac clients back up fine.

 

>Tapes were not being changed - it has been in the middle of the script.

 

Agreed

 

>I suspected a communication problem.

 

I don't. I can back up local volumes, AFP volumes, and Mac clients all day long. I can even do it under OS 9. What I can't do is back up the Windows client.

Link to comment
Share on other sites

Quote:

The hangs are intermittent - the backup scripts might run fine for a couple of days, and then suddenly hang up during the "Copying" phase of the backup. Tapes were not being changed - it has been in the middle of the script.

 


 

Are you sure your autoloader was not in the process of switching to a new tape? Unless you skip to a new member manually, the switch will always occur in the middle of a script.

 

The way we caught the link between tape member switches and the client backup errors/hangs was by going to Configure > Backup Sets, noting the date and time that each backup set member was initialized, and then finding out from the log which client was being backed up at that time. In our case it was the problem client.
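If you want to do that cross-check for more than a handful of tapes, it can be scripted. The following is only a rough sketch in Python, assuming you have copied the member initialization times and the per-client start/end times out of Retrospect by hand. The client names, timestamps, and function names below are all made up for illustration, and nothing here reads Retrospect's own log format.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"  # assumed format for the hand-transcribed timestamps


def parse(ts):
    """Turn a transcribed timestamp string into a datetime."""
    return datetime.strptime(ts, FMT)


def clients_active_at(moment, intervals):
    """Return the clients whose backup interval contains `moment` (inclusive)."""
    return [name for name, start, end in intervals if start <= moment <= end]


def correlate(member_init_times, intervals):
    """Map each member initialization time to the clients active at that time."""
    return {t: clients_active_at(t, intervals) for t in member_init_times}


# Hypothetical data: (client, backup start, backup end), taken from the log by hand.
intervals = [
    ("mac-client-1", parse("2005-05-26 22:00"), parse("2005-05-26 23:10")),
    ("win-client-1", parse("2005-05-26 23:10"), parse("2005-05-27 01:45")),
    ("mac-client-2", parse("2005-05-27 01:45"), parse("2005-05-27 03:00")),
]

# Hypothetical data: when each backup set member was initialized.
member_init_times = [parse("2005-05-27 00:30")]

for t, names in correlate(member_init_times, intervals).items():
    print(t.strftime(FMT), "->", ", ".join(names) or "no client active")
```

If the same client keeps showing up next to the member initialization times, the tape switch is a likely suspect. If the hang times never line up with a switch, you can rule it out.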

Link to comment
Share on other sites

>Are you sure your autoloader was not in the process of switching to a new tape? Unless you skip to a new member manually, the switch will always occur in the middle of a script.

 

I thought we covered this.

 

>would you be able to test a file backup of your Windows client to a hard drive?

 

I'm not sure I follow. You want me to back up a chunk of the Windows volume, via Retro Client, to a locally attached HD (FireWire) on my OS X Retrospect server, and wait for it to hang? I can do that.

Link to comment
Share on other sites

Hi, Thanks for your reply.

 

No, the computer did not hang while the autoloader was changing tapes. For instance, on Saturday night the server froze again after running flawlessly on Friday night. The script it froze on backs up 9 clients to a single tape (SDLT 320); it froze while copying client #6. The tape is less than half full, and I ran a successful immediate backup today to make sure that it is not full. Configure>Backup Sets>Members shows that the tape was initialized on 5/26/05 and contains 77 GB (of a nominal 320 GB). Oddly, Configure>Backup Sets>Members for another tape shows that it contains 527.7 GB! That is 207.7 GB more than the tape's nominal maximum!!

 

Kazeeks' problems above, and on another thread (lost access to storage medium), all point to communications problems. The question is whether they are hardware or software related. Since Kazeeks describes the same problem on more than one system, and Retrospect has inaccurately reported a tape's contents (above), it would appear that Retrospect is not handling the data correctly. I am testing another backup application to see if I get similar communications problems with different software. I would like to hear from the Retrospect moderators to try to solve this problem, or is this beyond the scope of this forum?

 

Anyway, thanks for your input.

Link to comment
Share on other sites

Hi ringg

 

In general, OS X support for SCSI tape devices has been horrid. Now with the updated ATTO drivers we have some degree of reliability.

 

Have you dropped the speed of the adapter to 20 DT and turned off command queuing? I have seen that work wonders for many...

 

Kazeeks, how did the backup to hard disk go?

 

Thanks

nate

Link to comment
Share on other sites

>Kazeeks, how did the backup to hard disk go?

 

I won't get an opportunity to run this test until some time next week. The Windows server is always busy, and I won't get a window for a week or so. Luckily, the contents of this server are fairly static, unlike my other production file servers.

Link to comment
Share on other sites

Quote:

Have you dropped the speed of the adapter to 20 DT and turned off command queuing? I have seen that work wonders for many...

 


 

Hi, Nate,

Thanks for responding. Yes, the speed of the adapter is set to 20 DT for both SCSI IDs, Tag Command Queuing is disabled, and the Fallback sync rate is also set to 20 (10). I am using an ATTO UL4S, Driver 3.4.0, Flash v. 1.4.2f1. These are the most recent driver and firmware. According to ATTO tech support the drive nominally communicates with the computer at 80 MB/sec, although I got multiple communication errors when I used this speed. Lowering it to 20 produced the fewest errors, but still, every 2 or 3 days the computer will hang. The last entry in the log shows that it was in the "Copying" phase. However, it might have been scanning the client or switching to Compare at the time of the hang-up. I can't tell from the log.

 

I upgraded to Retrospect 6 when I bought my Valueloader last year, so I changed three components simultaneously (software, tape library and SCSI card). Previously, however, I had been running Retrospect 5.1 on a G4/400 OS 10.2.8 with a Quantum DLT 4500 tape library with much better reliability.

 

Kazeeks, however, only changed his software to OS X and Retrospect 6. The hardware remained the same. This certainly points to a software problem.

This is a real problem since Kazeeks and I cannot run backups reliably several days in a row. My computer usually hangs up over a weekend (Friday or Sat. night). If there is a long weekend like Thanksgiving, or I go on vacation, the backups don't run for most of the time that I am absent.

 

Has this problem been reported in the Windows version (or are there worse problems)? I am open to any suggestion to get reliable backups.

 

Thanks

 

George

Link to comment
Share on other sites

Hi ringg,

 

I'll bet you a cup of fine sake that it is not a software problem ;) A machine hang is a classic SCSI communication failure. Same goes for communication errors.

 

When the computer is hung can you use the ATTO SCSI utility to rescan the SCSI bus? Or does that hang too?

 

Are there any other devices attached to the SCSI adapter? Have you tried changing SCSI IDs and cables? Have you tried a new terminator?

 

Thanks

Nate

Link to comment
Share on other sites

Hi, Nate,

 

 

 

If you get this cleared up I'll buy you a whole bottle of Sake! First of all, perhaps I have been using the wrong term. My backups run overnight. When I come in to work in the morning on days when a backup failed, my computer is completely frozen. The screen is black (as expected because I have the screen - not the system - set to sleep after one hour of inactivity), and the monitor power button is red. Moving the mouse or hitting any key doesn't wake it up; I have to push the restart button. Thus, I cannot use the ATTO SCSI utility until I restart. When the login screen comes up there is a box which has "Retrospect !" and a stop sign. When I open the Retrospect log I see that the backups (up to 35 clients) ran to a certain point (could be any computer - apparently random) then stopped. The last entry is always "...copying".

 

 

 

In answer to your questions: There are no other devices connected to the computer, SCSI or otherwise. I have changed cables (twice) and the terminator. I sent the ATTO card to ATTO tech support, who checked it and returned it. In fact, I've been having communication problems ever since I bought the Valueloader, and Quantum even replaced my unit with a new one, and I switched G4s to a slightly newer model (350 to 400 MHz). The SCSI IDs had been set to 2 and 5 for several months for the Drive and the Library respectively, but I recently changed the Library to SCSI ID 4. Is it significant that the Valueloader uses LVD communication rather than SE? Is data transferred over LVD more prone to corruption by an older G4? Do you think my problem might disappear with a G5 that has a faster PCI bus?

 

 

 

What is interesting about Kazeeks' account is that his hardware remained the same - systems that worked reliably with OS 9 and Retrospect 5 began failing with OS X and Retrospect 6. I cannot revert to Retrospect 5 to try to isolate the problem to software, because the Valueloader is not supported by the Retrospect 5 ATL.

 

 

 

So, I'm open to all suggestions. Thanks for any help.

 

 

 

George

Link to comment
Share on other sites

Hi

 

Certain Quantum drives used to be very picky about SCSI ID. If you change them back to 2 and 5, do you have better results?

 

A faster machine might help but it is hard to say either way. I would try the SCSI card in a different PCI slot and maybe even try another network card.

 

I would also disable all screensaver and power settings. Just turn the monitor on and off as needed.

 

Thanks

Nate

Link to comment
Share on other sites

>I'll bet you a cup of fine sake that it is not a software problem. A machine hang is classic SCSI communication failure. Same goes for communication errors.

 

I don't drink, but I otherwise agree with that statement. However, this system reliably backs up Mac clients all night long, and hangs on the one Windows client. I don't have another Windows machine to test Retro Client on, so this is it for now. I also can't upgrade the Windows OS on that server; it's not mine.

 

BTW - the backup of the Windows server to the hard disk failed at 60 GB in. I can only assume that the increased speed of backing up to a local FireWire HD fried Retrospect earlier than tape does. I have to Force Quit Retrospect, then I can immediately relaunch it and continue the backup (of course I have to forfeit any chance of comparing). The next backup to the HD failed just under the 80 GB mark. Force quit. Relaunch. Continue.

 

I suspected bugs in the RetroClient for Win software, but Client is always OK and responds right away when starting the backup again. This is more than I can say for RetroClient for Mac, which gets stuck when a Mac backup fails for a number of other reasons (like the lost access to storage device error). If I try to restart the backup after that error, Client is offline, and I have to manually log in to the file server, start RetroClient, and turn it off then on again to wake it up. (Pain in the A$$, BTW). Client for Win needs no such handslapping.

 

>I would also disable all screensaver and power settings. Just turn the monitor on and off as needed.

 

Already done.

 

<rant> I know this isn't the thread to do this, but the suits here are getting more and more unhappy with the continual, intermittent issues with Retrospect, the noticeable lack of Enterprise-class features, and the apparent lack of future development for Mac. Buying the Windows product is not an option. We're starting to look elsewhere. </rant>

Link to comment
Share on other sites

 

Hi:

I'll jump in the fire here. I can feel for you, Kazeeks. The system people around here seem to have just as many problems with many other types of backup systems as I do with Retrospect.

It's interesting, but when I've had random hangs, it's seemed to be with OS X clients, not Windows. My suggestion would be this, and I realise you don't 'own' the Windows server: I'd personally start with sniffing around that box. Run a chkdsk on it, then uninstall and re-install the client software.

FWIW, I've had experience with ATTO hardware support, and they knew their stuff and were very helpful. Good luck with your problem.

-s

Link to comment
Share on other sites

Quote:

Hi

 

Certain Quantum drives used to be very picky about SCSI ID. If you change them back to 2 and 5, do you have better results?

 

A faster machine might help but it is hard to say either way. I would try the SCSI card in a different PCI slot and maybe even try another network card.

 

I would also disable all screensaver and power settings. Just turn the monitor on and off as needed.

 

Thanks

Nate

 


 

Hi, Nate,

 

I had been using SCSI IDs 2 and 5 for about a year while experiencing these problems. I only recently switched to ID 4, so I don't think that is the main issue. I have also switched the position of the SCSI card, but that didn't resolve the problem of hangs. I'll be happy to turn off all of the energy savers and see if that helps. Let me ask a different question, though, related to SCSI issues. I understand that LVD SCSI can use longer cables than the regular SCSI max of 15 ft. total. Do you agree with this? I am using a 3 m LVD cable from the tape device to the computer. This should be well within the accepted limits of LVD SCSI. I will order a shorter cable anyway, but do you think that 3 m is too long?

Thanks,

 

George

Link to comment
Share on other sites

Hi

 

If it is a good quality cable, I think 3 meters should be fine, but you just never know. SCSI can be so picky.

 

I suspect that either the cable is bad or something else is wrong. If length were a problem you would probably be able to see the device but it would fail intermittently during operation.

 

Thanks

Nate

Link to comment
Share on other sites
