
Freeze on 'Preparing to Execute' server 5.1 Cube firewire 250GB WD drive wiebetech ata6 oxford bridg



After scanning any of several desktops or laptops, server 5.1 hangs at the Preparing to Execute phase. Pizza time. Forever. About 20 seconds after starting this phase, the drive seems to be given an instruction to do something, perhaps a reset? After that nothing happens. Other processes are unaffected, but any attempt to access the firewire drive causes the finder or whatever other app to hang. Pizza time.

 

Configuration: Retrospect Server 5.1 on a Cube (10.2.6, 640MB) using firewire to a Wiebetech ATA6 Oxford bridge to a Western Digital 250GB drive. This configuration has been working fine for over 6 months using a Seagate 120GB drive.

 

Feel free to ask for more details.

 


Update 20030926:

Backup configuration is to 'File'. There is nothing else on the drive.

 

Starting from a clean boot, the first machine on the list (that is available) processes correctly. Subsequent machines hang as described above.

 

My NIDS (Snort) also reports that the target machine is requesting an "RPC mountd TCP mount request" at about the same time. (This may be a false lead but is curious nonetheless.)


Quote:

Starting from a clean boot, the first machine on the list (that is available) processes correctly. Subsequent machines hang as described above.

 


 

- Does it always hang on the same Client machine?

- Is the first, successful Client always the same machine?

- Have you checked the drive(s) of the machine(s) that are being accessed when the program hangs?


1) hangs on different clients, depending on the processing order. But it is always the second client processed.

2) The first, successful client is not always the same one. I have successfully backed up three (different) remote clients by restarting.

3) The drives of the remote clients are fine. The machines themselves are operating normally. The backup target drive (250GB) checks out fine using DiskUtility and DiskWarrior.

 

There is definitely some command that is sent by Retrospect that causes the backup target drive to 'reset' or some such. A definite 'clunk' of the heads is heard, then the drive becomes inaccessible. The drive otherwise behaves normally with all other applications when Retrospect is not active.

 

Are there any logs that would help more than this?

+ Normal backup using Backup 20030923 at 9/26/2003 6:07 PM

To backup set Backup 20030923…

 

- 9/26/2003 6:07:32 PM: Copying Enterprise on Enterprise…

 

? Retrospect version 5.1.175

launched at 9/26/2003 6:47 PM

+ Retrospect Driver Update, version 4.0.103

 


Quote:

- Have you checked the drive(s) of the machine(s) that are being accessed when the program hangs?

 

3) The drives of the remote clients are fine. The machines themselves are operating normally. The backup target drive (250GB) checks out fine using DiskUtility and DiskWarrior.

 


 

- But did you check out the drives that crashed Retrospect? Or do you say they're fine because they're able to boot the Macintosh without a problem? Retrospect has always represented one of the most resource (hardware, network, etc) intensive routines you can do with a computer. (note that Dantz has suggested here that in some situations corrupt Client drives can have adverse effects on the program; not a good thing, but possibly something they'll fix)

 

- Is this a Backup Server script, or is it a Backup script? Try it both ways; does the problem occur the same way on each sort of Retrospect script?

 

Dave


When machine 'A' was second in line to process, the backup succeeded on the first machine but failed to back up machine 'A'. However, when machine 'A' was first in line to process, it backed up fine, but another machine did not process. Since machine 'A' did finally back up successfully, I concluded that the source and target disk drives were OK and the problem lay elsewhere. I have since verified several of the source drives and the target drive (again) with DiskUtility, which reports they are fine.

 

Also, the computer is set to never sleep, and the option to put the hard disks to sleep is not checked.

 

This is Backup Server Script. I'll try it the other way in a bit.


Quote:

This configuration has been working fine for over 6 months using a Seagate 120GB drive.

 


 

- Was the Seagate drive in the same FireWire case as the current drive?

- Will it work again if you backup to the Seagate?

- If you Define a small subvolume on Machine C, put it first in the list, then allow it to finish (quickly) and proceed to Machine D, will it hang? Every time?

 

 

Dave


1) The seagate drive and the western digital use exactly the same connections.

 

2) The seagate was repurposed so I don't have a spare drive at the moment. But I'll get a < 120GB drive to test the theory that a smaller drive / non-WD drive works. Seems like the logical next step.

 

3) I'll set up small backup runs after 2) above.

 

 

 

BTW, in a test, Immediate Backup failed in the same way; the drive went off-line before copying began.


Quote:

the drive went off-line before copying began.

 


 

This information caused me to go back and re-read your original post, where you say:

 

>>any attempt to access the firewire drive causes the finder or

>>whatever other app to hang. Pizza time.

 

If the drive unmounts from /Volumes/ then of course Retrospect is going to have problems when it expects the volume to be available, so the hang is expected.

 

The size of the hard drive doesn't matter to Retrospect, but it certainly _does_ matter to the bridge. I'd guess that the FW case you have is not properly supporting a drive of this size (I can't remember the specifics, but there are hardware addressing issues for these new, large size devices).

 

Check the specifics of the case, or check with the manufacturer. If it doesn't explicitly support a 200+ gig drive, then that's probably the cause of the fault.
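
One well-known addressing limit of this kind is the 28-bit LBA ceiling of older ATA interfaces; ATA-6 added 48-bit addressing. The sketch below only works through that arithmetic, assuming 512-byte sectors and decimal (drive-label) gigabytes. The function names are made up for illustration, and nothing here shows that this particular bridge has the problem.

```python
SECTOR_BYTES = 512  # assumption: standard 512-byte sectors


def max_capacity_bytes(lba_bits, sector_bytes=SECTOR_BYTES):
    """Largest capacity addressable with the given number of LBA bits."""
    return (2 ** lba_bits) * sector_bytes


def needs_48bit_lba(capacity_bytes):
    """True if a drive of this size cannot be fully addressed with 28-bit LBA."""
    return capacity_bytes > max_capacity_bytes(28)


if __name__ == "__main__":
    limit = max_capacity_bytes(28)
    print("28-bit LBA limit:", limit, "bytes =", limit // 2 ** 30, "GiB")
    # Drive sizes mentioned in this thread, in decimal gigabytes.
    for label_gb in (40, 120, 250):
        print(label_gb, "GB needs 48-bit LBA:", needs_48bit_lba(label_gb * 10 ** 9))
```

By this arithmetic a 250GB drive has to be addressed with 48-bit LBA, while 120GB and 40GB drives fit under the older limit.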

 

Dave


It's a good observation, but the first thing I checked. The bridge is Oxford 911 from Wiebetech. It is ATA6 compliant and supports drives up to 300GB.

 

re: unmount; I used 'going offline' in a general sense. Don't really know if the drive is receiving an unmount or something else is going on.

 

If we could do 'backup sets' using firewire drives, then I'd get a few 120GBs and have done with this...


Quote:

I used 'going offline' in a general sense. Don't really know if the drive is receiving an unmount or something else is going on.

 


 

- What happens in the Finder? Does the disk remain on the Desktop (if your preferences allow it to)?

- What about in terminal? Does it continue to show up in /Volumes/ ?
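
The /Volumes check can also be scripted so it is easy to repeat before and after a hang. This is a minimal sketch, assuming the backup volume is mounted under /Volumes; the helper name and the volume name "Backup250" are hypothetical.

```python
import os


def volume_listed(name, volumes_root="/Volumes"):
    """Return True if `name` still has an entry under the mount root.

    This only reads the directory listing of the mount root; it does not
    touch the volume itself.
    """
    try:
        return name in os.listdir(volumes_root)
    except OSError:
        # The mount root itself could not be read.
        return False


if __name__ == "__main__":
    # "Backup250" is a made-up volume name; substitute the real one.
    print("still listed:", volume_listed("Backup250"))
```

An entry that is still listed only says the mount point exists; it does not say the drive behind it is responding.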

 

 

Dave


Right. Since the last time, I

 

1) reconfirmed the behavior; first remote works, second hangs.

 

2) replaced the 250GB with a 40GB Seagate. Did two local partitions and three remote clients successively with no problems. The disk then filled up and I stopped the drill.

 

3) reconnected the 250GB which happily backed up the first available remote, but then hung on the second one. Same as before.

 

 

 

One could conclude that this is a problem with a) big drives or b) WD drives.

 

 

 

Since I had the opportunity I also updated the OS to 10.2.8 (the new one!) in case there was some firewire 'stabilization' work that had been done. There was no difference.

 

 

 

I would like to hear what others are using in the way of large drives for capturing backups. Anyone else using >160GB drives?

 

 

 

Regarding the disk icon on the desktop; yes it stays visible, but any attempt to touch it results in Pizza.

 

 

 

Regarding the test to see what Terminal says when you ls -l the Volumes; I'll try and let you know. However a deeper log source would be more satisfying. Any ideas as to where I should look?
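
Since touching the hung drive stalls whatever touches it, one way to run that listing without tying up the Terminal session is to do it in a child process with a timeout. This is only a sketch under assumptions: the function names are made up, "/Volumes/Backup250" is a hypothetical path, and a child stuck in kernel-level disk I/O may not actually die when killed, so the code deliberately does not wait for it.

```python
import subprocess


def run_with_timeout(argv, timeout_s):
    """Run argv in a child process; return 'ok', 'error', or 'hung'."""
    proc = subprocess.Popen(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        code = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort only: do not wait for the child afterwards, because
        # a process blocked on an unresponsive drive may never exit.
        proc.kill()
        return "hung"
    return "ok" if code == 0 else "error"


def probe_volume(path, timeout_s=5.0):
    """List `path` with ls, giving up after timeout_s seconds."""
    return run_with_timeout(["ls", path], timeout_s)


if __name__ == "__main__":
    # Hypothetical mount point; substitute the real volume name.
    print(probe_volume("/Volumes/Backup250"))
```

Running this right after the 'clunk' would show whether the volume is gone ("error") or present but unresponsive ("hung").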

 

 

 

Thanks,


Hello,

I'm having problems along similar lines with my Lacie 230gb firewire disk.

The disk is shared across our network in OSX and clients back their data up to that manually, which I was then planning on archiving onto a tape drive.

However, when archiving onto the tape from the firewire drive, the verification stage hangs Retrospect. This can be force quit but either way the firewire disk icon on the desktop cannot then be accessed as it freezes the whole system and you have no option but to restart.

This leads me to the suspicion that the firewire disk is unmounted somewhere along the way by Retrospect or just loses the connection somehow because of a system bug.

I have tested backups from the internal hard drive on my computer and that works absolutely fine, so it is definitely a firewire disk issue.

Anyone got any ideas for a cure?

Cheers,

Phil


The drive is formatted HFS+

Great question regarding the size of the drive. I formatted the drive into 2 120GB partitions, backing up into one of the two partitions. The backups failed in the same way as the single 250GB partition.

 

Is there *any* evidence that Retrospect will successfully incrementally backup multiple remote clients to a >= 160GB external firewire drive?


Quote:

Is there *any* evidence that Retrospect will successfully incrementally backup multiple remote clients to a >= 160GB external firewire drive?

 


 

Yes.

 

I've backed up to a 200 gig Maxtor OneTouch brand external FW drive many times; with single and multiple partitions.

 

The evidence seems to point to your specific hardware.

 

But those LaCie cases sure are pretty...

 

 

Dave


Hi

 

Thanks for trying the smaller partitions. This should work no problem even at 250GB.

It is pretty safe to say that there is a problem with this drive or your firewire controller.

 

Can you try a backup to another firewire disk on this machine? How about trying your 250GB drive on another computer and running a backup there?

 

Nate


I've run tests on the disk and copied lots-and-lots of data onto the drive to exercise it. All worked fine.

Is it suspicious that the failure with Retrospect is always in the same place?

I've already done a backup to another firewire disk (40GB) on the same machine using the same setup. That worked fine.

Don't really have another computer to move the drive to other than a windows server or a linux server. Those options will be considered soon.


Hi

 

The problem is Retrospect pushes a TON of data to the disk at one time. Much more so than you would in a standard file copy operation. If there is a problem on the drive - Retrospect will be the one to find it.

 

Any chance you can exchange this drive or try another disk in the enclosure?

 

Nate


Right. I've discovered I can cause the 'reset' at will by using Retrospect's Tools -> Verify or Tools -> Repair. It *immediately* sends whatever message to the drive that causes it to behave as if it is waking from a sleep, however it does not follow through by starting to read/write data. Pizza mode. Only powering down the drive and starting it up again allows it to be recognized. Somebody please read the code.

 


Quote:

I've discovered I can cause the 'reset' at will by using Retrospect's Tools -> Verify or Tools -> Repair. It *immediately* sends whatever message to the drive that causes it to behave as if it is waking from a sleep, however it does not follow through by starting to read/write data. Pizza mode. Only powering down the drive and starting it up again allows it to be recognized.

 


 

Both "Tools->Verify" and "Tools->Repair" present the user with an additional dialog box. When you say "*immediately*" do you mean _immediately_, or does the problem appear after some additional user interaction? (In other words, a review of the code will be much less efficient if the engineers are looking in the wrong place.)

 

If you create a small File Backup Set (say a single defined volume with limited data) and use that as the catalog to Repair or Verify, does it happen each time?

 

Dave

 

 


To be precise, after all the dialogs are answered and the panel with the progress bar appears, the pizza shows up. However, in doing the test to confirm this a few minutes ago, the Tools->Verify process did start successfully. Sigh.

 

I have access to a Win2000 server that has an open position on the secondary IDE bus. I can move the drive there in minutes. Do my licenses work on the Windows product? Which Windows product corresponds to the Server product on the Mac side? Can I download the software?

 

Thanks.


The drive has been moved to the Intel server and formatted as a 128GB drive (the max recognized by the onboard controller.) An evaluation copy of Retrospect Multi Server for Windows has been loaded and 6 remote clients identified. It is on 6 of 7 now and operating just fine. Unless the format to 128GB has some bearing, the drive would seem to be OK. Any suggestions for next steps?


Quote:

Unless the format to 128GB has some bearing, the drive would seem to be OK

 


 

My personal Gut Feeling was not the drive, but the FireWire bridge used between the drive and the FireWire bus on the cube.

 

Any or all of the following tests would yield clues to where the problem lies:

 

- Use the existing FW bridge on the original Mac

- Use the existing FW bridge on a different computer (Mac or PC)

- Use a different FW bridge on the original Mac

- Use a different FW bridge on a different computer

- Put the drive on the IDE bus of the offending machine (impossible on the cube, but would be helpful if all this were happening on a pro enclosure Mac)
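
The bridge-and-computer swaps above amount to a small test matrix, and the results can be read off mechanically. The sketch below is only an illustration: the function name is made up and the sample results are invented, not outcomes reported in this thread. It flags any component that appears in a failed run but never in a successful one.

```python
def suspects(results):
    """Find components seen in failed runs but in no successful run.

    `results` maps (bridge, computer) -> True if the backup completed.
    Returns a list of (role, component) pairs.
    """
    failing = [combo for combo, ok in results.items() if not ok]
    passing = [combo for combo, ok in results.items() if ok]
    found = []
    for index, role in enumerate(("bridge", "computer")):
        failed_parts = {combo[index] for combo in failing}
        passed_parts = {combo[index] for combo in passing}
        for part in sorted(failed_parts - passed_parts):
            found.append((role, part))
    return found


if __name__ == "__main__":
    # Invented sample outcomes, for illustration only.
    sample = {
        ("existing bridge", "original Mac"): False,
        ("existing bridge", "other computer"): False,
        ("different bridge", "original Mac"): True,
        ("different bridge", "other computer"): True,
    }
    print(suspects(sample))
```

If every component shows up in at least one successful run, the list comes back empty and the fault is more likely an interaction between parts than any single one.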

 



Archived

This topic is now archived and is closed to further replies.
