dfoster

Write Failures to External Firewire Drives


For some reason I keep getting write failures to our external firewire drives when there is no user logged on to the box. If a user is logged on (either via the local console or even a disconnected terminal session), the backups work perfectly, but if the box is left without a user logged in, I get the write failures again, which corrupt the backup set.

 

I have checked the drive, updated the bridge chip firmware and swapped controller cards, to no avail. It seems to be something odd that Retrospect is doing.

 

While I can work around this by ensuring that there is always someone logged on, this is not ideal, as it means that the backups are corrupted if the box restarts itself (crash, power outage etc.) and no one gets to it before the scripts run.

 

Daniel.

 

------

Daniel Foster

Systems Administrator

Faculty of Arts, University of Western Australia.


Hi

 

What are the write errors you are seeing? Can you post any Retrospect logs or system event logs on the subject please?

 

Are the firewire drives optimized for quick removal or for performance? I would try setting them for Quick removal and see if the errors continue.

 

Thanks


 

Needing a user logged in now seems to have been a red herring - we are now getting write errors whether anyone is logged in or not (I guess we were just getting lucky in earlier testing...). As a result it now looks like a Windows issue rather than a Retrospect one. This has been confirmed through some large file copies between drives, which can be used to replicate the error. Now to try and get a response from MS - I may get lucky....

 

Mostly we are getting sbp2port errors stating that "The device, \Device\Sbp2\Oxford Semiconductor Ltd. , did not respond within the timeout period."

 

We also get the occasional FTdisk error "The system failed to flush data to the transaction log. Corruption may occur."

 

The result in Retrospect is that the execution unit hangs and, in rare cases, the FTdisk error results in the entire disk being hosed.

 

We have tried having the drives set for both performance and quick removal - neither setting affects the rate of errors. The drives are formatted with NTFS (they are 250GB drives), so Windows complained about having the quick removal setting on.
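
For anyone wanting to try the same kind of large-file copy test without Retrospect in the loop, here is a minimal sketch in Python. It is not the exact test that was run here; the function names, the 1 MiB chunk size and the number of passes are all assumptions made for illustration. It copies a file to the target drive several times, compares checksums, and collects any I/O errors instead of stopping at the first one.

```python
import hashlib
import os
import shutil

CHUNK = 1024 * 1024  # 1 MiB read size; an arbitrary choice for this sketch


def sha256_of(path):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def copy_and_verify(source, dest_dir, passes=3):
    """Copy source into dest_dir `passes` times; return a list of failure messages."""
    failures = []
    expected = sha256_of(source)
    for n in range(passes):
        target = os.path.join(dest_dir, "copytest_%d.bin" % n)
        try:
            shutil.copyfile(source, target)
            if sha256_of(target) != expected:
                failures.append("pass %d: checksum mismatch" % n)
        except OSError as err:
            # A failed write or read on the target is recorded, not raised.
            failures.append("pass %d: %s" % (n, err))
        finally:
            # Best-effort cleanup; the drive may be unreachable after an error.
            try:
                os.remove(target)
            except OSError:
                pass
    return failures
```

Calling something like `copy_and_verify(r"D:\big.bin", "F:\\")` (both paths hypothetical) returns an empty list when every pass succeeds. Note that the read-back may be served from the OS cache, so a clean checksum here does not prove the data on disk is good; the system event log is still the place to look for the sbp2port and FTdisk entries.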


Hi

 

FWIW it seems more like a hardware issue than a Windows issue to me. I would try a different Firewire controller to see if you have better luck. I would at least try moving the Firewire card to a different slot.

 

Thanks

Nate


Doesn't appear to be controller related - swapped controller cards, PCI slots and drive enclosure and still get said errors - very frustrating. Also tried firmware updates and restricting the size of requests to 128K (an MS suggestion), with no result.

 

Googling the error shows that it is actually a pretty common problem and that there is no single solution. In my case it looks related to bus utilisation, in that the error rate increases when more than one drive is being written to at once. Time to try a different server - maybe the PCI bus on the Sun is the source....
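
One way to check the bus utilisation theory is to write to one drive on its own and then to several at once, and compare how often errors turn up. The following is only a rough sketch in Python under my own assumptions (the function names, the scratch file name and the 1 MiB block size are all made up for illustration, and the block size here is unrelated to the 128K request limit mentioned above):

```python
import os
from concurrent.futures import ThreadPoolExecutor

BLOCK = b"\0" * (1024 * 1024)  # 1 MiB per write call; arbitrary for this sketch


def write_test_file(directory, total_mb):
    """Write total_mb MiB to a scratch file in directory; return an error string or None."""
    path = os.path.join(directory, "buswrite_test.bin")
    try:
        with open(path, "wb") as handle:
            for _ in range(total_mb):
                handle.write(BLOCK)
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to flush its buffers to the device
        return None
    except OSError as err:
        return str(err)
    finally:
        # Best-effort cleanup; ignore failures if the drive has dropped out.
        try:
            os.remove(path)
        except OSError:
            pass


def run_concurrent(directories, total_mb=1024):
    """Write to every directory at the same time; return {directory: error or None}."""
    workers = max(1, len(directories))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda d: write_test_file(d, total_mb), directories)
        return dict(zip(directories, results))
```

Running `run_concurrent(["F:\\"])` a few times and then `run_concurrent(["F:\\", "G:\\"])` (drive letters hypothetical) gives a crude comparison of single-drive versus multi-drive writes. A drive that times out may hang the write rather than return an error, so the event log needs watching alongside it.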

 

Daniel.
