Retro 6.1 + OSX 10.3.9 - crashes when communicating with Quantum/Certance LTO-3


Hi,

 

I know there's another open post that's similar to mine, but it doesn't answer my issue, so I thought I'd start a new one.

 

Am running Retrospect for Mac 6.1.126 (6.1.7.101) on an XServe with Panther 10.3.9. Attached to the XServe is a brand new Quantum/Certance Ultrium 3 LTO-3 tape drive, running over SCSI. In the XServe is an ATTO ExpressPCI UL3D dual-channel SCSI card. I've double-checked compatibility of all devices using EMC's online compatibility query, and this Quantum/Certance drive is listed there as thoroughly tested by you and supported.

 

The problem is - we've just purchased this new LTO-3 drive to replace an AIT-3 drive. It's connecting over SCSI, using the same internal ATTO card as the AIT-3 did. The new LTO-3 drive is being recognised in Retrospect, under Devices, and also in Apple System Profiler - there is no issue with the drive not being recognised, nor with the SCSI card - it's all there, and all being "spoken" to. When there is no tape in the drive, the Device Status in Retro correctly says it's Empty.

 

However, whenever I put a tape in the drive in the hope of initialising it and running a backup, the Activity light flashes for a bit on the drive, and then in Retro, in the Devices tab, it registers as being Busy, then Running ... and then it hangs indefinitely, with a Beach Ball of Death spinning. I left it for a couple of hours on Friday with the Beach Ball spinning, and the only externally noticeable effect was that the server was running slower than normal. Every now and again it would flick back to saying the drive was Empty, but then it goes back to saying it's Running again with the Beach Ball. When I eject the tape, it's fine - Retro goes back to registering the drive as being Empty. However, if I try to Force Quit, the whole server goes down and asks for a hard restart (one of those "nice" kernel panics which tell you to hold down the power button & restart).

 

I've tried forcing an Immediate Backup without initialising a tape first, and it goes through all of the motions just fine until it gets to the point where it starts to communicate with the tape drive - the Beach Ball reappears and I have to eject the tape and stop the execution. So everything's peachy until the server and Retro try to talk to the tape in the drive - they talk to the drive just fine - it's when they talk to the tape that it all goes skewy. And I've tried three different and brand new tapes.

 

I've spoken to the vendor of the drive (it's a Certance drive, but Quantum now owns Certance, so it's all confusing) and they actually told me after an hour of telephone trouble-shooting that the drive was not compatible with the software I'm running - that is, they hadn't tested it and couldn't support it. However, as mentioned, your support pages tell me it's a completely tested and compatible drive with both Retrospect and Macintosh.

 

The more I type, the more convinced I am it's a hardware issue, but I usually get great help from the Retro people (Nate, are you around? ;-p), so I thought I'd try here first. I saw in the other post on a similar issue that an ATTO UL4 card was recommended - would that be a good starting point for now? It's a brand, brand new drive attached to an older XServe ... should I just upgrade as much hardware as I can and cross my fingers?

 

Sorry for the long post - many thanks!

 

Claire.


  • 2 weeks later...

Hi all,

 

Apologies for the late reply.

 

Have sourced an ATTO UL4D card, and it's fixed my initial problem of Retrospect crashing (BBOD styles) when trying to access the tape in the drive.

 

Last night, I set the backup off for its first, full cycle, and all seemed to be going fine, but when I came in this morning, Retro had crashed, and the server had stalled with it. Our files sit on a RAID which hangs off the same SCSI card as the backup drive (UL4D - dual channel), and when I force quit Retro and double clicked on the RAID volume on the Desktop, the whole server crashed - which indicates to me it could be a SCSI card or SCSI channel issue.

 

When I back up a local volume, i.e. an internal drive on the XServe, it's fine - screamingly fast - 1500MB/min on average! So we know it's working, just not on SCSI channels.

 

I've checked the logs of both Retro and the XServe. Retro gives no indication anything happened; however, the XServe system log points to some sort of issue - the same lines are repeated over and over again:

 

Sep 20 19:06:53 localhost /Applications/Retrospect 6.1/Retrospect/Contents/MacOS/AuthenticateUser.app/Contents/MacOS/../../../Retrospect: LaunchApplication(øˇ• )

Sep 20 19:07:43 localhost kernel: jnl: flushing fs disk buffer returned 0x5

Sep 20 19:07:58 localhost /Applications/Retrospect 6.1/Retrospect/Contents/MacOS/AuthenticateUser.app/Contents/MacOS/../../../Retrospect: LaunchApplication(øˇå)

Sep 20 19:08:35 localhost /Applications/Retrospect 6.1/Retrospect/Contents/MacOS/AuthenticateUser.app/Contents/MacOS/../../../Retrospect: LaunchApplication(øˇçP)

Sep 20 19:08:43 localhost kernel: jnl: flushing fs disk buffer returned 0x5

Sep 20 19:09:13 localhost kernel: jnl: flushing fs disk buffer returned 0x5

 

...

...

...

 

Sep 20 21:21:02 localhost /Applications/Retrospect 6.1/Retrospect/Contents/MacOS/AuthenticateUser.app/Contents/MacOS/../../../Retrospect: LaunchApplication(øˇå)

Sep 20 23:31:36 localhost ntpd[249]: time set -0.953292 s

Sep 20 23:31:36 localhost ntpd[249]: synchronisation lost

 

There's not really a lot in there I can take from this log!
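
One way to get a little more out of a log like this is to collapse the repeats and decode the kernel's return value. The sketch below is a minimal Python example, not anything Retrospect or Apple ships: `summarize` and `decode_return_code` are hypothetical helper names, and it assumes the value after "returned" is a standard BSD errno, which the log itself does not say. Under that assumption, 0x5 maps to EIO (an I/O error), which would point at the journal failing to flush to a disk rather than at a fault inside Retrospect alone.

```python
import errno
import re
from collections import Counter

# "Sep 20 19:07:43 localhost kernel: jnl: ..." -> timestamp, host, rest of line
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+(.*)$")
RETURNED_RE = re.compile(r"returned (0x[0-9a-fA-F]+)$")


def summarize(lines):
    """Count repeated messages and record when each was first and last seen."""
    counts = Counter()
    first_seen = {}
    last_seen = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # blank lines, "..." elisions, anything unparseable
        stamp, _host, message = match.groups()
        # Blank out a trailing "(...)" argument so lines that differ only
        # in that argument are counted together.
        message = re.sub(r"\(.*\)$", "()", message)
        counts[message] += 1
        first_seen.setdefault(message, stamp)
        last_seen[message] = stamp
    return counts, first_seen, last_seen


def decode_return_code(message):
    """Map a trailing 'returned 0x..' value to an errno name.

    ASSUMPTION: the value is a BSD errno; the log does not confirm this.
    Returns None if there is no such value or the number is unknown.
    """
    match = RETURNED_RE.search(message)
    if not match:
        return None
    return errno.errorcode.get(int(match.group(1), 16))
```

Feeding the saved system log through `summarize` gives, per distinct message, how many times it appeared and the window in which it appeared, which is easier to compare against the backup start time than the raw repeats.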

 

I'm going to speak to ATTO too - just wondering if anyone's seen/experienced this before?

 

Thanks all!

 

Claire.


Quote:

Our files sit on a RAID which hangs off the same SCSI card as the backup drive (UL4D - dual channel)

...

just wondering if anyone's seen/experienced this before?

 


While it's clear that the RAID and backup drive are both on the UL4D, it's unclear from your post whether they are on the same channel. It's also a bit unclear just where the hang occurred.

 

Here's my experience, which is with a configuration that is similar to yours in some ways and different in some ways, too. Perhaps some common thread will develop that could help troubleshoot this problem ...

 

We have an Xserve G5 2.0 GHz single processor with an ATTO UL4D. We also have a Seagate Cheetah 10,000 RPM LVD SCSI drive attached to the UL4D, and that Seagate Cheetah is a SoftRAID RAID 1 mirror primary of our OS boot volume, with the mirror secondary being one LUN of a RAID 5 on the Xserve's Apple Hardware RAID card, controlling 3 x 250 GB ADMs in the Xserve G5.

 

Also attached to the ATTO UL4D is an Exabyte VXA-2 1x10 1u PacketLoader (LVD SCSI). When the Exabyte was on the same channel of the ATTO UL4D as the Seagate Cheetah, the system would sometimes (perhaps once a week) hang when Retrospect did its device scan after the first scripted backup source scanning had completed, just as it went to do the first write to tape. Happened on 10.4.2 Server, 10.4.3, 10.4.4, 10.4.5, all updates of Retrospect until the workaround was found, but the occurrence was more frequent with 10.4.4 and 10.4.5 server. Only cure was to move the Retrospect backup device (Exabyte) to its own channel of the ATTO UL4D, leaving the OS disks away from Retrospect's device scan.

 

The problem was hard to reproduce (perhaps once a week), and we tried many things. But it was cured immediately, and has not returned, once the Retrospect backup device was on its own channel of the UL4D. My suspicion is that some part of Retrospect's device scan caused some sort of SCSI bus reset on that channel of the UL4D, causing a disk transfer interrupt or seek interrupt to be dropped. The reset might happen at just the right (wrong) time to cause the dropped disk interrupt, so the hang really is because paging, etc., is blocked on I/O, and process swap can't happen.

 

Again, nothing else changed except to move the Retrospect backup device to its own channel of the UL4D.

 

If you are doing a "file" backup set to the Retrospect backup disk, seems doubtful that you have the same issue as me because Retrospect would be using FS syscalls, just like other writes to the files drive. But if you are treating the Retrospect backup disk as a device, it could be related. You don't provide much detail about the destination for the backup.

 

Just something for you to think about and try.

 

Russ


Hi again,

 

Thanks so much for your thorough response! I have to admit you're at a MUCH higher level than I am with all of this, and there was a bit of your e-mail I didn't understand ...

 

To start with, more on my configuration:

 

XServe G5 2.0GHz dual-processor with an ATTO UL4D (also has a VGA card in another PCI slot). Attached to the UL4D are the Quantum/Certance LTO-3 drive, and an Arena EX RAID, configured as RAID-5. XServe is running 10.3.9, and all software and device drivers have been updated.

 

-------------------------

 

Additionally, more on how Retrospect is configured:

 

Not sure what you mean about doing a "file" backup - I would assume that all backups are "file" backups? What we're doing in here is a grandfather/father/son rotation cycle, and we're backing up all files from certain volumes on a couple of PC servers we have in here, the entire contents of the RAID (about 250GB) and some files contained on the internal drives on the XServe. What else can I tell you? You say

Quote:

But if you are treating the Retrospect backup disk as a device, it could be related.

 


- the backup drive is seen in the Devices tab of Retrospect - does this mean it's being treated as a "device" by Retrospect?

 

 

-------------------------

 

Next up, with regards to channels, you say,

Quote:

it's unclear from your post whether they are on the same channel

 


... this is my own ignorance, as I'm unclear about "channels" on a dual-channel SCSI card. My assumption was that if there are two ports on a SCSI card, then these are the two channels, and if there's a different device plugged into each input port of the card, then they're on different channels. Does that make sense? Are you able to further explain the channels and how I'd find out if the devices are on separate channels? You've mentioned that two devices on the same channel caused problems, so it's definitely something I need to look into.

 

-------------------------

 

I think that's it for now! Sorry for my ignorance on some of this stuff - I'm a new tech kid - haven't dealt too much with SCSI!

 

Many thanks,

Claire.


Quote:

Not sure what you mean about doing a "file" backup - I would assume that all backups are "file" backups?

 


No, it's a term for whether the backup on the destination is stored as a big file on a filesystem, or whether Retrospect is using the raw device (as it would with a CD, etc.). See the manual on this.

Quote:

I'm unclear about "channels" on a dual-channel SCSI card. My assumption was that if there are two ports on a SCSI card, then these are the two channels, and if there's a different device plugged into each input port of the card, then they're on different channels.

 


Yep, you've got the right model.

Quote:

Are you able to further explain the channels and how I'd find out if the devices are on separate channels?

 


If you've got them onto the separate connectors on the UL4D, then they are on separate channels. You can see what is where by looking in the "About this Mac" / "More Info" (which runs System Profiler), Parallel SCSI. Here's a snippet of our report for our Xserve:

Quote:

PCI Cards:

  LSI,523:
    Name: LSILogic,raid
    Type: scsi-2
    Bus: PCI
    Slot: SLOT-2
    Vendor ID: 0x1000
    Device ID: 0x1960
    Subsystem Vendor ID: 0x1000
    Subsystem ID: 0x4523
    Revision ID: 0x0001

  ATTO ExpressPCI UL4D:
    Name: ATTO,ExpressPCIProUL4D
    Type: scsi-2
    Bus: PCI
    Slot: SLOT-3
    Vendor ID: 0x117c
    Device ID: 0x0030
    Subsystem Vendor ID: 0x117c
    Subsystem ID: 0x8013
    ROM Revision: 1.5.0
    Revision ID: 0x0008

  ATTO ExpressPCI UL4D:
    Name: ATTO,ExpressPCIProUL4D
    Type: scsi-2
    Bus: PCI
    Slot: SLOT-3
    Vendor ID: 0x117c
    Device ID: 0x0030
    Subsystem Vendor ID: 0x117c
    Subsystem ID: 0x8013
    ROM Revision: 1.5.0
    Revision ID: 0x0008

  bcom5704:
    Type: network
    Bus: PCI
    Slot: SLOT-4
    Vendor ID: 0x14e4
    Device ID: 0x1648
    Subsystem Vendor ID: 0x106b
    Subsystem ID: 0x005a
    Revision ID: 0x0003

  bcom5704:
    Type: network
    Bus: PCI
    Slot: SLOT-4
    Vendor ID: 0x14e4
    Device ID: 0x1648
    Subsystem Vendor ID: 0x106b
    Subsystem ID: 0x005a
    Revision ID: 0x0003

Parallel SCSI:

  SCSI Parallel Domain 0:
    Initiator Identifier: 7

    SCSI Target Device @ 16:
      Manufacturer: MEGARAID
      Model: Logical Drive 0
      Revision:
      SCSI Target Identifier: 16
      SCSI Device Features: Wide, Sync
      SCSI Initiator/Target Features: Wide, Sync
      Peripheral Device Type: 0

      SCSI Logical Unit @ 0:
        Capacity: 50 GB
        Manufacturer: MEGARAID
        Model: Logical Drive 0
        Revision:
        Removable Media: Yes
        Detachable Drive: No
        BSD Name: disk5
        OS9 Drivers: No
        SCSI Logical Unit Identifier: 0
        S.M.A.R.T. status: Not Supported

    SCSI Target Device @ 17:
      Manufacturer: MEGARAID
      Model: Logical Drive 1
      Revision:
      SCSI Target Identifier: 17
      SCSI Device Features: Wide, Sync
      SCSI Initiator/Target Features: Wide, Sync
      Peripheral Device Type: 0

      SCSI Logical Unit @ 0:
        Capacity: 417.42 GB
        Manufacturer: MEGARAID
        Model: Logical Drive 1
        Revision:
        Removable Media: Yes
        Detachable Drive: No
        BSD Name: disk7
        OS9 Drivers: No
        SCSI Logical Unit Identifier: 0
        S.M.A.R.T. status: Not Supported
        Volumes:
          disk7s3:
            Capacity: 417.29 GB
            Available: 400.8 GB
            Writable: Yes
            File System: Journaled HFS+

    SCSI Target Device @ 63:
      Manufacturer: MEGARAID
      Model: MEGARAIDDUMMYDEV
      Revision: 1.00
      SCSI Target Identifier: 63
      SCSI Device Features: Wide, Sync
      SCSI Initiator/Target Features: Wide, Sync
      Peripheral Device Type: 31

      SCSI Logical Unit @ 0:
        Manufacturer: MEGARAID
        Model: MEGARAIDDUMMYDEV
        Revision: 1.00
        SCSI Logical Unit Identifier: 0

  SCSI Parallel Domain 1:
    Initiator Identifier: 7

    SCSI Target Device @ 3:
      Manufacturer: SEAGATE
      Model: ST373405LW
      Revision: 0003
      SCSI Target Identifier: 3
      SCSI Device Features: Wide, Sync, DT
      SCSI Initiator/Target Features: Wide, Sync, DT
      Peripheral Device Type: 0

      SCSI Logical Unit @ 0:
        Capacity: 68.37 GB
        Manufacturer: SEAGATE
        Model: ST373405LW
        Revision: 0003
        Removable Media: Yes
        Detachable Drive: No
        BSD Name: disk3
        OS9 Drivers: No
        SCSI Logical Unit Identifier: 0
        S.M.A.R.T. status: Not Supported

  SCSI Parallel Domain 2:
    Initiator Identifier: 7

    SCSI Target Device @ 0:
      Manufacturer: EXABYTE
      Model: VXA 1x10 1U
      Revision: A10D
      SCSI Target Identifier: 0
      SCSI Device Features: Wide, Sync
      SCSI Initiator/Target Features: Wide, Sync
      Peripheral Device Type: 8

      SCSI Logical Unit @ 0:
        Manufacturer: EXABYTE
        Model: VXA 1x10 1U
        Revision: A10D
        SCSI Logical Unit Identifier: 0

    SCSI Target Device @ 1:
      Manufacturer: EXABYTE
      Model: VXA-2
      Revision: 210E
      SCSI Target Identifier: 1
      SCSI Device Features: Wide, Sync
      SCSI Initiator/Target Features: Wide, Sync
      Peripheral Device Type: 1

      SCSI Logical Unit @ 0:
        Manufacturer: EXABYTE
        Model: VXA-2
        Revision: 210E
        SCSI Logical Unit Identifier: 0

 

 


 

The "Parallel Domain" entries are the different channels. Parallel Domain 1 and 2 are on the ATTO UL4D. I assume that you've got the correct termination, etc. Have to get that stuff right, you know.
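
The same domain-to-device mapping can be pulled out of a saved report mechanically, which helps when the report is long. The following is a minimal Python sketch that works on System Profiler text pasted into a string. It assumes the section headings look exactly like the ones quoted in this thread ("SCSI Parallel Domain N:", "SCSI Target Device @ N:"), and `devices_by_domain` is a hypothetical helper name, not part of any Apple or ATTO tool.

```python
import re
from collections import defaultdict

DOMAIN_RE = re.compile(r"^SCSI Parallel Domain (\d+):$")
TARGET_RE = re.compile(r"^SCSI Target Device @ (\d+):$")


def devices_by_domain(report_text):
    """Return {domain number: [{'target', 'manufacturer', 'model'}, ...]}."""
    domains = defaultdict(list)
    domain = None   # domain (channel) currently being read
    current = None  # target device currently being filled in
    for raw in report_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = DOMAIN_RE.match(line)
        if match:
            domain = int(match.group(1))
            current = None
            continue
        match = TARGET_RE.match(line)
        if match and domain is not None:
            current = {"target": int(match.group(1))}
            domains[domain].append(current)
            continue
        if line.startswith("SCSI Logical Unit @"):
            # The logical unit section repeats Manufacturer/Model; skip it.
            current = None
            continue
        if current is not None and ": " in line:
            key, value = line.split(": ", 1)
            if key in ("Manufacturer", "Model"):
                current[key.lower()] = value
    return dict(domains)
```

Two devices share a channel exactly when they appear in the same domain's list, so a tape drive and a RAID showing up under different domain numbers are on separate channels.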

 

russ


Hi Russ,

 

Thanks for getting back to me.

 

Am still no nearer a solution, unfortunately. Have tried the backup drive again, and it inevitably stalls in the wee hours of the morning while leaving very little information in the logs on how or why!

 

With regards to whether our backup is a "file" backup, I'm sure that our backup system is writing to the tapes as if they were a CD, i.e. each individual file is written to the backup device as opposed to one big snapshot being taken and copied to the backup device. Am I on the right path? Backup 101!

 

I've tried writing a couple of scripts, to see whether it's the way the script was configured which is causing the stall. We back up the following data:

 

1. External RAID disk (250GB approx)

2. Internal partition (20GB approx)

3. Windows 2000 server contents over a network

 

The internal partition and network drives are not an issue - have backed up each of these separately with no problem.

 

However, when I try to back up the external RAID, this is when the stall occurs. Hence, my assumption it's a SCSI issue. The RAID is sliced into about 20 partitions/volumes, so I've tried the following script configs:

 

1. Backing up the RAID as one disk, all 250GB of it, in one hit;

2. Backing up the RAID, partition by partition, so that Retrospect keeps "refreshing" its contact with the RAID and the drive.
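
For a rough sense of how long a full pass over the RAID should take, the figures quoted in this thread can be plugged into a quick estimate. This is a back-of-envelope Python sketch only: it assumes the 1500MB/min seen on the internal volume also holds for the RAID (it may well not over SCSI) and takes 1GB as 1024MB. `estimated_minutes` is a hypothetical helper name.

```python
def estimated_minutes(size_gb, rate_mb_per_min, mb_per_gb=1024):
    """Minutes needed to copy size_gb at a steady rate_mb_per_min."""
    return size_gb * mb_per_gb / rate_mb_per_min


if __name__ == "__main__":
    # ASSUMPTION: the RAID streams at the internal-volume rate of 1500MB/min.
    minutes = estimated_minutes(250, 1500)
    print(f"{minutes:.0f} min (~{minutes / 60:.1f} h)")  # 171 min (~2.8 h)
```

Comparing that figure with the script start time and the timestamps of the last log lines before the stall may help narrow down how far into the RAID the backup got.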

 

As you've confirmed that my SCSI devices are indeed on separate channels, we know that can't be the cause of this issue.

 

Am going to post a form on ATTO's site with a link to this thread to see if they can shed any light. It's a most frustrating issue!

 

Many thanks,

Claire.


Quote:

With regards to whether our backup is a "file" backup, I'm sure that our backup system is writing to the tapes as if they were a CD, i.e. each individual file is written to the backup device as opposed to one big snapshot being taken and copied to the backup device. Am I on the right path? Backup 101!

 


 

Must be my confusion, here, not yours. At one point in your sequence of posts it appeared that you were going to try the backup as a file backup (to disk) rather than to tape. If you are only backing up to tape, then this is a non-issue. You are on the right track.

 

It appears that your initial post indicated you had an ATTO UL3D, correct? Your subsequent posts, and your system profiler info, say UL4D. So you've switched to UL4D and you are still getting hangs?

 

Russ
