GMRMacBackup Posted September 18, 2008 (edited)

Recently I updated our Intel Xserve to Leopard Server 10.5.4. Previous backup durations under Tiger were around 18 hours but occasionally exhibited the same issue. At the start of the backup Retrospect reports approx. 2900 MB/min, but the rate slowly degrades to 1300 MB/min. Once that happens, Retrospect starts occupying 60 to 95% of the processors. RAM never peaks; typically 40% is available at all times. The server is not hammered by user activity; typically fewer than 20 people connect to it.

Log snippet:

- 9/17/2008 8:37:13 AM: Copying Bank2…
9/18/2008 6:04:52 PM: Execution completed successfully.
Completed: 342180 files, 2.5 TB
Performance: 1300.2 MB/minute
Duration: 33:27:39 (00:13:38 idle/loading/preparing)

Details of the system are as follows...

Hardware:
Intel Xserve 2GHz Dual-Core (4GB RAM, an 8GB upgrade is awaiting installation)
Xserve RAID (1.51, formatted as 2 x 2.28TB RAID 5 volumes)
Apple 4Gb Fibre Channel Card (Firmware 1.3.14.0)
ATTO Celerity FC-41XS (3.25 driver installed, connected to tape drive)
Dell PowerVault 132T w/ Fibre Channel

Software involved:
Leopard Server (10.5.4)
Retrospect (6.1.230)
Retrospect 6.1 Driver Update (6.1.15.101)
SuperDuper! (2.5 v84, for drive synchronizing)
Dell PowerVault RMU (210F.00002)
Dell PowerVault Library (310D.GY004)

I am anxiously awaiting the Intel-optimized version; I have terabytes of archive files that would need to be transcribed if I were forced to change backup applications. Any assistance would be greatly appreciated.

Edited September 18, 2008 by Guest: More details available
GMRMacBackup Posted September 24, 2008

Just a status update: I've installed the 8GB RAM, and the server now has 10GB of RAM available. I have also updated the Leopard Server software to 10.5.5 with no noticeable change in operation. The next upgrade is an Xserve RAID card, in hopes it will improve throughput.
rhwalker Posted September 24, 2008

You might want to contact ATTO support to see if there is some tuning that can be done to the parameters on the Celerity FC HBA which, if I understand your configuration specifics, is what drives the tape drive.
CallMeDave Posted September 24, 2008

> Recently I updated our Intel Xserve to Leopard Server 10.5.4. Previous backup ... under Tiger ... occasionally exhibited the same issue

If what you are reporting also occurred under 10.4, then the upgrade to 10.5 is a red herring.

> ... backup durations ... were around 18 hours

The total time to back up is only relevant if you are talking about the same amount of data. Are you describing observations of new (or Recycle or New Media) backups of the same Source? Obviously incremental backups will take less time.

> At the start of the backup it indicates approx. 2900 MB/min but slowly
> degrades to 1300 MB/min

The tape drive will not be able to write any faster than the speed at which it is fed files. Do you see this slowdown on all Source volumes? Large and small?

> Xserve Raid (1.51, formatted as 2 x 2.28TB RAID 5 volumes
> Completed: 342180 files, 2.5 TB

So "Bank2" is a volume that is larger than either of the RAID volumes? Please describe what your Source for this backup is.

> Next upgrade is a XServe RAID card

How is the Xserve RAID currently providing RAID 5 volumes without such a card?

David
(who is reaching, not having such big iron of his own to play with)
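One way to check the point about the tape drive only writing as fast as it is fed is to time how quickly the Source can be read without Retrospect or the tape drive involved. The following is a minimal sketch, not anything from Retrospect itself: the function name `read_throughput_mb_per_min` and the 1 MB chunk size are illustrative assumptions, and files read recently may come from the operating system's cache and inflate the result.

```python
import os
import time


def read_throughput_mb_per_min(root, chunk_size=1024 * 1024):
    """Read every regular file under `root`.

    Returns (files_read, megabytes_read, megabytes_per_minute).
    The name and the chunk size are illustrative choices, not a standard tool.
    """
    total_bytes = 0
    file_count = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, sockets and other special files.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(chunk_size)
                        if not chunk:
                            break
                        total_bytes += len(chunk)
                file_count += 1
            except OSError:
                # Unreadable file: skip it rather than abort the whole run.
                continue
    # Guard against a zero elapsed time on an empty folder.
    elapsed_minutes = max(time.perf_counter() - start, 1e-9) / 60.0
    megabytes = total_bytes / (1024 * 1024)
    return file_count, megabytes, megabytes / elapsed_minutes
```

Calling it on a representative folder of the Source, for example `read_throughput_mb_per_min("/Volumes/Bank2")` (the mount path is a guess based on the volume name in the thread), gives a read rate to compare with the MB/minute figure in the Retrospect log. If the read rate alone also sags over a long run, the bottleneck is upstream of the tape drive.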
rhwalker Posted September 25, 2008

Dave, it's a little confusing.

The Xserve RAID is a storage box (no longer sold because it was way obsolete and never updated) that attaches to the Xserve via Fibre Channel.

The Xserve RAID "card" replaces the Xserve's internal controller card, and provides a RAID controller for the Xserve's internal drives. With the Xserve G5, it was done by a PCI card (an Apple-rebranded LSI Logic MegaRAID card) and recabling from the drives to the PCI card. With the Intel Xserve, it is done by a controller board swap.

By the way, the Xserve G5 Apple Hardware RAID card has a firmware bug that was never fixed, whereby the RAID card doesn't fully flush its write cache on graceful power down before disconnecting the drives from the bus. This was a very difficult bug for me to find and to develop a repeatable test case for. RADAR ID 4350243. The only known workaround is to disable the write cache on the RAID card, which greatly reduces performance. I don't know if the Intel card has the same bug. Be sure to test, GMR, before deployment into production.

It's a hard bug to find because it causes mystery garbage blocks from space at random places in the RAID 5 on only some drives. It causes bad data, but the RAID does not think that it is degraded. I'm not a happy camper that Apple never fixed this bug, which LSI Logic fixed in its firmware long ago (for its own card released to the PC market).

Russ
GMRMacBackup Posted September 25, 2008

Thanks for the responses, gentlemen. I'll go into the hardware configuration in a little more depth.

The Intel Xserve currently has 3 internal SATA drives: an 80GB system drive and two 750GB drives formatted as a RAID 1 software mirror. Connected via the 4Gb Apple Fibre Channel card is an Xserve RAID with 14 x 500GB drives, formatted as two 2.8TB RAID 5 volumes (Bank1 and Bank2) because each controller can access only 7 drives at a time. Another 750GB drive is on order, and an Apple Xserve RAID card is on hand for the Xserve itself to convert the internal volumes from a software RAID to a hardware RAID 5 architecture.

At 9 PM every weeknight an application called SuperDuper! performs a "Smart Update" between Bank1 and Bank2. This replicates changes made throughout the day on Bank1 to Bank2. The Smart Update typically takes about 20 minutes to replicate between volumes.

Here is an example of the Retrospect activity...

The tape libraries are duplicated to another volume for recovery in case the primary hard drive goes down. This typically takes about 2 minutes.

With OS X Tiger Server the backups typically took 18 hours; the only time a backup ran over 24 hours was if I accidentally left another application running, like Safari or one of the Server maintenance applications. Now that the backup routinely runs over 24 hours, I have to interrupt the backup cycle whenever I need to perform recoveries from archive tapes.

To answer CallMeDave's queries...

The total amount of data is typically the same; it varies from 2TB to 2.7TB over a monthly cycle. Bank1 and Bank2 are essentially identical volumes, replicated prior to backup operations. Bank1 is an online volume and Bank2 is an offline volume maintained specifically for backup operations.

The write speed of the tape drive has not changed during this time frame. If it was able to back up 2.7TB in 18 hours previously, I would expect it to handle it now.

The Xserve RAID has two hardware controllers for the 14 drives within its enclosure. The Xserve RAID card is an upgrade for the Intel Xserve itself, to provide hardware RAID connectivity to the 3 drives it can contain.

I suspect there are new interactions between the Leopard OS and Retrospect that are creating the extended backup times. Maybe Apple has changed the way Rosetta emulation works in Leopard. Whatever it is, it has gotten worse since the upgrade, but I cannot roll back to the previous version of the server software because the Xserve is being used as a System Update server for our connected clients.
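The 18-hour figure can be turned into a sustained rate and compared with the rate in the log. This is a rough sketch using only numbers quoted in the thread; it assumes binary units (1 TB = 1024 × 1024 MB), which may not be how Retrospect rounds its totals, and the function names are illustrative.

```python
# Assumption: binary units, 1 TB = 1024 * 1024 MB.
MB_PER_TB = 1024 * 1024


def implied_rate_mb_per_min(terabytes, hours):
    """Average rate needed to move `terabytes` in `hours`."""
    return terabytes * MB_PER_TB / (hours * 60)


def hours_needed(terabytes, rate_mb_per_min):
    """Hours required to move `terabytes` at a sustained rate."""
    return terabytes * MB_PER_TB / rate_mb_per_min / 60


# Figures quoted in the thread: 2.7 TB in about 18 hours under Tiger,
# and a logged 1300.2 MB/minute under Leopard.
tiger_rate = implied_rate_mb_per_min(2.7, 18)
leopard_hours = hours_needed(2.7, 1300.2)

print(f"Implied Tiger-era rate: {tiger_rate:.0f} MB/minute")
print(f"Hours for 2.7 TB at 1300.2 MB/minute: {leopard_hours:.1f}")
```

Under these assumptions an 18-hour run of 2.7 TB works out to roughly 2600 MB/minute, while the same volume at 1300.2 MB/minute needs about 36 hours, which is well past the 24-hour window described above.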
GMRMacBackup Posted September 25, 2008

Just for comparison's sake, here are a few snippets from the backup log, with the same configuration and backup scripts as listed in the initial post. These are full backups, not incremental.

+ Recycle backup using Weekend at 8/23/2008 9:16 AM
To backup set Weekend-E…
8/23/2008 9:16:39 AM: Recycle backup: The backup set was reset
- 8/23/2008 9:16:39 AM: Copying Bank2…
8/24/2008 2:16:17 AM: Execution completed successfully.
Completed: 371430 files, 2.7 TB
Performance: 2734.6 MB/minute
Duration: 16:59:38 (00:13:39 idle/loading/preparing)

+ Recycle backup using Weekday at 8/25/2008 10:00 PM
To backup set Weekday-A…
8/25/2008 10:00:12 PM: Recycle backup: The backup set was reset
- 8/25/2008 10:00:12 PM: Copying Bank2…
8/26/2008 3:10:59 PM: Execution completed successfully.
Completed: 372567 files, 2.7 TB
Performance: 2730.0 MB/minute
Duration: 17:10:47 (00:13:49 idle/loading/preparing)

+ Recycle backup using Weekday at 8/26/2008 10:00 PM
To backup set Weekday-A…
8/26/2008 10:00:19 PM: Recycle backup: The backup set was reset
- 8/26/2008 10:00:19 PM: Copying Bank2…
8/27/2008 3:19:14 PM: Execution completed successfully.
Completed: 372677 files, 2.7 TB
Performance: 2712.8 MB/minute
Duration: 17:18:55 (00:14:47 idle/loading/preparing)
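To track the slowdown over time, the Completed/Performance/Duration lines can be pulled out of the log text and compared run by run. The sketch below is an illustration only: the regular expression is based solely on the excerpts in this thread, and the helper names `parse_runs` and `to_minutes` are hypothetical, so other Retrospect log layouts may need a different pattern.

```python
import re

# The three August runs, copied from the log excerpts in this thread.
LOG_TEXT = """\
Completed: 371430 files, 2.7 TB
Performance: 2734.6 MB/minute
Duration: 16:59:38 (00:13:39 idle/loading/preparing)
Completed: 372567 files, 2.7 TB
Performance: 2730.0 MB/minute
Duration: 17:10:47 (00:13:49 idle/loading/preparing)
Completed: 372677 files, 2.7 TB
Performance: 2712.8 MB/minute
Duration: 17:18:55 (00:14:47 idle/loading/preparing)
"""

# Pattern inferred from the excerpts only; an assumption, not a documented format.
ENTRY = re.compile(
    r"Completed: (?P<files>\d+) files, (?P<size>[\d.]+) (?P<unit>[KMGT]B)\s+"
    r"Performance: (?P<rate>[\d.]+) MB/minute\s+"
    r"Duration: (?P<duration>\d+:\d{2}:\d{2})"
)


def to_minutes(hms):
    """Convert an H:MM:SS string to minutes."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 60 + minutes + seconds / 60


def parse_runs(text):
    """Return one dict per completed run found in `text`."""
    runs = []
    for match in ENTRY.finditer(text):
        runs.append({
            "files": int(match.group("files")),
            "rate": float(match.group("rate")),
            "minutes": to_minutes(match.group("duration")),
        })
    return runs


runs = parse_runs(LOG_TEXT)
august_average = sum(run["rate"] for run in runs) / len(runs)
september_rate = 1300.2  # from the 9/17/2008 run in the first post
ratio = september_rate / august_average

print(f"August average: {august_average:.1f} MB/minute over {len(runs)} runs")
print(f"September run is {ratio:.0%} of the August average")
```

On these figures the September run comes out at a little under half the August average, which matches the near doubling of the backup duration.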
CallMeDave Posted September 25, 2008

> Next upgrade is a XServe RAID card in hopes it will improve throughput

Throughput of what? If the Xserve RAID card is not going to be used to control any Source volumes then it's probably not going to make any difference in this issue.

> I suspect there are new interactions between the Leopard OS and Retrospect
> that are creating the extended backup times

Your original post specifically stated: "Previous backup ... under Tiger ... occasionally exhibited the same issue"

It's unlikely to be a Leopard-specific issue if you were seeing it, even if only intermittently, under Tiger.

> The tape libraries are duplicated to another volume

The correct Retrospect terminology is Catalog. Tape libraries are robotic tape handling hardware devices.

> Copying Bank2

Have you tried defining a subvolume on the Bank2 volume, and using that as a Source? It would be interesting to know if the hardware is consistently this speed, or only when it's trying to back up the entire volume.

> These are full backups, not incremental

While not a solution to why you're only getting 1300.2 MB/minute now instead of the 2730.0 MB/minute you were getting last month (and Russ' suggestion of contacting ATTO about the FC controller is still on the table), I have to wonder why you're _not_ performing Normal backups? All that extra wear on the tapes, all the extra backup time, etc. Are your files so large that incremental backups during the week would consume too much media?
GMRMacBackup Posted September 25, 2008

For CallMeDave:

Currently our Archive volume is a pair of software-mirrored 750GB drives; the system drive is 80GB and is not mirrored. Shortly after the Leopard update Retrospect locked up the GUI and we were forced to reboot, after which the OS (10.5.3 at the time) was corrupted and required re-installation. It was decided that the hardware RAID card would allow us to mirror the boot drive as well as provide additional storage. Offloading the RAID duties to a dedicated card, instead of forcing the computer to deal with data replication, was my basis for the 'throughput' comment.

I still suspect it is an issue with Leopard's Rosetta emulation. Our Tiger configuration exhibited the slow write problem once or twice a month, not enough of an issue to be concerned about. The current configuration has made the slow writes the rule instead of the exception. The examples of the high-speed transfers posted earlier were from immediately after a server restart on August 23rd, right after the rebuild of the server mentioned in the previous paragraph. Shortly after that, backups began to slow down, which prompted the initial post on September 18th.

Pardon me for my incorrect usage of the term 'libraries'; I'll endeavor to be more accurate in the future.

I will change the scripts to look at the folder currently containing the data as a subvolume to see if anything changes. As it is, the Bank2 volume contains one folder, which in turn is a 'Share Point' within Leopard Server. It will take just a few minutes to do and will not affect other backup operations.

As to your question about using Normal backups, maybe it is time to switch to incremental backups instead of relying solely on full recycle backups. Personally, I like the security of having a complete set of tapes every time the backup is run. Tape wear is not a real consideration, since 1 of the 2 weekday rotations gets pulled and replaced at the end of the month and is stored offsite as a monthly backup. At the end of the year one of the 6 week rotations is pulled and replaced and held indefinitely.

For rhwalker:

Thank you for your suggestion for contacting ATTO about configuring the card for the tape drive. I have filled out their support form and hope they can provide me a suggestion or two. I pulled the ATTO log and noticed an occasional SCSI error so I want to follow up on that.
rhwalker Posted September 25, 2008

> For rhwalker: Thank you for your suggestion for contacting ATTO about configuring the card for the tape drive. I have filled out their support form and hope they can provide me a suggestion or two. I pulled the ATTO log and noticed an occasional SCSI error so I want to follow up on that.

Why not just give them a call? I've always been able to get through to them, and the support people are very knowledgeable.

Russ