sixx

-102 (trouble communicating) Grooming Issue


We are currently running Retrospect MultiServer for Windows (version 7.6.11) on a Windows 2003 Server machine for our backups. We perform a nightly backup over our network to a NAS device at a different physical location for redundancy. We often receive error -102 (trouble communicating) during the backup process. Before the latest update, I could simply hit "OK" on the dialog box and the backup would resume and eventually complete. Although the -102 error is a constant annoyance, I learned to live with it.

However, after updating to version 7.6.xxx, when I click "OK", Retrospect attempts to groom the drive it's copying data to, eventually fails, and prompts me to create another folder on the backup set, which I do not want to do, so I have no choice but to cancel the job. The drive it's attempting to copy data to is a 1TB drive with plenty (over 800 GB) of free space, so why does it need to be groomed, and why does the grooming operation fail?

Like I said, I have "learned to live" with the constant -102 errors (this is the only application on my entire network that thinks there are communication/network errors), but if my daily off-site transfers can't continue because Retrospect tries and fails to groom the drive every time a job runs, then I'd gladly take recommendations for an alternative solution. Backup software should be "set and forget" software and should not need constant babysitting and monitoring. :angryred:


We are tracking a problem with NAS boxes that may be similar to what you are seeing. You can try to do a catalog rebuild to see if things work better. You can try turning off grooming to see if things get better. Worst case, you may want to downgrade to 7.5 until the problem is fixed.

 

Can you tell us exactly what type of NAS you are using? What file system is it? What brand of NAS?

 

Have you configured Retrospect to use a specific RBU account? If so, what groups is it a member of?


Hi Mayoff,

 

I hesitate to try a catalog rebuild because a rebuild takes well over a day (my initial full backup is over 100 GB), and performing a rebuild each day that the backup job fails is simply not feasible.

 

The NAS device I use is a LaCie BigDisk, 1TB in size, with SATA drives and an XFS file system, connected via Ethernet.

 

We log into the NAS device using an account set up on the NAS device itself, and we use that account within Retrospect by going to "Backup Sets", highlighting the drive, selecting "Properties...", selecting the "Members" tab, selecting "Properties", and then "Automatic Logon...".

 

I have not configured an RBU account because we do not use the Exchange or SQL backup agents. All we back up are the file shares on our servers and their system state. The account logged into the server on which Retrospect runs (as well as all other servers that are clients) is a member of the following domain groups within Active Directory:

 

Domain Administrators

Backup Operators

Enterprise Admins

Exchange Domain Servers

Exchange Enterprise Servers

Exchange Organization Administrators

Exchange Recovery

Group Policy Creator Owners

Schema Admins

 

 

Thanks for looking into this.


Does the grooming fail with an error or just report Zero KB of data groomed from the backup set? If you do get an actual error, then that could be a sign that a corrupt catalog file is part of the problem.

 

In the end, no version of Retrospect should be reporting a 102 error when writing to the NAS box. This clearly is a sign that you have been having ongoing troubles. Do you see the 102 error if you use a different computer (running version 7.5) to perform backups to the NAS?

 


Mayoff,

 

I do not get an error when it attempts to groom the drive; it states that zero KB was groomed. Here's a copy of the events from the operations log when the -102 error occurs:

 

-9/3/2008 3:25:26 PM: Transferring from BackupDrive
Additional error information for Disk Backup Set member "1-AdminDrive",
Can't write to file \\10.10.15.245\backup\Retrospect\AdminDrive\1-AdminDrive\AA000009.rdb, error -1116 (can't access network volume)
Trouble writing: "1-AdminDrive" (808173568), error -102 (trouble communicating)
9/3/2008 5:09:28 PM: Grooming Backup Set AdminDrive...
9/3/2008 5:09:33 PM: Groomed zero KB from Backup Set AdminDrive.
9/3/2008 5:10:07 PM: Execution stopped by operator
Remaining: 226912 files, 73.8 GB
Completed: 13363 files, 3.4 GB
Performance: 33.5 MB/minute
Duration: 01:44:41 (00:01:04 idle/loading/preparing)

 

As for the -102 errors, we copy our backups from one site to another for offsite backup. The remote site is connected via a point-to-point wireless connection that has around 12 Mbps throughput. We have moved Retrospect to a different server to eliminate the server as an issue, but the -102 errors still occurred. I believe that the wireless link we use may be causing the -102 errors because it periodically re-scans frequencies to ensure link quality. This takes less than a second, but all traffic going over the link stops during that split second.

As I mentioned before, I learned to live with the errors because I realize that my network is not your concern, although I do think that Retrospect is too sensitive to network "hiccups" (we run data and VOIP over the wireless link for over 60 people with no problems). It would be nice if Retrospect could learn to re-establish a network connection on its own and continue with the backup/transfer process instead of reporting the -102 error and stopping the job until someone hits "OK" in a dialog box.
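To make the requested behavior concrete, here is a minimal sketch of retrying a failed write with exponential backoff. It is purely illustrative and is not Retrospect's code, API, or a setting it exposes: `write_with_retry`, `write_chunk`, and the attempt count and delay values are hypothetical names and assumptions chosen for this example.

```python
import time


def write_with_retry(write_chunk, data, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call write_chunk(data); on OSError, wait and try again.

    write_chunk is a hypothetical callable that writes one block of
    data to the network volume and raises OSError when the volume is
    unreachable. The delay doubles after every failed attempt
    (0.5 s, 1 s, 2 s, ...) so a sub-second link outage is ridden out
    without flooding the link with immediate retries.
    """
    for attempt in range(1, attempts + 1):
        try:
            return write_chunk(data)
        except OSError:
            if attempt == attempts:
                # The outage outlasted every retry: report the error.
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

With these assumed defaults, a write that fails twice and then succeeds would wait 0.5 s and then 1 s before completing, and only an outage that outlasts all five attempts would surface as an error that needs an operator.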

 


I noticed in this thread that other users are experiencing similar issues with the -102 error due to momentary glitches in communication with the backup device. Is there any way to turn down the sensitivity or automate the recovery of the streaming data within Retrospect?


The -102 error usually indicates a network communication failure. You must have a stable network connection during backup or things will be flaky.


As I said earlier, "it would be nice if Retrospect could learn to re-establish a network connection on its own and continue on with the backup/transfer process instead of reporting the -102 error and stopping the job waiting for someone to hit 'OK' in a dialog box."

 

This issue has been brought up a few times on this forum without any resolution or any hint that Retrospect is able to simply resume a network connection after a communication "glitch."

 

We originally purchased Retrospect because of the prominence of the "Dantz" name, its claim to have as many features as other products, and its lower cost compared with competitors. Given the "support" and performance of the product thus far, I've come to realize the old saying is true: "You get what you pay for." Time to move on.


Fix the network first, then Retrospect will work.

 

Having Retrospect resend lost network packets just makes the problem worse by adding to the network load.


The network is not broken. We run VOIP and televideo across the network without any problems. I don't think backup software should be more latency sensitive than those applications. I'm already looking for an alternative product. If the software developers can't devise a solution within their product to handle this, then I have no other choice but to look elsewhere. I can't babysit my backup job every night. }:(

"The network is not broken. We run VOIP and televideo across the network without any problems."

But surely that doesn't put as much load on the network as Retrospect does, right?

Networks break under high load, such as when Retrospect runs grooming.

 

For instance, we had a network outlet in an office that seemed to work just fine: file sharing, database access, web browsing, printing, etc. with no problem. But when Retrospect ran, the network just couldn't take it and failed. Switching to another outlet (in the next office) fixed that problem.


No, they don't, but we also run Ghost to image machines (multicast). Ghost dumps an 8 GB image to our clients in about 18 minutes, but it doesn't "break" like Retrospect. Is that fast enough for you? Does that place just as much load as Retrospect?

 

Why can't anyone admit (and address) the problems with Retrospect instead of blaming the customers, their networks, their machines or their backup devices used in the backup process?

"No, they don't, but we also run Ghost to image machines (multicast). Ghost dumps an 8 GB image to our clients in about 18 minutes, but it doesn't 'break' like Retrospect. Is that fast enough for you? Does that place just as much load as Retrospect?"

Yes and yes. That probably rules out problems with network load.

"Why can't anyone admit (and address) the problems with Retrospect instead of blaming the customers, their networks, their machines or their backup devices used in the backup process?"

It isn't necessarily a problem in Retrospect. If it were, it wouldn't work here in this office either.

We use NAS servers, too, and have NO problems with grooming.

 

I'm sorry I can't help you more. There are so many possible causes of failure.
