
Suddenly Can't Get Scripts to Run



I checked in on Retrospect (8.2.0.399) to see how my backup scripts were rolling along. I haven't changed anything in a long while; things have been humming with the same config, and I just need to re-run a script occasionally when a client fails to connect (error -519, for instance). One script in particular was only making it part of the way through: a Copy script that would get about 20-30 GB into a 150 GB load, then fail:

 

 

+ Copy using ccProjects at 8/11/14 (Activity Thread 1)
8/11/14 5:18:21 PM: Connected to C|C Server
8/11/14 5:18:21 PM: Connected to iMac-MRR
To volume ccProjects1 on iMac-MRR...
- 8/11/14 5:18:21 PM: Copying Projects on C|C Server
> !Trouble reading files, error -519 ( network communication failed)
8/11/14 6:25:15 PM: Execution incomplete
Remaining: 54724 files, 126.6 GB
Completed: 39900 files, 28.9 GB
Performance: 478.9 MB/minute
Duration: 01:06:53 (00:05:08 idle/loading/preparing)

 

I tried running this script several times yesterday to see if I could get it to complete; then I left for the evening as it was scheduled to run that night. Again, it got underway and then lost network communication after 21 GB, 52 minutes in.

 

That's not the ultra-weird part.

 

The weird part is that today, I cannot get the script to run anymore! Every time I select the script and press the Run button, it depresses, but then nothing happens. I have tried making a duplicate of the script: same non-response. Changed the activity thread for the script: same non-response. Stopped and restarted the RS8 engine: no luck. Rebooted both my machine (console) and the RS8 server machine (engine): still nothing. Tried other scripts; they won't go either.

 

Oddly, though, I did a Restore using the Restore Assistant, and that DID work, leaving me with a new RA script (since that's how RS8 likes to work).

 

I have other scripts scheduled to go off tonight, so I will see whether those actually run on schedule. But I cannot manually get any script to run right now. The last script I was able to run manually was one I started this morning; it completed with some errors, which was to be expected since I was running it during the work day while the files it was backing up were in flux (the errors were mostly of the "file didn't compare" variety). That script did actually complete, to wit:

 

*snipped*

 

8/12/14 11:16:28 AM: 14 execution errors
Total performance: 1,199.4 MB/minute
Total duration: 01:55:21 (00:03:37 idle/loading/preparing)

 

But since 11:16:28 AM the only thing I have been able to do is Restore Assistant.

 

Now, when I tried to run a script and discovered the non-response, I went and looked at the log. I found a bunch of repeated lines like this:

 

*snipped*

TFile::Open: UCreateFile failed, /Library/Application Support/Retrospect/ConfigBackup, oserr 21, error -1011
TFile::Open: UCreateFile failed, /Library/Application Support/Retrospect/ConfigBackup, oserr 21, error -1011
TFile::Read: read failed, /Library/Application Support/Retrospect/ConfigBackup, oserr 21, error -1011
TFile::Read: read failed, /Library/Application Support/Retrospect/ConfigBackup, oserr 21, error -1011

*snipped*

 

... with the UCreateFile line repeating 10 times, then the Read line repeating 5 times, then UCreateFile another 10, Read another 5, and so on. I went over to the server machine and looked up the file in question; it was actually a folder, containing a very old restore of my backup of RS8's Config.Dat and Config.Bak files. All the files in that folder were date-stamped 2009; their equivalents in /Library/Application Support/Retrospect/ bear more current modification dates. I surmised that I don't really need that folder, so I renamed it and moved it to a different location. Stopping and restarting RS8 and checking the log confirmed that folder to be the one in question, as the UCreateFile and Read messages vanished once the folder was moved away. But it had no effect on the non-running scripts!

 

So I placed that folder back where I got it and renamed it to what it had been, and sure enough the log now shows the same error messages again. I have no idea whether this is related; my guess is that it is totally unrelated, and since I don't need the ancient 2009 restore it would be good to just get rid of it.
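
Incidentally, oserr 21 is the POSIX "Is a directory" error (EISDIR), which fits: judging from those TFile::Open calls, the engine is trying to open that path as a plain file and hitting the folder instead. A quick Python check on the engine Mac shows the same errno (this is just an illustration; the path is the one from the log):

import errno
import os

# Path straight from the log; on the engine Mac it is currently a folder, not a file.
path = "/Library/Application Support/Retrospect/ConfigBackup"

print(os.path.isdir(path))          # True while the old restore folder is in place

try:
    with open(path, "rb"):          # trying to open a directory as if it were a file...
        pass
except OSError as e:
    # ...fails with errno 21 (EISDIR), the same "oserr 21" shown in the log
    print(e.errno == errno.EISDIR, os.strerror(e.errno))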

 

System Information:

Retrospect 8.2.0.399

engine running on iMac 24" with OS 10.5.8 w/ 2.8 GHz Core 2 Duo, 4 GB DDR2 SDRAM

console running on iMac 27" with OS 10.6.8 w/ 2.8 GHz Core i5, 4 GB DDR3

wired Ethernet throughout, all 10/100 routers/hubs/switches; no wireless enabled

 

Anybody got any ideas?


Update:

 

The regularly scheduled Backup scripts did go as planned, except that one of them produced a "Client Reserved" error. Clue! As it happens, all of the Copy scripts that were failing to initiate were ones that use the same client. Basically, I just wasn't getting the "Client Reserved" feedback to know what was going on.

 

Sure enough, examining that client showed its status as In Use for the script that I had originally tried to run, about noon yesterday. By Command+Clicking the client "Off" to get to status "Not Running", then turning it back on, I was able to launch the Copy script this morning.

 

So now I may be back to just trying to figure out why it fails after copying 20 GB, which is a different issue.


-519 network communication errors can be hard to troubleshoot, as the problem can occur anywhere between the host and the client. That being said, the symptoms sound like there's a problem at the client.

 

Are there any issues you have noticed on the client machine at the time the network error is generated? Is it asleep, by any chance? Is anything hung? Can it still access the network? What is the status of the Retrospect client app?


I just ran the same troublesome script twice more today. This script attempts to Copy from one of our shared server volumes, the one that gets the most activity throughout the workday, as most of our project files are on it. Thus, any "real" network/communication errors would be felt by the rest of us, as they would interrupt our active work. That said, maybe we wouldn't notice brief disruptions, since we mostly talk to the server when opening, saving, and closing files; we don't often need a continuous streaming connection the way the RS8 client does when executing.
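
To catch any brief drops we wouldn't notice by hand, I could run something like this rough Python sketch on the engine Mac while the Copy script executes, then compare the timestamps of any failed probes against the -519 entries in the log. The host name here is made up, and I'm assuming the Retrospect client's standard TCP port (497) is a reasonable thing to test against:

import socket
import time
from datetime import datetime

CLIENT_HOST = "ccserver.local"   # made-up name; substitute the client's IP address or hostname
CLIENT_PORT = 497                # the Retrospect client's standard TCP port (assumption)
INTERVAL = 5                     # seconds between probes

while True:
    stamp = datetime.now().strftime("%H:%M:%S")
    try:
        # open and immediately close a TCP connection; any failure gets logged
        with socket.create_connection((CLIENT_HOST, CLIENT_PORT), timeout=3):
            pass
    except OSError as e:
        print(stamp, "probe failed:", e)
    time.sleep(INTERVAL)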

 

The first attempt today got as far as 35 GB before quitting with error -519. The second attempt had my hopes up, as it made it to 99 GB. I will run it again when I leave tonight; perhaps with everybody gone it might do better, with all the bandwidth to itself. This script normally runs at night anyway; I've just been trying to run it manually while I am here, to get back to where I have confidence in the results.

 

As before, each time it hits that error -519, it leaves the client's state as "In use by..." and therefore reports "Client Reserved" when I test the client by IP address in the console. The Command+Click "Off" cycle gets it back to "Ready". As for any other conditions on the client machine, I don't know (the client is actually a Mac Mini running OS X 10.8 Server, and the subject volumes are on an external hard drive array hosted by that computer).

 

I am hopeful that I can keep pushing forward, as I've now successfully copied 99 of the 150 GB. I don't think it will actually copy all of that data again; that's just how much is on the volume, and the script compares against what has already been copied. So the next time I run it, presumably no data will need to be copied for that first two-thirds (aside from minor file changes made during the work day).

 

I can say that I have seen error -519s on other clients that often seem to clear up when I wake their computer, even though I have set each machine's preferences to prevent sleep (other than display sleep). We have a mix of iMacs of various vintages. I generally just check RS8's activity to see which scripts (if any) failed with a -519, go wake that machine if necessary, and re-run the script. But it sure would be nice to get a better understanding of what's causing all these -519s!


No luck letting it run overnight on its regular schedule. It did launch, but then failed with a -519 after only 2 GB. Another script tried to run on the same client afterward, and of course it failed with the -505 client reserved message. I am going to shift focus to a different script and see whether I get -519s with any script that uses this client, or just this particular one.


  • 3 weeks later...

I have tried sidelining the Copy scripts that have been failing, thinking that perhaps they are not as efficient at dealing with large amounts of source data. To test that theory, I replaced them with regular Backup scripts. Unfortunately, the new Backup scripts exhibit the same symptoms: they get a few gigabytes into execution, then fail with a -519 and leave the client in the "reserved" state.

 

I have been doing these tests during the work day, and I have not needed to reboot the server computer that I am trying to back up; these -519 network errors do not show up as any other form of network interruption. We continue to use the files and folders on the shared volumes without issue.

 

I used Carbon Copy Cloner to replicate the files on the affected volumes onto an off-site disk, so I'm not feeling too vulnerable at the moment, but if this cannot be resolved then I may have to genuinely abandon Retrospect.


  • 3 weeks later...

Follow up:

 

The Backup scripts have been running well; it seems they were only failing (error -519) while trying to do the initial backup of large quantities of data. Now that they only grab a few megabytes at a time, I am not seeing issues.

 

It seems that throughput speed is a factor: I was able to get one huge initial backup to go through by letting it take its time at 50 MB per minute. It took over two full days to run, but I just let it go and it completed perfectly. Other scripts that have run successfully have achieved about ten times that speed, and the failures tend to occur on scripts that attempt to run at 1 GB per minute or more. I'm not sure whether there is a way to set a speed limit so that scripts don't try to run too fast; if there is, then I think I have a workaround.
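
For a rough sanity check on those numbers (the 150 GB size and the rates are approximations from the runs above), a quick back-of-the-envelope in Python:

size_gb = 150                            # approximate size of the initial backup
for rate_mb_per_min in (50, 500, 1000):  # slow run, typical run, failure-prone speed
    minutes = size_gb * 1024 / rate_mb_per_min
    print(rate_mb_per_min, "MB/min ->", round(minutes / 60, 1), "hours")

At 50 MB per minute that works out to roughly 51 hours, which matches the two-plus days the slow run actually took; at 1 GB per minute the same load would finish in under 3 hours, if it didn't fail first.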

 

The other workaround has been to just keep re-running the scripts, as they eventually whittle the amount of data to be backed up down to a manageable level.

