Workaround for -530 error inspired by the Skull Island tradition


5 replies to this topic

#1 David Hertzberg

David Hertzberg

    Occasional Forum Poster

  • Members
  • 410 posts
  • Location: New York, NY

Posted 08 February 2017 - 10:34 AM

The tl;dr summary is that I have been able to use the No Files selector (now known as a Rule) to provide a rather kludgey solution to my new -530 problem.  I have now tested the solution for a couple of nights, so I consider it ready for posting on the “Retrospect 9 or higher for Macintosh” forum (I let Mayoff know in an e-mail yesterday, since he had belatedly offered No Files advice):

 
Monday night 30 January I had to make an emergency replacement of the 8-port Ethernet switch in my study with another model of a different brand.  If you’re interested, https://arstechnica....57787#p32757787 describes my setup in the second paragraph.  As the third paragraph says, I immediately started getting -530 errors on my nightly “Sun.-Fri. Backup” No Media Action script when I booted my “backup server” after the 3 a.m. scheduled time for  the script—a problem that had mysteriously vanished last June (see the thread http://forums.retros...port/?hl="-530" for my past -530 history).
 
A poster on the Ars Technica thread commented “Sounds like retrospect is dumb and trying to run before the network is 100% functional. See if you can just put a pause in the beginning of the script or tell it to do a ping to the target computer before starting the backup attempt. I've seen similar problems. The switch just switches but from one bit of hardware to the next, the time for the link to become active and start passing traffic might be a few seconds different and that might be enough for poorly made software to error out.”
 
The comment reinforced suspicions I had had last spring, but I replied that I don’t know how to put a pause or a ping at the beginning of a Retrospect script.  I then remembered selectors (now known as Rules), and wondered if there is one that would enable me to run a sacrificial script before the real “Sun.-Fri. Backup”.  My idea was that, even if the sacrificial script didn’t make the Retrospect Engine “straighten up and fly right”, I would have time while it ran to click the Locate button in Sources, an action which has always eliminated any further -530 error during that Engine execution.
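
For readers wondering what the "ping before starting" idea from the Ars Technica comment might look like, here is a minimal sketch in Python. It is not a Retrospect feature and it runs outside Retrospect; it only polls until a TCP connection to the client succeeds. The function name, the host name in the usage note, and the timeout values are all hypothetical, and port 497 is assumed to be the port the Retrospect Client listens on.

```python
import socket
import time


def wait_for_client(host, port=497, timeout_s=120.0, interval_s=5.0,
                    connect=socket.create_connection, sleep=time.sleep,
                    clock=time.monotonic):
    """Poll until a TCP connection to host:port succeeds.

    Returns True as soon as one connection attempt succeeds, or False once
    timeout_s seconds have passed without success. Port 497 is an assumption
    about where the Retrospect Client listens. The connect, sleep and clock
    parameters exist only so the loop can be exercised without a network.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            conn = connect((host, port), timeout=interval_s)
            conn.close()
            return True
        except OSError:
            # Client not reachable yet: give up if we are out of time,
            # otherwise wait a little and try again.
            if clock() >= deadline:
                return False
            sleep(interval_s)
```

Something like `wait_for_client("my-mbp.local")` could be run on the "backup server" before any backup is attempted. Whether a successful connection here would also cure the -530 error is untested; the sketch only illustrates the "wait until the network is up" idea.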
 
It turns out that a “NoOp Sun.-Fri. Backup” script, using the No Files selector (now known as a Rule) and scheduled for 3 a.m., makes my “Sun.-Fri. Backup” script run without error when scheduled at 3:10 a.m.  When I boot my “backup server” before or just after 3 a.m., both scripts run fine.  When I boot my “backup server” well after 3 a.m., the sacrificial script bombs with a -530 error but my “Sun.-Fri. Backup” script immediately runs fine.
 
At some point I used the Forum's search capabilities to find hgv's post http://forums.retros...nd/#entry262378.  It suggests creating a launchd script to do a 60-second delayed stop/restart of the retroengine and retroisa daemons at startup.  However, I don't know how to create a launchd script.
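
As a rough illustration of what hgv's suggestion could amount to, the following Python sketch builds a launchd job definition that sleeps 60 seconds at load and then stops and restarts the two daemons. It is not hgv's actual script. The job label and the two daemon labels are assumptions (the real labels can be checked with `launchctl list`), and the resulting plist would presumably have to be installed as a system daemon in /Library/LaunchDaemons to run at startup.

```python
import plistlib


def build_delayed_restart_job(job_label, daemon_labels, delay_s=60):
    """Build a launchd job dict that waits delay_s seconds when loaded,
    then stops and starts each daemon label via launchctl."""
    steps = [f"sleep {int(delay_s)}"]
    for label in daemon_labels:
        steps.append(f"launchctl stop {label}")
        steps.append(f"launchctl start {label}")
    return {
        "Label": job_label,
        # One shell command line: sleep, then stop/start each daemon in turn.
        "ProgramArguments": ["/bin/sh", "-c", "; ".join(steps)],
        "RunAtLoad": True,
    }


if __name__ == "__main__":
    job = build_delayed_restart_job(
        "local.retrospect.delayed-restart",  # hypothetical job label
        # Assumed daemon labels; verify them on your own machine.
        ["com.retrospect.retroengine", "com.retrospect.retroisa"],
    )
    # Print the XML property list that would be saved as a .plist file.
    print(plistlib.dumps(job).decode("utf-8"))
```

The design choice here is to keep the whole delay-and-restart sequence in a single `/bin/sh -c` command, so the job needs no separate helper script on disk.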
 
I’ve also posted a version of this as a Support Case, since this solution shows an obvious bug in the Retrospect Engine.
 
BTW, the reason all links in this post up to this paragraph are not concealed in Link constructs is that I wanted to include them in the Support Case, where Link constructs are not allowed.
 
The "Skull Island tradition" in the thread title refers to the plot of the 1933 version of the movie "King Kong".  In it the Skull Islanders have developed over the generations an effective way of keeping the giant ape on the other side of the ancient wall; they occasionally give the giant ape a young woman as a sacrifice (it is the white outsider Carl Denham who interferes with this tradition and causes a catastrophe for the islanders).  Hence the idea for my "sacrificial script" (insert appropriate smiley here).
 
P. S.: The Support Case is now posted as of early this morning.  The Retrospect Support Team has already asked for a copy of my operations log, which I've given them.
 
P.P.S.: Note that for both the just-after-3-a.m. and the well-after-3-a.m. runs the Search Timeout in the Advanced dialog in the Console's Network Preference pane had been increased from 10 seconds to 75 seconds. It didn't make any difference in whether the "sacrificial" script ran or not.
 
P.P.P.S.: Fixed last sentence of fourth paragraph; the Locate button is of course in the Sources category of the Console.
 
P.P.P.P.S.: A "selector" is now a Rule in Retrospect Mac.


#2 David Hertzberg

Posted 17 February 2017 - 11:16 AM

It's apparently not just a bug in the Retrospect Mac 12.5 Engine; it may also be a bug in the Retrospect Mac 12.0.2(116) Client.

 

Wednesday night 15 February I had a bright idea; I would experimentally demonstrate that it was my new 8-port switch that was causing the -530 problem by temporarily eliminating it from my LAN. So before going to bed I cabled my MacBook Pro directly to the MoCA adapter in the study, entirely bypassing the switch (Internet, shminternet, who needs it if all you're going to do is a Retrospect backup?).  I then put the MBP to sleep.  Around 4:00 a.m. Thursday I awakened, woke up the MBP, and booted the Mac Pro "backup server" in the bedroom.  To my surprise, the sacrificial script bombed with a -530 error but my “Sun.-Fri. Backup” script immediately ran fine.  Thursday night I again cabled my MacBook Pro directly to the MoCA adapter in the study, but this time left both the MBP and the Mac Pro awake.  When I awakened around 4:30 a.m. today (Friday), I found that both scripts had run fine starting at the scheduled 3:00 a.m.  These two experiments proved that having my new 8-port switch on the Retrospect-applicable LAN is no longer necessary to activate the problem.

 

IMHO what this implies is that changing the 8-port switch in my study two weeks ago Monday activated a bug in the Retrospect Mac 12.0.2(116) Client (which I use on my MBP because it's what I had reverted to when the -530 problem mysteriously disappeared last June), which in turn activated a bug in the Retrospect Mac 12.5 Engine.  In fact it may have been my old 8-port switch going bad that activated a bug in the Client.



#3 David Hertzberg

Posted 20 February 2017 - 12:38 PM

Retrospect Support wrote back about my Support Case on Friday 17 February: “Engineering would like you to try out a test build which contains some substantial changes to the network code. While you are testing things out, please try not to change too many variables, otherwise the test results may be less impactful. What is most important however, is that your environment is still configured such that the client backups require the sacrificial No Files script in order to succeed. 

 

The changes in this build were not targeted at your problem in particular, but since they have made a large amount of changes since v12.5, they would like to see if the issue you are currently experiencing hasn’t been fixed by proxy.”

 

However, per my last post, I now think we're talking about two bugs related to the -530 error: 

 

(1) The Retrospect Engine software, when it is booted as a Startup Item and must immediately execute a scheduled script, gives a -530 error if the first client the script says to back up is "difficult to establish connection with".  However, once that first script (in this case the "sacrificial script" using the No Files selector, now known as a Rule) has failed with a -530 error, a second scheduled script executing immediately after the first script has no problem backing up that same client that is/was "difficult to establish connection with".

 

(2) The Retrospect Intel Mac Client software can, either because of a change in LAN hardware or something to do with other software on a client machine, "go bad" and make a client machine become and stay "difficult to establish connection with".  The LAN hardware change case is what I have encountered starting 2.5 weeks ago; the other-software-on-the-client case is what I encountered last spring, and what other Retrospect administrators have encountered.

 

IMHO the only way to see if bug (1) has been fixed in the test build is to leave everything the same for my LAN hardware and for my MacBook Pro client.  Therefore I proposed to at first only install the test build server software, and to leave the Retrospect Client software on my MBP at its present 12.0.2(116) installed version.  I did, however, disable (I will later re-enable) the No Files "sacrificial scripts”, since disabling/re-enabling them only involves the Mac Pro "backup server".  Later, after I have done what we agree is a satisfactory amount of testing for bug (1), I can also install the test build client software—which I assume will require the customary Remove Client-Uninstall&Reinstall Client-Add Client dance—and test for bug (2).

 

I also proposed to postpone the installation and use of the test build until Sunday February 19.  That is because my "Sat. Backup" scheduled script does a Recycle Media Set backup of all 6 of my drives, and takes 11 hours.  If "Sat. Backup" got a -530 error on my MBP and skipped to the next machine on Saturday February 18, I’d have had to stop using the MBP for 5.75 hours in the late morning and early afternoon while an emergency No Media Action run backed it up.

 

I added the 5 paragraphs immediately above to my Support Case after lunch Friday (NYC time), and asked that they let me know that day (California time) if they had any objection.  I got no reply, so I downloaded and installed the test build (a late version of Retrospect Mac 13.5) only on my Mac Pro "backup server" Saturday evening after my “Sat. Backup” script had finished, and I disabled the No Files "sacrificial scripts”.

 

I booted my MBP at 3:15 a.m. Sunday, and booted my Mac Pro “backup server” at 3:25 a.m.—15 minutes after my “Sun.-Fri. Backup” script was scheduled to run.  “Sun.-Fri. Backup” bombed with a -530 error.  When I manually re-submitted it, it ran fine.

 

Shortly after midnight Monday morning, before going to bed, I cabled my MBP directly to the Actiontec MoCA adapter in the study—bypassing the NETgear GS608v4 switch in its LAN connection to the "backup server".  I again booted my MBP at 6:25 a.m. Monday, and booted my Mac Pro “backup server” at 6:40 a.m.—3.5 hours after my “Sun.-Fri. Backup” script was scheduled to run.  Again “Sun.-Fri. Backup” bombed with a -530 error.  Again, when I manually re-submitted it, it ran fine.

 

P.S.: Tuesday morning I booted my Mac Pro "backup server" substantially after 3:00 a.m., with results similar to Sunday morning.  Wednesday morning (today) I booted my Mac Pro at 2:47 a.m., with my MBP previously awake and in use for about half an hour before I quit all apps at 2:47; both the "sacrificial script" and the “Sun.-Fri. Backup” script ran OK.  IMHO I've exercised the Server test build enough to prove that -530 bug (1) has not been fixed.  Before I go back to bed I intend to add a stiff note to my Support Case.  That note will spell out the cause and cure of bug (1) by quoting hgv and re-quoting an expert on the Ars Technica Networking Matrix forum; it will warn that my installing the test build of the Intel Mac Client will likely eliminate the -530 error in accessing my MBP entirely (if Retrospect Engineering's "substantial changes to the network code" are worth a damn); and it will offer Engineering a last chance to request further tests on bug (1) before I do that.

 

P.P.S.: In the paragraphs describing bugs (1) and (2), the fourth and fifth paragraphs in this post, changed "difficult to communicate with" to "difficult to establish connection with".  This more-accurate terminology was suggested by Retrospect Support, some days after I submitted my Support Case.

 

P.P.P.S.: A "selector" is now a Rule  in Retrospect Mac.



#4 David Hertzberg

Posted 01 March 2017 - 01:02 AM

After running the test build since February 20th, I have concluded that—despite the "substantial changes to the network code"—nothing in the Retrospect test build fixes either bug (1) or bug (2) as described in the fourth and fifth paragraphs of post #3 above.  However I have conceived and implemented a hardware change that has eliminated my occurrence of -530 bug (2), and has therefore eliminated for me any occurrence of -530 bug (1).  I am therefore posting the results to inform others.

 

What I call -530 Bug (2) is a "difficult to establish connection with" condition relating to my MacBook Pro that is not one of the conditions listed in the Knowledge Base article. The conditions listed in that KB article would make my MBP "impossible to establish connection with", and that would be true whether or not the Engine runs a script on or significantly after its scheduled time and whether or not the Engine has just been started.

 

For hardware background: My Early 2011 MacBook Pro in the study and my 2010 Mac Pro "backup server" in the bedroom have gigabit Ethernet. The Actiontec ECB2500C MoCA adapters on both ends of my inter-room RG-59 cable deliver speeds up to 270Mbps. However, prior to 31 January, the D-Link switches cabled between both MoCA adapters and the respective computers were 100Base-T switches. Replacing my old D-Link switch in the study with a gigabit switch on 31 January created a potential speed mismatch on my LAN, which I have been assured Ethernet standards should resolve. The standards have in fact resolved it, as proven by the ability of Retrospect to back up my MBP since 31 January and prior to this morning once it has achieved a "can establish connection with" state.

 

I installed the test build Client software before my "Sun.-Fri. Backup" run on 23 February and did the Remove Client->Uninstall&Reinstall Client->Add Client dance.  I had on a previous day installed the test build Engine software, about which more in the next post.  Having had the MBP booted for 15 minutes early on 23 February, I booted my Mac Pro "backup server" at 3:30 a.m. NYC time—0:30 hour after the two "Sun.-Fri. Backup" scripts were scheduled to run in succession. The No Files "sacrificial script" bombed with a -530 error, and the normal script ran without error.

 

On 24 February I cabled my MacBook Pro directly to the MoCA adapter in the study, entirely bypassing my new 8-port switch. My MBP had already been awake and in use for about 15 minutes. I then booted my Mac Pro "backup server", and the "sacrificial script" immediately started to run at 3:36 a.m., 36 minutes after its scheduled time. Unexpectedly the "sacrificial script" ran OK, as did the "Sun.-Fri. Backup" script that ran immediately after the "sacrificial script" was finished.

 

It then occurred to me that, given the unexpected result of the "Sun.-Fri. Backup" run on 24 February, it might be worthwhile to make a scheduled run with my new 8-port gigabit switch connected to the MoCA adapter in the study, but with my old 5-port 100Base-T switch not connected to the MoCA adapter in the bedroom.  However, in a pair of scheduled runs that started at 5:36 a.m. on 25 February, about 2.5 hours after their scheduled time, the "sacrificial script" bombed with a -530 error but the real "Sun.-Fri. Backup" script ran OK.

 

Therefore, later on 25 February, I put in a mail order for a new gigabit 5-port switch.  The new switch arrived late in the evening of 27 February, and I used it  to replace the 100Base-T switch in my bedroom.  Thereupon both the "sacrificial script" and my regular "Sun.-Fri. Backup" script ran OK on 28 February starting at 5:12 a.m., 2.2 hours after they were scheduled to start. I did not do the software drop-and-add-client dance.

 

Replacing the switch in my bedroom before the 28 February run eliminated the switch speed mismatch.  Theoretically there is still a mismatch between the maximum speed of both my switches and the maximum speed of both my MoCA adapters.  Evidently that speed mismatch doesn't cause a Retrospect -530 bug (2), probably because the adapters are network bridges (I really know very little about networking).

 

So, if your LAN is generating -530 errors on backup script runs, consider the possibility of a hardware speed mismatch.  However, remember that there can be software "speed mismatch" conditions that also manifest what I call -530 bug (2); I had at least one of those from February to June 2016—see this thread (especially post #11 and following)—until it mysteriously vanished.

 

P.S.: The same run, per paragraph 7, on the morning of 1 March got a -530 error on the "sacrificial script"; doing the software drop-and-add-client dance made it run OK on the morning of 2 March.



#5 David Hertzberg

Posted 01 March 2017 - 05:03 AM

Let's move on to the consequences of my recent results on what I call -530 bug (1).  As I said in the first sentence of post #4 above, these results for me have been the same from 31 January through 27 February—because the test build provided by Retrospect Support has done nothing to fix -530 bug (1).  However, as I said in the second sentence of post #4, as of 28 February I am no longer getting any -530 error because I have fixed the underlying -530 bug (2) condition.

 

Remember that, in order for you to get a -530 error that is not caused by one of the conditions listed in the Knowledge Base article, three things must occur: First, you must have a hardware or software condition on one or more clients that creates a "difficult to establish connection with" condition as far as the Engine is concerned.  Second, the Engine must access that client by running a script immediately after the Engine is booted—an appreciable time (I'm not sure what that time is, but IME it's at least 10 minutes) after the script was scheduled to run.  Third, that script must be the first delayed-past-its-scheduled-start script to use that client as a Source.
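
The three conditions above can be written down as a small predicate, which may make the logic easier to check against your own operations log. This is only a model of the behaviour as I have described it, not anything taken from Retrospect itself; the class and field names are hypothetical, and the 10-minute threshold is the rough "in my experience" figure from the paragraph above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ScriptRun:
    scheduled: datetime                      # when the script was scheduled to start
    engine_booted: datetime                  # when the "backup server" Engine came up
    client_hard_to_reach: bool               # condition 1: "difficult to establish connection with"
    first_delayed_script_for_client: bool    # condition 3: first late script using that client


def expects_530(run, min_delay=timedelta(minutes=10)):
    """Return True when all three conditions hold.

    min_delay is an assumed threshold for condition 2 (Engine booted an
    appreciable time after the scheduled start); the true value is unknown.
    """
    delayed = run.engine_booted - run.scheduled >= min_delay
    return (run.client_hard_to_reach
            and delayed
            and run.first_delayed_script_for_client)
```

For example, a run scheduled for 3:00 a.m. with the Engine booted at 3:30 a.m. satisfies the second condition, while one with the Engine booted at 2:47 a.m. does not, which matches the runs reported earlier in this thread.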

 

This IME means that you can avoid getting a -530 error that matters for that client by either making sure the "backup server" machine is booted at least 10 minutes before any scripts are scheduled to run, or by making sure the first script run is a "sacrificial script" for the affected client.  What I have termed a "sacrificial script" is one that uses the No Files selector (now known as a Rule), so that there is no practical effect if the "sacrificial script" runs OK instead of getting a -530 error.  The "sacrificial script" will run OK if the "backup server" is booted at least 5 minutes before the "sacrificial script" is scheduled to run, or it will get a -530 error within a couple of minutes if the "backup server" is booted an appreciable time after that script's scheduled start time.

 

So if your scheduled Backup scripts are getting -530 errors for one or more clients, and you can't find a hardware or software condition (including the ones listed in the Knowledge Base article) that you can fix, I suggest you try scheduling a "sacrificial script" to run 10 minutes before the real Backup script.  You can create a "sacrificial script" by simply making a copy of the real Backup script (I suggest prefixing the copied real name with "NoOp" instead of suffixing it with "Copy"), changing its Rules to No Files, and either moving its Schedule(s) 10 minutes earlier or moving the real script's Schedule(s) 10 minutes later.

 

If you try this, please report back in this thread whether it works.

 

P.S.: A "selector" is now a Rule in Retrospect Mac.



#6 David Hertzberg

Posted 08 March 2017 - 05:58 PM

My further testing, at the request of Retrospect Support, indicates that the "sacrificial script" need not be "sacrificial"—meaning it need not get a -530 error—if it doesn't access a client.  

 

I created a test script that backed up from scratch a 42MB Favorite Folder (Subvolume in Retrospect Windows terms) from the _non-boot_ drive on my Mac Pro "backup server" to an extra Media Set (Backup Set in Retrospect Windows terms), and scheduled it before my regular "Sun.-Fri. Backup" script for my MacBook Pro client.  I had previously reinstalled my old 100Base-T Ethernet switch in place of my brand-new gigabit switch on my LAN.  When I awakened my MacBook Pro and then booted my Mac Pro two hours after the scheduled start time for the first of the two scripts, they both ran OK.

 

The night before, I had run only my regular "Sun.-Fri. Backup" script for my MacBook Pro client starting 1.5 hours after its scheduled time.  The old 100Base-T Ethernet switch was in place on my LAN, and the script bombed with a -530 error.

 

Once the Retrospect Engineers have recovered from the wild party they undoubtedly had upon shipping v14/v12, I think Retrospect Support should get them started on fixing -530 Bug (1) in the Engine as I have reported it.  The fix will be "just giving the Engine enough time to get things set up properly".





