Maser

Another 10.5/6.1.138 issue -- backup server stopped backing up


That's a good question....

 

But it's now the next morning.

 

I have left 6.1.126 running all night (after rebooting after installing 6.1.126 and confirming it worked to back up.)

 

When I'm looking at the backup server script now, my machines where the status should be showing "Source" and "ASAP" are now showing "Media" and "ASAP".

 

 

 

Only *one* machine backed up during this overnight period (and one machine -- from the log -- gave me an Error -53 "volume off-line" message and that client's state is at "retry" now) -- I wonder if that still has anything to do with it?

 

This might go with my supposition that if the *program* can't see a client correctly for some reason (i.e., when I change the backup date in the script to ASAP for an *off-line* client) -- that the program doesn't like that (under 10.5) and the Retrospect application eventually goes into a bad state. This is what seems to happen when I get the prompts that the external backup drives "aren't there" -- which I've not seen again because I've stopped changing the backup dates to ASAP manually unless I *know* the computer is on-line.

 

I think I'm going to try downloading the *combo* 10.5.1 and see if that makes any difference. But it appears not to be a question of 6.1.126 vs 6.1.138 at this point...

 

And I don't think that leaving the backup server "untouched" means it will continue to function without an eventual need for a reboot -- unless all the stars are lining up correctly.

 

 

So, if I were to suggest a possibility for testing, you *might* be able to get this to reproduce on your end under 10.5.1 if you were to:

 

set up a whole bunch of machines in a backup server script and:

 

1) put some of the clients off-line and force the backup date to ASAP (or 1/1/01) and get the prompting that it's not reaching the client.

 

2) Somehow, put a client in an "error -53" mode and let the backup server script keep trying to hit that client as usual when it does the walkthrough for the clients. I have no idea how you get to the error -53 mode, though...

 

This one might be the key point -- I can't ever recall seeing any of my OSX clients generate an "error -53". I *have* seen that often on my two servers (showing the problem) that back up my PC clients...

 

Maybe that's it? If you can tell me how to force a client into "error -53" mode, I can try to duplicate that here...

 

 

 

 

bleah.


Right after Retrospect is reporting media, are you able to access the backup set properties from configure>Backup sets?


FWIW -- as expected -- a reboot from the reinstall of 6.1.138 and the 10.5.1 combo updater started backing up clients again.

 

I also purged everything (again) from my Library/Preferences/Retrospect folder except the "config" file prior to reinstalling 6.1.138.

 

Now, the question becomes -- does installing the combo updater make any difference? I'll probably know by tomorrow morning as the majority of my clients are set to start backing up after 9:00 p.m. tonight...


Quote:

This might go with my supposition that if the *program* can't see a client correctly for some reason

 


 

 

I don't see why that's your primary supposition, when Mayoff specifically noted that "Media" means that Retrospect can't see either your Catalog file or your Media member.

 

And I really don't understand why, when you had the machine in its unstable state, you didn't take a moment to poke at the mounted volume (that is being treated by Retrospect as a Removable Device) to confirm that it is behaving properly outside of Retrospect. It would be a fast and easy way to begin to confirm that the problem isn't with the Media (which is, after all, what Retrospect is reporting).
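For what it's worth, that check can be scripted as well as done by hand in the Finder. What follows is only a minimal sketch, assuming a Python 3 interpreter is available on the backup server. The function name `volume_is_writable` and the mount path in the usage note are placeholders made up for illustration, not anything Retrospect provides. It writes a small scratch file to the volume, reads it back, and removes it.

```python
import os
import tempfile


def volume_is_writable(mount_path, payload=b"retrospect-media-check"):
    """Return True if a scratch file can be written, read back and removed.

    mount_path is a hypothetical placeholder for wherever the external
    backup drive is mounted (for example somewhere under /Volumes).
    """
    if not os.path.isdir(mount_path):
        return False
    try:
        # Create a uniquely named scratch file directly on the volume.
        fd, scratch = tempfile.mkstemp(prefix="mediacheck-", dir=mount_path)
    except OSError:
        return False
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # push the data out to the device
        with open(scratch, "rb") as handle:
            return handle.read() == payload
    except OSError:
        return False
    finally:
        # Never leave the scratch file behind next to the backup set.
        try:
            os.remove(scratch)
        except OSError:
            pass
```

Something like `volume_is_writable("/Volumes/YourBackupDisk")` (path is an example only) run while the script is showing "Media" would tell you whether the operating system can still use the disk. A True result says nothing about what Retrospect itself sees; it just helps rule the drive in or out.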

 

 

Dave


I did that this morning when it wasn't working -- when Retrospect is showing "Media", I can double-click my external drive and copy files to it (beyond the one storage set file that's on it...), etc... So I confirmed the drive itself isn't the problem here.

 

The "Media" toggling in the backup script is *not* related to how the Computer itself is seeing/maintaining the connection to the external backup drives. That's completely functional as far as I can tell. Doing this while the script is showing "Media" does nothing to make things seen again.

 

 

Mayoff asked: Right after Retrospect is reporting media, are you able to access the backup set properties from configure>Backup sets?

 

 

I will check that if/when this happens again.


36 hours after applying the combo 10.5.1 update, things are still backing up (knock wood)

 

Of course, it means the majority of my clients on my twice-weekly backup won't try again for another 3.5 days, but I'll see where things stand on Monday morning.

 

If that was it, I'll be happy and P.O.ed at the same time...


Well, crap...

 

shouldn't have knocked wood...

 

One of my systems is now showing "Media" in the Status column.

 

I can -- at this point -- open the external disk and copy things in the Finder to it. There's nothing to suggest that the external hard disk is having any issues (there's lots of free space on it...)

 

 

Mayoff asked: Right after Retrospect is reporting media, are you able to access the backup set properties from configure>Backup sets?

 

If I go to *stop* the backup server script, I get the dialog box about "Really stop..." -- but it's missing the big yellow "!" triangle! Problem right there!

 

I say "yes".

 

I go to Configure -> backup sets and select my backup set. I get a SPOD -- and then the program quits.

 

 

I relaunch Retrospect. I can go to Configure -> backup sets and select my backup set and get prompted to enter the key (as it's encrypted.)

 

So -- for whatever reason -- the program is definitely barfing.

 

If I look in my log, there was a client with an "error -53" in it during the last 36 hours. This client looks like it must have rebooted after that as it successfully backed up at 9:30 a.m. (after generating 4 error -53 notes starting at 2:15 a.m. when it first attempted to back up.)

 

My *other* computer that has shown similar problems is still chugging along. My 3rd system that backs up my Mac clients is also working just fine.

 

 

Now what?


I am backing up to external hard disks: Backup Set Type: Removable Disk

 

Running in backup server mode 24x7.

 

Two servers showing the problem are those that back up my Windows (and a few Linux) clients. Not the Mac client server.

 

I've been doing this successfully (backing up to "removable disk" with catalog compression and DES encryption on) for many, many months prior to introducing 10.5.x. It's seeming like I may need to revert to 10.4.11 at this point unless you can come up with a solution...


Quote:

If I go to *stop* the backup server script, I get the dialog box about "Really stop..." -- but it's missing the big yellow "!" triangle! Problem right there!

 


 

The default setting for the Run Control preference is to not "Confirm before stopping executions" so you've changed this to ask for the confirmation dialog box (perfectly reasonable thing to want, but it must have annoyed enough people for them to have finally provided an option to do without it).

 

At this point I'm sure you're tired of testing, and just want to have a working system. But since the "Yes" button is what initiated your crash, I'd deselect this preference and see if you can stop the Backup Server script without getting the problematic confirmation dialog.

 

 

Dave


More on this (as I checked remotely over the weekend):

 

So, the backup server is in "media" status again -- I was able to quit the server script again (yellow triangle missing from the confirmation dialog box again).

 

I tried to add a new *Mac* client (which adds fine on a working machine) -- I get an "error -3271 (unknown)" dialog box after I enter the hostname. SOMETHING TO NOTE: If I just select "type" --> Mac OS X -- it scans for clients (it should find over a hundred), but the "Select a Backup client" window says: "TCP/IP is inactive. Quit Retrospect and verify your TCP/IP settings".

 

If I look at the "backup client database" window -- it's showing me the clients but *without* the icon for each client (just a list of client names/groups).

 

 

If I attempt to configure a client that I know is on-line, I get the same "error -3271" box.

 

If I bring up Activity Monitor (Retrospect is still running), it says that it's only using 88.58M of "Real Memory" and 773.28MB of "Virtual Memory". CPU (with the backup server script not running) is at 0.6. Retrospect is the 2nd highest user of "Real Memory" behind "kernel_task" (97M) and above "WindowServer" (28M).

 

 

I *can* ping (through Terminal) the client I was trying to add with Retrospect -- so it's not a question that the Operating system can't see the client -- just the Retrospect program can not see the client.
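Since ping only exercises ICMP, a successful ping doesn't by itself show that the client's TCP port can be reached. As a rough sketch of a slightly stronger check from outside Retrospect (assuming a Python 3 interpreter is available; the function names are made up for illustration, and port 497 is an assumption based on the port usually documented for Retrospect client traffic, so change it if your setup differs):

```python
import socket


def can_open_tcp(host, port=497, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    port=497 is an assumption: it is the port commonly documented for
    Retrospect client traffic. Change it if your setup differs.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False


def check_clients(hosts, port=497, timeout=5.0):
    """Map each hostname to True/False depending on TCP reachability."""
    return {host: can_open_tcp(host, port, timeout) for host in hosts}
```

If a check like this succeeds for a client while Retrospect is still throwing "error -3271" for the same hostname, that would suggest the problem sits inside the application rather than in the operating system's networking.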

 

 

My backup server systems are set to receive static DHCP addresses (always have been...) The backup servers *are on-line* while I'm doing this -- I can connect through the network with Safari, etc...

 

 

I quit Retrospect -- and relaunch it (no reboot). At this point the client icons come back in the "backup client database" window.

 

Going to add a client now says: "no backup clients found" (instead of the message about TCP/IP problems). But I *can* now add my client by hostname (which I added to a new "daily" backup script).

 

If I relaunch the backup server, this newly added client backs up (along with the other clients that should have been backed up but weren't because the backup server "stopped"). And I can stop the backup server script and the yellow triangle is back when I do that -- note -- this is without restarting the computer.

 

 

What more can I do/provide here?

 

 

 

 

The only additional thing I can try (now) is to change this to a manual IP address. And then try a clean install of 10.5


I wonder if you have corrupt config files? With all this strange behavior, it could be possible.

 

I will check on my over-the-weekend test tomorrow.

 

Robin


I would think that -- if it wasn't happening to *two* machines simultaneously...

 

 

Changing the TCP/IP settings to a manual IP address made no difference. I restarted the server around noon on Sunday. By 8:30 a.m. Monday, it was showing "Media" again.

 

Two machines backed up during that time period (but they also both "deferred" themselves.)


I had a backup server script with multiple sources (multiple platforms) copying data to an encrypted removable disk backup set with catalog compression turned on. Backup frequency was once every 3 hours. Running 10.5.1

 

The backup ran all weekend. I did not get a Media message and backups appeared to all run successfully.


Were you able to set them up so that the clients would "defer" or generate the "error -53" message in the log file?

 

Is this on a 1G RAM computer?

 

DES encryption?

 

external drive connected to the firewire port?

 

Mac mini as the server machine?


Quote:

I would think that -- if it wasn't happening to *two* machines simultaneously...

 


 

- What are some of the things that the two machines have in common?

 

- Are things that the two affected machines have in common also shared by the machine(s) that work without issue?

 

- Is the external hard drive/device the same make/model?


Yes, I used DES encryption. Yes, I only have 1 GB of RAM. I don't know if I generated a -53, but I did have a computer reporting "source" midway into the test.


The two machines showing this problem have in common (vs. the one machine *not* showing this):

 

They back up only Windows XP and Vista clients (and one machine backs up 4 Linux clients). The one that has not shown this is only backing up Mac clients. All three machines back up the vast majority of clients twice a week. The Mac client machine does back up some clients daily. All three machines back up the internal hard disk once a week.

 

All three machines are compressing the catalog files and doing DES encryption on the data (similarity)

 

All three of my systems are otherwise identical -- exact same models of stock 1G RAM Mac Mini -- and have the exact same external hard drive enclosures/hard disk models attached via Firewire (OWC Mercury Elite Pro enclosures with 750G hard disks.)

 

The machines backing up the Windows boxes have different "selector" scripts than the Mac (difference)

 

*hardware and software* -- the 3 backup servers are identical (with the exception of the one "bad" machine I set to a manual IP address instead of a static DHCP address -- perhaps thinking that the IP renewal was messing up retrospect somehow -- but it's not.)

 

No other apps are running beyond Retrospect. All have the current RDU file. All (now) have the combo 10.5.1 update applied.

 

 

If all 3 were failing, I'd go back to 10.4.11 (which I might end up doing on one of the machines...)

 

 

 

The only other thing I can think that *might* help duplicate this -- I did *NOT* make a new backup set after updating to 10.5.1 -- I'm using (on all 3 machines) the same backup set that was created around 11/1 (under 10.4.10) and I did my Leopard upgrade around 11/7 or so...

 

I was going to do this (make a new set), but I wanted to see if it worked -- it appeared to work fine, so I didn't think a lot of it...


I am now in the process of "clean installing" 10.5.0 (and then installing the combo 10.5.1 and *only* Retro 6.1.138, no RDU) on one of my two bad machines (keeping my same catalog files and Retrospect preference files).

 

I'll report on this if it makes any difference.


FWIW -- I was at an Apple seminar on NetRestore today and talked to another admin who I know uses Retrospect...

 

He's seeing the same "Media" problem I am -- He's got a 1.25G RAM G4 Tower (so slightly more RAM).

 

He backs up Mac and Windows clients. Also backs up to external hard disk.

 

Unlike me, he started a *new* backup set when *upgrading* to Leopard (from 10.4.10) about 2 weeks ago

 

Like me, he's compressing his catalogs.

 

Unlike me, he's *not* encrypting the sets.

 

Unlike me, he's got verification turned on (I have it off -- bad practice, I know, but I'm about speed here...)

 

Backup server status showed "Media" for him, which he noticed for the first time today. He recalls the last time he restarted the server was one week ago, but he quits the *backup server* (not the program) to look at the log more frequently, whereas I just let it run and look at the log files sporadically (like if a computer is stuck in "retry" mode...)

 

So, it's not just me having this problem, as his setup has (some) significant differences from mine.

 

(This admin said he's going to throw another 512M RAM into his machine to see if this makes any difference...)

 

- Steve


FWIW -- the machine I did a clean reformat install of 10.5.1 yesterday has not (as of this morning) shown "media".

 

The other machine that shows the problem *is* currently showing "Media" toggling in status (and missing "yellow triangle" when I stopped the script running -- just like the other one) so I'm going to do a clean install of that machine.

 

My 3rd machine? Still chugging along and working with no problems at all.

 

I almost wish all 3 were showing the problem....


Quote:

the machine I did a clean reformat install of 10.5.1 yesterday has not (as of this morning) shown "media".

 


 

- Did you install Retrospect's File System Plugin on any/all/none of the machines?

 

 

 

Dave


Well, crap...

 

Now 48 hours later -- the first machine I clean reloaded with 10.5.1 -- is now toggling "media" in the backup server script. So it's nothing to do with an "upgrade" install. Crap! On my clean install I made sure Time Machine is "off", all energy saver settings are "off", the only sharing on is for ARD and SSH. No firewall on.

 

 

CallMeDave asked:

 

- Did you install Retrospect's File System Plugin on any/all/none of the machines?

 

Since this is the first I've heard of this, the answer would be "no". All that's on my clean install machine is 10.5.1 (and all software updates) and Retrospect 6.1.138 (not even an additional RDU) at this point.

 

 

ARGH. And now -- for the *first time* -- my 3rd server -- which had never shown this problem and is still a machine upgraded from 10.4.10 to 10.5.1 with the current RDU -- is now showing "media". And no yellow triangle when I stop the backup server script. I'm 3-for-3 now in having failures.

 

CRAP.

 

 

I can not believe that all 3 of my config files on 3 separate machines would be corrupt (nor would I think my colleague must have a corrupt config file as well.)

 

Backup server (here at least) is broken in 10.5.x when using external hard disks as backup destinations with the clients I'm backing up and the computers I'm using. I hate to have to revert to 10.4.11, but it's seeming more-and-more like I'm going to have to do this...

 

Mayoff -- can I somehow send you one of my Retroconfig files so you can try using that (perhaps adding your clients to that file?)

 

The only other stat of note -- all of the files on my external drives are over 1/2 the size of the drive (meaning on the 750G drive, the smallest backup file is 380G at this point. On the machine that *just now* started showing media, that backup file is almost 600G). But, again, under 10.4.10 with my previous backup sets, everything grew that large before starting a new set. And each system has in the neighborhood of 350-500 "sessions" at this point.

 

So whether or not this affects your ability to reproduce things, I don't know... Maybe keep running your system and adding more sessions/data to what you are backing up and you'll eventually hit the problem?


Having an issue like this happen on 3 computers really does sound strange, and I can't imagine the config would be corrupted on 3 computers.

 

When Retrospect "crashes" while trying to open the backup set, does an assert log get created? If so, then we can get you a debug version of Retrospect to capture the assert error in detail.

 

I believe you are using Removable Disk backup sets. Have you tried this with a File Backup Set, since the failure always happens with "media"?

 

Maybe Retrospect is running low on memory? Catalog compression is really slow on the Mac and takes a lot of resources. Have you tried to turn off catalog compression?

