
Linux client keeps deferring backup.



Hello,

 

One of my Linux clients keeps giving me a 'user deferred' error and won't back up. The great thing is, no user is deferring it. It's a server, and what's worse, it's our file server, so it desperately needs to be backed up. All the other Linux clients we have here are working great. So now I need to know why this client is deferring backup, so that I can fix it.

 

I haven't seen anything useful in the logs, but I'll include some in hopes that somebody else can use them.

 

Thanks

- miah

 

<some logs>

1056524229: connLogin: successful

1056524229: connTCPConnection: conn = 135205656, code = 109, tid = 0, count = 34

1056524229: connServiceSet: newID = 4, name = <pw>, ver = 34 curID = 0

1056524229: connTCPConnection: conn = 135205656, code = 113, tid = 0, count = 4

1056524229: connTCPConnection: conn = 135205656, code = 105, tid = 0, count = 0

1056524229: connDoLogout: logged out, sent

1056524229: connTCPConnection: conn = 135205656, code = 104, tid = 0, count = 36

1056524229: connLogin: successful

1056524229: connTCPConnection: conn = 135205656, code = 101, tid = 0, count = 0

1056524229: connTCPConnection: conn = 135205656, code = 106, tid = 0, count = 36

1056524229: connTCPConnection: conn = 135205656, code = 101, tid = 0, count = 0

1056524229: connTCPConnection: conn = 135205656, code = 109, tid = 0, count = 34

1056524229: connServiceSet: newID = 2, name = RfsCountdown, ver = 34 curID = 0

1056524229: connTCPConnection: conn = 135205656, code = 113, tid = 0, count = 4

1056524229: connTCPConnection: conn = 135205656, code = 120, tid = 1, count = 44

1056524229: TransStart: Handle 8 open

1056524229: TransStart: Handle 9 open

1056524229: transSpawn: starting thread transSpawnTop

1056524229: SThreadSpawn: starting thread 12982280

1056524229: CntCmd: startup for rfeD (1)

1056524229: ServDone_bi (1): result 0

1056524229: transSpawnTop: calling TransStop

1056524229: TransStop: send = -1, result = 540

1056524229: ServClear: Handle 8 closed

1056524229: transSpawnTop: Handle 9 closed

1056524229: transSpawnTop: Handle 8 closed

1056524229: sThreadExit: exiting thread 12982280

1056524229: connTCPConnection: conn = 135205656, code = 108, tid = 0, count = 0

1056524229: connTCPConnection: conn = 135205656, code = 101, tid = 0, count = 0

1056524230: ConnReadData: Connection with 192.168.25.100:3367 closed

1056524230: NetSockDel: removing socket 7

1056524230: connDoLogout: logged out

1056524230: connAccept: socket 7 deleted from interface 2

1056524230: sThreadExit: exiting thread 12981255

1056524230: iplud: got 196 bytes from 192.168.25.100:3368

1056524230: CmdLookupSN

collison: 0xb95ea0f1

name: ""

sn: "<serialnumberiguess>"

remoteAddr: 0.0.0.0:0

requestAddr: 192.168.25.100:3369

group: 2

OS type: 0

platform: 0

1056524230: connListener: starting thread connAccept

1056524230: connListener: Handle 7 open

1056524230: SThreadSpawn: starting thread 12983303

1056524230: Connection established by 192.168.25.100:3370

1056524230: connSetOptions: Changing TCP_NODELAY from 0 to 1

1056524230: NetSockAdd: adding socket 7 to interface 2

1056524230: connTCPConnection: conn = 135204824, code = 101, tid = 0, count = 0

1056524230: connTCPConnection: conn = 135204824, code = 112, tid = 0, count = 4

1056524230: connTCPConnection: conn = 135204824, code = 101, tid = 0, count = 0

1056524230: connTCPConnection: conn = 135204824, code = 112, tid = 0, count = 4

1056524230: connTCPConnection: conn = 135204824, code = 104, tid = 0, count = 36
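
For anyone reading these logs later: the leading number on each line looks like a Unix epoch timestamp. That is an assumption, since the thread never says so, but it decodes to a plausible date. A minimal Python sketch for turning a log line into a readable time (the helper name `split_log_line` is made up for illustration):

```python
from datetime import datetime, timezone

def split_log_line(line):
    """Split '1056524229: message' into (UTC datetime, message).

    Assumes the leading integer is seconds since the Unix epoch.
    Returns None for continuation lines such as 'group: 2'.
    """
    prefix, sep, message = line.partition(": ")
    if not sep or not prefix.isdigit():
        return None
    return datetime.fromtimestamp(int(prefix), tz=timezone.utc), message

sample = "1056524229: TransStop: send = -1, result = 540"
when, message = split_log_line(sample)
print(when.isoformat(), "|", message)
```

Readable times make it easier to line the client log up against the server's operations log when hunting for the moment a backup was deferred.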

 


Ok, so I forgot a little bit of information. Here goes:

 

The Linux client is running on a RaidZone IDE RAID system. It has ext2 and reiserfs file systems on it. One filesystem is 1 TB (we're currently using around 350 GB of that, though). It is running a proprietary version of Red Hat 7.1 and retroclient 6.5.105. (I have no idea if there are newer clients; the FTP site is a complete mess, and the one tarball I found with Unix clients had versions 6.0.x.)

 

The backup server is a Dell PowerEdge 2500, backing up to a Seagate Viper2000 tape library. It's running Retrospect 6.5.289(?), but I'm downloading the newest update and will be applying that in a minute.

 

Thanks

-miah


OK, my Linux clients are now all at 6.5.108, and Retrospect is upgraded on the server. Still getting the deferred message on the Linux client, though.

 

The only thing I can think of that would be causing this now is some sort of client-side configuration issue.

 

Thanks

-miah


Wow, thanks for all the replies, you've really helped <sarcasm>. Anyway, the system for some reason finally decided to start working properly. I have no idea what caused it. I did use retrocpl to set the backup to be ASAP, and set it to read-only. When I came in this morning it was doing the backup. Sadly it will take a long time to back up (300 GB), but it won't be too bad after that initial backup.

 

I just hope that it keeps working.

 

-miah

 


Guest psykoyiko

Quote:

miah said:

Wow, thanks for all the replies, you've really helped <sarcasm>. Anyway, the system for some reason finally decided to start working properly. I have no idea what caused it. I did use retrocpl to set the backup to be ASAP, and set it to read-only. When I came in this morning it was doing the backup. Sadly it will take a long time to back up (300 GB), but it won't be too bad after that initial backup.

 

 


 

It is possible that there was some kind of corruption in the state files for the client, and that changing options with retrocpl removed that corruption.

 

If the problem happens again, I would suggest removing the client, and reinstalling it from scratch.


Hrm, that's possible I suppose. I definitely didn't touch /var/log/retroclient.state or any other Retrospect files in /var/log. If this happens again, I'll try wiping those out and see what happens.

 

Thanks

-miah


OK, so this morning I noticed that the client was still marked *user deferred*. I had to stop the backup the other day because it wasn't running as fast as it should (I had to fix a problem with the switch putting the port in 10 Mbit mode). Now that it's fixed the backup shouldn't take nearly as long, but the client keeps deferring the backup. I can't see any reason why. I deleted retroclient.state, reset the password, and made sure the client was set up correctly in Retrospect. And it's STILL DEFERRING!! Obviously there is some problem that -log 10 isn't helping me find. I guess if I can't figure out this problem by Monday I'm going to have to call the support line. I'd rather figure it out here, though, so that when somebody else has this problem they can solve it quickly.

 

 

 

-miah


  • 4 weeks later...

Small update to this post, not that anybody from Dantz cares, because it's quite obvious that they don't read these posts. Proactive backup continually fails on this system, but if I set up an EasyScript and schedule the backup, it works fine. Makes no sense really. I called Dantz and asked them about it, and their technical support person told me to reinstall the client and reboot, even though I reminded him that it was Linux.

 

IT'S LINUX. YOU DON'T NEED TO REBOOT TO FIX THINGS. Dantz technical support really needs to learn this. Reinstalling the client will not necessarily fix things either. I can only see it helping if one of the binaries the client installs somehow gets corrupted or ends up with the wrong permissions.

 

Also, I've noticed that the backup is going extremely slowly. I get 12.6MB/min tops. This is on a 100FDX connection. The server has a gigabit card in it, and it's linked to the switch at 1000FDX, so it's definitely not a network issue. It's definitely not the tapes either; I have a Seagate Viper2000 tape library system with 200GB LTO tapes. I'm seriously considering installing Amanda for my Linux systems and just using Retrospect for the few Windows systems we have.
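
To put that 12.6MB/min figure in perspective, here is a rough Python sketch comparing it with the raw capacity of a 100 Mbit/s link. It assumes "MB/min" means megabytes per minute and ignores protocol overhead, so treat the result as an order-of-magnitude check only. The function name `link_utilisation` is made up for illustration.

```python
def link_utilisation(observed_mb_per_min, link_mbit_per_s):
    """Fraction of the link's raw capacity that the observed rate represents."""
    # megabits/s -> megabytes/s (divide by 8) -> megabytes/min (times 60)
    link_mb_per_min = link_mbit_per_s / 8 * 60
    return observed_mb_per_min / link_mb_per_min

share = link_utilisation(12.6, 100)
print(f"{share:.2%} of a 100 Mbit/s link")
```

A backup using only a percent or two of the wire suggests the bottleneck is somewhere other than raw link speed, which fits the observation that swapping network hardware changes nothing.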

 

An extremely frustrated,

-miah


Guest psykoyiko

Quote:

miah said:

Ok, so I forgot a little bit of information. Here goes:

 

The Linux client is running on a RaidZone IDE RAID system. It has ext2 and reiserfs file systems on it. One filesystem is 1 TB (we're currently using around 350 GB of that, though). It is running a proprietary version of Red Hat 7.1 and retroclient 6.5.105. (I have no idea if there are newer clients; the FTP site is a complete mess, and the one tarball I found with Unix clients had versions 6.0.x.)

 

 


 

Can you clarify what you mean by "proprietary version of Redhat 7.1"?


It's Red Hat 7.1 on a RAIDZONE 'NAS', which basically means RAIDZONE made an IDE RAID controller and sells systems running Linux with it, but hasn't released the source code for the controller's driver. So we're stuck running an outdated kernel and many other outdated packages, because upgrading anything might break it. Other than that, though, it's Linux, and Linux is Linux: as long as the retroclient binary can run with the libc I have installed, all should be fine. It does run, and it does perform backups when I do an EasyScript backup, but when I run in proactive mode the client continually says 'deferred', even when I set the client to back up ASAP. (If RAIDZONE ever reads this, or any other vendor that likes releasing closed-source drivers, please read this link: The Magic Cauldron http://catb.org/~esr/writings/magic-cauldron/ )

 

I'm starting to care less and less about proactive not working, now that I just have EasyScript backing the system up along with my other systems. But as I have over 300GB of data on this system, and a full backup at that blazing 12.5MB/min takes several days, I'm really wondering what I should do.
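
As a back-of-the-envelope check on those numbers, the Python sketch below estimates how long a full backup takes at a steady rate. It assumes 1 GB = 1024 MB and a constant 12.5MB/min with no compression or skipped files, which real backups will not match exactly. The function name `backup_days` is made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60

def backup_days(size_gb, rate_mb_per_min):
    """Days needed to copy size_gb at a steady rate_mb_per_min."""
    size_mb = size_gb * 1024  # assumes binary gigabytes
    return size_mb / rate_mb_per_min / MINUTES_PER_DAY

days = backup_days(300, 12.5)
print(f"about {days:.1f} days for a 300 GB full backup")
```

Taken at face value, the quoted rate puts a full pass at well over two weeks, so it is worth measuring how much data each session actually moves before drawing conclusions.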

 

Since this problem started I've upgraded the NIC and the RAM on the backup server, and the switch that both of these systems are hooked up to. My next line of attack is putting one of my gigabit cards in this RAIDZONE system and double-checking the cabling between the two systems. If it's still slow after that, the slowness pretty much has to be due to the Retrospect client.

 

-miah


Miah

 

Is this NAS box set up with Samba at all? I wonder what kind of throughput you would get if you tried backing up the files via Windows file sharing rather than via the client?

 

Another random idea for you: is it possible to try adding another NIC to the NAS box? Some card/driver combinations handle the client much better. Generally you would see that kind of thing on a Windows box, though...

 

It is possible that the real culprit is the network drivers/card on the backup machine. It might be worth trying a different NIC there too.

 

Nate


The next thing I'd like to try is adding another NIC to that box, but it's such a pain. Since the vendor releases binary-only drivers for their IDE controller, I cannot easily upgrade the kernel, so it's running an ancient 2.4 kernel which probably doesn't have good support for the gigabit NIC I'd like to put into it.

 

The system does have Samba running, so I will do some testing to see if that makes a difference. The only thing about using Samba is that I'll lose a lot of permission settings that I'd like to keep, and I'd have to share the whole drive.

 

I have added a different NIC to the backup server: an Intel PRO/1000 XT, hooked up to a gigabit switch and linking at 1000FDX. I still only get between 12 and 20MB/min backups. I have a strong feeling that it's the backup server, since the speed issue is spread across all systems. I've replaced the switch with a new Dell 5224 gigabit switch, changed the network cable between the backup server and the switch, and made sure to connect that one client to the same switch. Downtime on the client system that keeps deferring is difficult, since it's one of our main DB systems and people run lots of experiments on it.

 

-miah


Hi,

 

As a test, have you installed Retrospect on another machine to see if you get better backup speeds? That will at least tell us if the server is the problem.

 

I suspect your network hardware is fine; drivers, however, are a different story. How about another 100mb NIC instead of 100mb in either of the machines? Not anyone's favorite choice, but it still might be better than what you are getting now.

 

Nate


I've replaced the NIC in the backup server and tried upgrading the drivers. Because I have about 15 clients set up and I get the same speed on all of them, I would have to say it's the server. The clients vary from Dell PowerEdge 1650s to PowerVaults to Raidzone OpenNAS systems (all running Linux), and of course a couple of Dell workstations running Windows. I don't really have any other hardware I could replace this server with and have my tape library work with easily. As well, the amount of time it would take to set the whole thing up again is just not something I have. The server is a Dell PowerEdge 2500: P4 Xeon with 1 GB of RAM. It's completely up to date with BIOS and whatnot. Same for the tape library; all firmware/BIOS is up to date. The tape library, again, is a Seagate Viper2000. There may be some tweaks I can do in Windows or on the tape library to increase speeds, but I don't know them as I'm not a Windows guru.

 

-miah


As I said, I can't easily install Retrospect server on another system. But I did try something else: since I do have another tape library installed in that system (a Dell PowerVault 120T DDS4 library), I set up proactive backup to use it and let it go. I'm getting roughly the same speed with the backups, 16.4MB/min. It's pretty painful when I'm trying to back up a server that has 20 GB of files on it. I would definitely like to try running the Retrospect server on another system, but I don't have any hardware lying around unused to do this at the moment, and I don't have the time it would take to move stuff around either.

 

-miah


Hi Miah

 

 

 

Sorry for not making this clear-

 

You can do a test install on another machine and use a "file" backup set rather than a "tape" backup set. Running a small backup should give us a decent idea about where the problem actually lies.

 

 

 

You said you get this kind of throughput on all machines on the network, including Windoze machines? Is this true with the test machine setup that you have? What kind of throughput are you getting when you back up the local machine?

 

 

 

Nate


  • 1 year later...

While this topic is probably dead for those involved, I couldn't find the solution anywhere else, so thought I'd add it here for posterity.

 

I had the exact same problem with the client being deferred. This was on a RedHat ES server running the latest Linux client. Finally, I made the connection -- I don't have a "DISPLAY", since we manage everything via command line. The installer automatically puts the "RETROSPECT_HOME" environment variable in the system login (profile) script. I guessed that when the server was trying to back up the system, it was trying to put up a dialog. With no GUI, it failed -- deferring the backup.

 

My solution: comment out the "RETROSPECT_HOME" lines in /etc/profile. Other systems may need to do it in other places, but this is what worked for me -- finally!
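
For anyone who wants to script that change, here is a minimal Python sketch of the "comment out the lines" step. The sample profile text and the path in it are hypothetical, since the thread never quotes the exact lines the installer writes, and the helper name `comment_out` is made up. It works on a string, so you can run it on a copy of the file and review the result before touching the real /etc/profile.

```python
def comment_out(profile_text, needle="RETROSPECT_HOME"):
    """Prefix every active line that mentions `needle` with '# '.

    Lines that are already comments are left untouched, so running
    the function twice gives the same result as running it once.
    """
    out = []
    for line in profile_text.splitlines(keepends=True):
        if needle in line and not line.lstrip().startswith("#"):
            out.append("# " + line)
        else:
            out.append(line)
    return "".join(out)

# Hypothetical excerpt; the installer's real lines are not shown in this thread.
sample = (
    "PATH=$PATH:/usr/local/bin\n"
    "RETROSPECT_HOME=/path/to/retrospect/client\n"
    "export RETROSPECT_HOME\n"
    "# RETROSPECT_HOME already disabled\n"
)
result = comment_out(sample)
print(result, end="")
```

Commenting the lines out rather than deleting them makes the change easy to revert if the client turns out to need the variable elsewhere.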


  • 2 years later...

Quote:

While this topic is probably dead for those involved, I couldn't find the solution anywhere else, so thought I'd add it here for posterity.

 

I had the exact same problem with the client being deferred. This was on a RedHat ES server running the latest Linux client. Finally, I made the connection -- I don't have a "DISPLAY", since we manage everything via command line. The installer automatically puts the "RETROSPECT_HOME" environment variable in the system login (profile) script. I guessed that when the server was trying to back up the system, it was trying to put up a dialog. With no GUI, it failed -- deferring the backup.

 

My solution: comment out the "RETROSPECT_HOME" lines in /etc/profile. Other systems may need to do it in other places, but this is what worked for me -- finally!

 


 

I had this problem too. A Retrospect Linux client 7.5.122 constantly deferred. This fix worked for us.

 

I am posting to push this thread up, because it shows the problem in '03, the fix by a user in '05, and still no FAQ from EMC in '08.


  • 3 months later...

Thanks for this thread, but the mods to /etc/profile did not work for me.

 

I updated and patched my Windows single server version 7.5.508 with hotfix 7.5.14.102, yet continued to see the cycle of 'ready'...'deferred' for a Fedora client in proactive backup.

 

What did work was in Retrospect server

- select Automate...Proactive Backup...select the problem system from the list...select edit.

 

Now, click on the 'Options' button and click on "Countdown". If your "countdown time (seconds)" window is non-zero, then set it to zero.

 

You will notice the countdown message box is blanked. You will also notice that the next time proactive backup attempts to access the previously deferring client that proactive backup transitions from 'retry' to 'ready' to 'executing'.

 

Problem solved by this workaround (la di dah!)

 

 


Archived

This topic is now archived and is closed to further replies.
