  1. miah

    linux client crashing

    I'm getting this error as well, though my client is running on a RH 9.0 box. The strange part is that I have the same client running on other RH 9.0 boxes without error. The longest I've seen the client run is around 24 hours; I have Nagios monitoring the port that the client listens on, and I get a notice every morning that it has failed. My plan today is to write a script that restarts it and have that run hourly through cron.
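A minimal sketch of the kind of restart script described above, assuming the client listens on TCP port 497 (the port mentioned in a later post). The restart command path is hypothetical, so substitute whatever actually starts the client on your box.

```python
import socket
import subprocess

CLIENT_HOST = "127.0.0.1"
CLIENT_PORT = 497  # assumed client port, taken from a later post in this thread
# Hypothetical restart command; replace with the real start script on your system.
RESTART_CMD = ["/etc/init.d/retroclient", "restart"]


def port_is_listening(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_and_restart(host=CLIENT_HOST, port=CLIENT_PORT, restart_cmd=RESTART_CMD):
    """Run restart_cmd only when the port is not answering.

    Returns True if a restart was attempted, False if the client looked healthy.
    """
    if port_is_listening(host, port):
        return False
    try:
        subprocess.run(restart_cmd, check=False)
    except OSError:
        # The restart command itself could not be executed (wrong path, etc.).
        pass
    return True
```

To use it from cron, call `check_and_restart()` at the bottom of the script and add an hourly crontab entry such as `0 * * * * /usr/local/bin/retro_watchdog.py` (the script path is made up for illustration).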
  2. As I said, I can't easily install Retrospect server on another system. But I did try something else: since I have another tape library installed in that system (a Dell PowerVault 120T DDS4 library), I set up proactive backup to use it and let it go. I'm getting roughly the same speed with the backups, 16.4MB/min. It's pretty painful when I'm trying to back up a server that has 20 gigs of files on it. I would definitely like to try running the Retrospect server on another system, but I don't have any unused hardware lying around at the moment, and I don't have the time it would take to move stuff around either. -miah
  3. I've replaced the NIC in the backup server and tried upgrading the drivers. Because I have about 15 clients set up and I get the same speed on all of them, I would have to say it's the server. The clients vary from Dell PowerEdge 1650s to PowerVaults to Raidzone OpenNAS systems (all running Linux), and of course a couple of Dell workstations running Windows. I don't really have any other hardware I could replace this server with and have my tape library work with easily. The amount of time it would take to set the whole thing up again is also just not something I have. The server is a Dell PowerEdge 2500, a P4 Xeon with 1 gig of RAM. It's completely up to date with BIOS and whatnot. Same for the tape library: all firmware/BIOS is up to date. The tape library, again, is a Seagate Viper2000. There may be some tweaks I can do in Windows or on the tape library to increase speeds, but I don't know them, as I'm not a Windows guru. -miah
  4. The next thing I'd like to try is adding another NIC to that box, but it's such a pain. Since the vendor releases binary-only drivers for their IDE controller, I cannot easily upgrade the kernel, so it's running an ancient 2.4 which probably doesn't have good support for the gigabit NIC I'd like to put into it. The system does have Samba running; I will do some testing to see if that makes a difference. The only thing about using Samba is that I'll lose a lot of permission settings that I'd like to keep, and I'd have to share the whole drive. I have added a different NIC to the backup server: an Intel PRO/1000 XT, hooked up to a gigabit switch and linking at 1000FDX. I still only get between 12 and 20MB/min backups. I have a strong feeling that it's the backup server, since the speed issue is spread across all systems. I've replaced the switch with a new Dell 5224 gigabit switch, changed the network cable between the backup server and the switch, and made sure to connect that one client to the same switch. Downtime on the client system that keeps deferring is difficult, since it's one of our main DB systems and people run lots of experiments on it. -miah
  5. It's Red Hat 7.1 on a RAIDZONE 'NAS', which basically means RAIDZONE made an IDE RAID controller and sells systems running Linux with that controller, but hasn't released the source code for its driver, so we're stuck running an outdated kernel and many other outdated packages because upgrading anything might break it. Other than that, though, it's Linux, and Linux is Linux: as long as the retroclient binary can run with the libc I have installed, all should be fine. It does run, and it does perform backups when I do an EasyScript backup, but when I run in proactive mode the client continually says 'deferred', even when I set the client to back up ASAP. (If RAIDZONE ever reads this, or any other vendor that likes releasing closed-source drivers, please read this link: The Magic Cauldron http://catb.org/~esr/writings/magic-cauldron/ ) I'm starting to care less and less about proactive not working now that I just have EasyScript backing the system up along with my other systems. But as I have over 300GB of data on this system, and a full backup at that blazing 12.5MB/min takes several days, I'm really wondering what I should do. Since this problem started I've upgraded the NIC on the backup server, upgraded the RAM, and replaced the switch that both of these systems are hooked up to. My next attack is seeing about putting one of my gigabit cards in this RAIDZONE system and double-checking my cabling between the two systems. If it is still slow after that, the slowness pretty much has to be due to the Retrospect client. -miah
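For a sense of scale, here is a back-of-the-envelope calculation of how long a full backup takes at the rates quoted in these posts. It is only a sketch: it assumes 1 GB = 1024 MB and a perfectly constant transfer rate, neither of which comes from the posts themselves.

```python
def full_backup_duration(size_gb, rate_mb_per_min):
    """Return (hours, days) needed to move size_gb at rate_mb_per_min.

    Assumes 1 GB = 1024 MB and a constant transfer rate.
    """
    minutes = size_gb * 1024 / rate_mb_per_min
    hours = minutes / 60
    return hours, hours / 24


# Figures quoted in the posts: 300 GB at 12.5 MB/min, and 20 GB at 16.4 MB/min.
for size_gb, rate in [(300, 12.5), (20, 16.4)]:
    hours, days = full_backup_duration(size_gb, rate)
    print(f"{size_gb} GB at {rate} MB/min: {hours:.1f} hours ({days:.1f} days)")
```

Under those assumptions the 20 GB server needs most of a day, and the 300 GB system needs on the order of two weeks of uninterrupted transfer.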
  6. Small update to this post, not that anybody from Dantz cares, because it's quite obvious that they don't read these posts. Proactive backup continually fails on this system, but if I set up an EasyScript and schedule the backup it works fine. Makes no sense really. I called Dantz and asked them about it, and their technical support person told me to reinstall the client and reboot, even though I reminded him that it was Linux. IT'S LINUX. YOU DON'T NEED TO REBOOT TO FIX THINGS. Dantz technical support really needs to learn this. Reinstalling the client will not necessarily fix things either; I can only see it helping if one of the binaries the client installs somehow got corrupted or ended up with the wrong permissions. I've also noticed that the backup is going extremely slowly. I get 12.6MB/min tops. This is on a 100FDX connection. The server has a gigabit card in it and it's linked to the switch at 1000FDX, so it's definitely not a network issue. It's definitely not the tapes either: I have a Seagate Viper2000 tape library system with 200GB LTO tapes. I'm seriously considering installing Amanda for my Linux systems and just using Retrospect for the few Windows systems we have. An extremely frustrated, -miah
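To put the "not a network issue" claim in numbers, this sketch compares the observed rate with the theoretical ceiling of a 100 Mbit/s link. It assumes decimal megabytes (1 MB = 8 Mbit) and ignores protocol overhead, so treat the result as a rough upper bound rather than a measurement.

```python
def link_capacity_mb_per_min(link_mbit_per_s):
    """Theoretical ceiling of a link in MB/min (8 bits per byte, no overhead)."""
    return link_mbit_per_s / 8 * 60


def utilization(observed_mb_per_min, link_mbit_per_s):
    """Fraction of the link's theoretical capacity that the observed rate uses."""
    return observed_mb_per_min / link_capacity_mb_per_min(link_mbit_per_s)


capacity = link_capacity_mb_per_min(100)
share = utilization(12.6, 100)
print(f"100 Mbit/s ceiling: {capacity:.0f} MB/min; 12.6 MB/min is {share:.1%} of it")
```

A 100 Mbit/s link tops out around 750 MB/min under these assumptions, so 12.6 MB/min uses under 2% of it, which is consistent with the bottleneck being somewhere other than raw link speed.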
  7. Modifying the services file will do nothing to fix this problem. Adding Retrospect to inetd will only make it not work. The Retrospect client has the ability to bind to the port and listen for connections itself. You only really use inetd for applications that do not have this ability. If you add it to inetd and then try to start retroclient, it will fail because inetd has already bound itself to the port. And seriously, if you don't have software that requires SCO Unix, dump it. It's garbage; you can get Linux for free and it obviously works a lot better. -miah
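The port conflict described above is easy to reproduce with two plain sockets. This is only an illustration of the general "address already in use" behaviour, not of anything specific to inetd or retroclient: the first socket stands in for inetd holding the port, the second for the client trying to bind it afterwards.

```python
import errno
import socket


def second_bind_errno(host="127.0.0.1"):
    """Bind a listener, then try to bind a second socket to the same port.

    Returns the errno of the second bind's failure, or None if it
    unexpectedly succeeded.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind((host, 0))   # port 0: let the OS pick a free port
        first.listen(1)         # stands in for inetd already holding the port
        port = first.getsockname()[1]
        try:
            second.bind((host, port))  # stands in for the client starting up
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


print(second_bind_errno() == errno.EADDRINUSE)
```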
  8. I believe that the 'Xwindows stuff' requires Java too. Looking at RetroClient.sh, I notice several temp-file races (simple security problems that should be fixed) and a comment about exiting if JAVA_HOME isn't set. I run the client on several Linux systems and it works fine. I've never once loaded the "Xwindows client" and don't ever intend to. -miah
  9. You could test out the firewall from your end. You can either use a port scanner or just telnet. If she has a real firewall set up, you should only be able to connect to a few select ports. So try going to a cmd prompt and doing 'telnet <host> 450' (assuming nothing is running on port 450). If she has that firewalled off, it will just sit there saying 'connecting to <hostname>' and then eventually give you a 'Could not connect to host on port 450: Connect failed' message. Then try the same thing with port 497. If the client is running and the port is unfirewalled, telnet will connect and you'll just have a flashing cursor; you won't see anything you type and nothing will really happen. Hit 'Control ]' and then type 'quit'. The port scanner might be easier, but if she was just cracked, it might trigger some defense mechanisms and automagically firewall you off (some people set up stuff like this because they're paranoid). If you can connect via telnet, or a port scanner sees the port as open, have her run retrocpl and show you the output; a netstat -nap on that system would also help, as would iptables -n -L. Make sure the firewall rules aren't messing with port 497 tcp/udp in bad ways, and make sure the client is running and shows up in netstat as LISTENING for tcp (it won't say listening for udp, but it will still show it running and bound to port 497). Hope that helps. -miah
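The telnet check above can also be scripted. A rough sketch: it treats an accepted connection as "open", an active refusal as "closed", and a timeout as "filtered". Mapping a timeout to a firewall is an assumption that matches the behaviour described in the post, but a dead host looks the same.

```python
import socket


def probe_tcp_port(host, port, timeout=5.0):
    """Classify a TCP port roughly the way the manual telnet test does.

    "open":     the connection was accepted (telnet shows a blank screen)
    "closed":   the host actively refused it (nothing is listening)
    "filtered": no answer before the timeout (typical of dropped packets)
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "closed"
    except socket.timeout:
        return "filtered"
    except OSError:
        # Unreachable host, DNS failure and so on; lumped in with "filtered" here.
        return "filtered"
```

For example, `probe_tcp_port("client.example.org", 497)` would check the client port, where the hostname is a placeholder for the real machine.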
  10. Ok, so this morning I noticed that the client was still marked *user deferred*. I had to stop the backup the other day because it wasn't going as fast as it should (I had to fix a problem with the switch putting the port in 10mbit mode). Now that it's fixed the backup shouldn't take nearly as long, but the client keeps deferring the backup. I can't see any reason for it. I deleted retroclient.state, reset the password, and made sure the client was set up correctly in Retrospect. And it's STILL DEFERRING!! Obviously there is some problem that -log 10 isn't helping me find. I guess if I can't figure out this problem by Monday I'm going to have to call the support line. I'd rather figure it out here, though, so that when somebody else has this problem they can solve it quickly. -miah
  11. Hrm, that's possible I suppose. I definitely didn't touch /var/log/retroclient.state or any other Retrospect files in /var/log. If this happens again, I'll try wiping those out and see what happens. Thanks -miah
  12. Wow, thanks for all the replies, you've really helped <sarcasm>. Anyways, the system for some reason finally decided to start working properly. I have no idea what caused it. I did use retrocpl to set the backup to be ASAP, and set it to read-only. When I came in this morning it was doing the backup. Sadly it will take a long time to back up (300GB), but it won't be too bad after that initial backup. I just hope that it keeps working. -miah
  13. Ok, my Linux clients are now all at 6.5.108, and Retrospect is upgraded on the server. I'm still getting the deferred message on the Linux client, though. The only thing that I can think would be causing this now is some sort of client-side configuration issue. Thanks -miah
  14. Ok, so I forgot a little bit of information. Here goes: the Linux client is running on a RaidZone IDE RAID system. It has ext2 and reiserfs file systems on it. One filesystem is 1TB (we're currently using around 350 gigs of that, though). It is running a proprietary version of Red Hat 7.1 and retroclient 6.5.105. (I have no idea if there are newer clients; the FTP site is a complete mess, and the one tarball I found with Unix clients had versions 6.0.x.) The backup server is a Dell PowerEdge 2500, backing up to a Seagate Viper2000 tape library. It's running Retrospect 6.5.289(?), but I'm downloading the newest update and will be applying that in a minute. Thanks -miah
  15. Hello, one of my Linux clients keeps giving me a 'user deferred' error and won't back up. Great thing is, no user is deferring it. It's a server, and what's worse, it's our fileserver; it desperately needs to be backed up. All the other Linux clients we have here are working great. So now I need to know why this client is deferring backup, so that I can fix it. I haven't seen anything useful in the logs, but I'll include some in hopes that somebody else can use them. Thanks - miah

    <some logs>
    1056524229: connLogin: successful
    1056524229: connTCPConnection: conn = 135205656, code = 109, tid = 0, count = 34
    1056524229: connServiceSet: newID = 4, name = <pw>, ver = 34 curID = 0
    1056524229: connTCPConnection: conn = 135205656, code = 113, tid = 0, count = 4
    1056524229: connTCPConnection: conn = 135205656, code = 105, tid = 0, count = 0
    1056524229: connDoLogout: logged out, sent
    1056524229: connTCPConnection: conn = 135205656, code = 104, tid = 0, count = 36
    1056524229: connLogin: successful
    1056524229: connTCPConnection: conn = 135205656, code = 101, tid = 0, count = 0
    1056524229: connTCPConnection: conn = 135205656, code = 106, tid = 0, count = 36
    1056524229: connTCPConnection: conn = 135205656, code = 101, tid = 0, count = 0
    1056524229: connTCPConnection: conn = 135205656, code = 109, tid = 0, count = 34
    1056524229: connServiceSet: newID = 2, name = RfsCountdown, ver = 34 curID = 0
    1056524229: connTCPConnection: conn = 135205656, code = 113, tid = 0, count = 4
    1056524229: connTCPConnection: conn = 135205656, code = 120, tid = 1, count = 44
    1056524229: TransStart: Handle 8 open
    1056524229: TransStart: Handle 9 open
    1056524229: transSpawn: starting thread transSpawnTop
    1056524229: SThreadSpawn: starting thread 12982280
    1056524229: CntCmd: startup for rfeD (1)
    1056524229: ServDone_bi (1): result 0
    1056524229: transSpawnTop: calling TransStop
    1056524229: TransStop: send = -1, result = 540
    1056524229: ServClear: Handle 8 closed
    1056524229: transSpawnTop: Handle 9 closed
    1056524229: transSpawnTop: Handle 8 closed
    1056524229: sThreadExit: exiting thread 12982280
    1056524229: connTCPConnection: conn = 135205656, code = 108, tid = 0, count = 0
    1056524229: connTCPConnection: conn = 135205656, code = 101, tid = 0, count = 0
    1056524230: ConnReadData: Connection with closed
    1056524230: NetSockDel: removing socket 7
    1056524230: connDoLogout: logged out
    1056524230: connAccept: socket 7 deleted from interface 2
    1056524230: sThreadExit: exiting thread 12981255
    1056524230: iplud: got 196 bytes from
    1056524230: CmdLookupSN collison: 0xb95ea0f1 name: "" sn: "<serialnumberiguess>" remoteAddr: requestAddr: group: 2 OS type: 0 platform: 0
    1056524230: connListener: starting thread connAccept
    1056524230: connListener: Handle 7 open
    1056524230: SThreadSpawn: starting thread 12983303
    1056524230: Connection established by
    1056524230: connSetOptions: Changing TCP_NODELAY from 0 to 1
    1056524230: NetSockAdd: adding socket 7 to interface 2
    1056524230: connTCPConnection: conn = 135204824, code = 101, tid = 0, count = 0
    1056524230: connTCPConnection: conn = 135204824, code = 112, tid = 0, count = 4
    1056524230: connTCPConnection: conn = 135204824, code = 101, tid = 0, count = 0
    1056524230: connTCPConnection: conn = 135204824, code = 112, tid = 0, count = 4
    1056524230: connTCPConnection: conn = 135204824, code = 104, tid = 0, count = 36
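For anyone trying to read a log excerpt like this one, here is a small sketch that splits it into individual entries and converts the leading numbers to dates. It assumes those 10-digit prefixes are Unix timestamps in seconds, which is a guess based on their format and not something the client documentation confirms here.

```python
import re
from datetime import datetime, timezone

# Each entry is assumed to start with a 10-digit Unix timestamp followed by ": ".
ENTRY_START = re.compile(r"(?<!\S)(\d{10}): ")


def split_log(flat_text):
    """Split a flattened client log into (utc_datetime, message) tuples."""
    parts = ENTRY_START.split(flat_text)
    # re.split with one capture group yields [prefix, ts1, msg1, ts2, msg2, ...]
    entries = []
    for ts, msg in zip(parts[1::2], parts[2::2]):
        when = datetime.fromtimestamp(int(ts), tz=timezone.utc)
        entries.append((when, msg.strip()))
    return entries


# Three entries copied from the log in the post above.
sample = (
    "1056524229: connLogin: successful "
    "1056524229: connTCPConnection: conn = 135205656, code = 109, tid = 0, count = 34 "
    "1056524230: connDoLogout: logged out"
)
for when, msg in split_log(sample):
    print(when.isoformat(), msg)
```

Read this way, the excerpt covers only about two seconds on 25 June 2003 (UTC), so it shows a single login and logout cycle rather than a long-running problem.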