wmconlon, posted March 18, 2004:

I cannot get a backup to finish from Red Hat to Mac OS 9.1. I have two volumes selected for backup: / and /boot. / always fails, but /boot finishes.

On the Mac, I get a 519 error report while attempting to back up /. Then Retrospect moves on to /boot and completes successfully. This suggests to me that the problem isn't really a communication error.

The log on the Linux box shows:

# tail -f /var/log/retroclient.log
1079629887: iplud: bound to address 0.0.0.0
1079629887: ipludAddMembership: adding membership for 0.0.0.0
1079629893: IPNSRegister(0): registered: "white"/"7093841c7b286ba5"
1079629893: ConnStartListen: starting thread ConnStartListen for 192.168.181.240:0
1079629899: IPNSRegister(0): registered: "white"/"7093841c7b286ba5"
1079629937: Connection established by 192.168.181.252:50285
1079629937: ConnReadData: Connection with 192.168.181.252:50285 was reset
1079629937: Connection established by 192.168.181.252:50286
1079629939: ConnReadData: Connection with 192.168.181.252:50286 was reset
1079629941: Connection established by 192.168.181.252:50287
1079632487: ConnReadData: Connection with 192.168.181.252:50287 was reset
1079632487: ConnWriteData: send() failed with error 9
1079632487: ConnWriteData: send() failed with error 9
1079632677: Connection established by 192.168.181.252:50288
1079633481: ConnReadData: Connection with 192.168.181.252:50288 was reset
1079633490: Connection established by 192.168.181.252:50290
1079633490: ConnReadData: Connection with 192.168.181.252:50290 was reset
1079633790: Connection established by 192.168.181.252:50298
1079633790: ConnReadData: Connection with 192.168.181.252:50298 was reset

From this point on there is a continuing sequence of connections being established and reset. I then have to kill and restart rcl to get another backup going.

There are several curious things:
1. Why does this bind to 0.0.0.0 when rcl explicitly states $CLIENTDIR/retroclient -daemon -ip 192.168.181.240?
2. Why does the backup server at 192.168.181.252 keep trying all the high ports? I thought it used 497.
3. What does "ConnWriteData: send() failed with error 9" mean?

Also related to debugging:
1. Why doesn't the client report real timestamps in the log?
2. Why is the log cleared when I restart the client? It seems to me that it should be appended to.
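On the timestamp question: the leading numbers in retroclient.log look like Unix epoch seconds (seconds since January 1, 1970), and read that way they fall on the date of this post. Assuming that reading is right, a small filter can make the log readable. The sketch below is not part of the Retrospect client; the function name humanize is made up for illustration, and times are shown in UTC because the server's local time zone isn't stated in the thread.

```python
from datetime import datetime, timezone


def humanize(line):
    """Replace a leading epoch-seconds prefix with a readable UTC time.

    Lines that do not start with '<digits>: ' are returned unchanged.
    """
    prefix, sep, rest = line.partition(": ")
    if not sep or not prefix.isdigit():
        return line
    stamp = datetime.fromtimestamp(int(prefix), tz=timezone.utc)
    return f"{stamp:%Y-%m-%d %H:%M:%S} UTC: {rest}"


if __name__ == "__main__":
    # Two lines copied from the log excerpt in the post above.
    sample = [
        "1079629887: iplud: bound to address 0.0.0.0",
        "1079632487: ConnWriteData: send() failed with error 9",
    ]
    for entry in sample:
        print(humanize(entry))
```

The same function could be applied to each line read from /var/log/retroclient.log. Since the client clears the log on restart, copying the file elsewhere before restarting rcl would preserve the history.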
wmconlon, posted March 24, 2004:

Just a follow-up: I'm finally getting occasional backups to complete, though I continue to get 519 communication failures. Essentially, the problem above would occur after several hundred megabytes were copied. By stopping and starting the rcl client on Red Hat, I could finally get everything backed up.

1. Of course, comparison wasn't occurring because the failure would occur during the backup.
2. Adding a new backup set will cause this problem to rear its ugly head again.
3. I also get 519 (communication) errors when the backup server tries to connect, saying the client can't be reserved; this requires an rcl restart.

I'm pretty sure the network is not the problem, as this Dell server has high-quality Intel NICs, and its file sharing performance (SMB and Netatalk) has been tested. I've looked at the threads regarding dual NICs, but a configuration issue doesn't jump out at me.
natew, posted March 26, 2004:

Hi,

Sad to say I don't have good answers for you about the logging.

One thing you might want to try is allocating more memory to Retrospect on the backup machine. A memory management issue could at least explain why the backup fails after a large amount of data has been transferred.

Nate
wmconlon, posted March 29, 2004:

I agree that memory is probably the issue. I'll move the backup server onto a system with more memory.
wmconlon, posted April 1, 2004:

Moved Retrospect 5.1 to another OS 9 system with twice as much (256 MB vs 104 MB) RAM. Increased memory allocation to 200 MB. It still cannot back up my Linux client.

I then clicked the Preview button and started getting Net Retry messages after scanning about 13768 folders and about 180000 files. Nonetheless, about an hour later there were two windows listing files to be marked (one window for /boot, with 10 MB of data, and one for /, with 8 GB of data). I then selected backup and got a 541 error (client not installed or not running).

The client shows:

# ./rcl status
Server "white":
Version 6.5.108
back up according to normal schedule
currently on
readonly is off
exclude is off
1 connections, 1 authenticated

So the discrepancy is that the client thinks the server is connected and authenticated, but the server has somehow dropped the connection and can't reconnect.

The log is uninformative without real timestamps:

# tail -f /var/log/retroclient.log
1080764610: Connection established by 192.168.181.220:49152
1080764619: ConnReadData: Connection with 192.168.181.220:49152 was reset
1080770139: Connection established by 192.168.181.220:49152
1080770139: ConnReadData: Connection with 192.168.181.220:49152 was reset
1080770139: ServicePurge: service not found
1080770140: Connection established by 192.168.181.220:49153
1080772847: ConnWriteData: send() failed with error 104
1080772847: ConnWriteData: send() failed with error 32
1080772847: ConnReadData: Connection with 192.168.181.220:49153 closed
1080773507: Connection established by 192.168.181.220:49154

Does anyone know what the errors 104 and 32 mean? And why is the connection shown in the status window as connected and authenticated, but shown in the log as reset and closed? There seems to be a failure to communicate (and to trap and handle errors) between client and server.
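On the error numbers: if the values after "send() failed with error" are raw Linux errno codes, which seems likely for a failed send() call but is not confirmed anywhere in this thread, then 9, 32 and 104 have standard meanings on Linux. The lookup below is only a sketch under that assumption; the table is hard-coded with Linux values because errno numbering differs between operating systems, and the names LINUX_ERRNO and explain are made up for illustration.

```python
# Linux errno values for the three codes seen in retroclient.log.
# Assumption: the client logs the raw errno from send(); other
# platforms number these errors differently.
LINUX_ERRNO = {
    9: ("EBADF", "Bad file descriptor"),
    32: ("EPIPE", "Broken pipe"),
    104: ("ECONNRESET", "Connection reset by peer"),
}


def explain(code):
    """Return a one-line description of a Linux errno code."""
    name, text = LINUX_ERRNO.get(code, ("unknown", "not in this small table"))
    return f"error {code}: {name} ({text})"


if __name__ == "__main__":
    for code in (9, 32, 104):
        print(explain(code))
```

Read this way, 104 and 32 would both mean the client was still writing after the server side had gone away, and 9 would mean the client tried to write to a socket it had already closed. That fits the "was reset" lines logged at the same timestamps, though it does not say why the server dropped the connection.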
wmconlon, posted April 1, 2004:

I'm inclined to believe that the failure has to do with some contention between rcl and other processes. I started another immediate backup just before leaving the office yesterday afternoon, and it completed. Then it completed again as part of the daily backup script at 10pm.

My goal is to migrate from an old AppleShare server onto this Linux system, but we need reliable backups first. The existing AppleShare server (running web, file, print, SMB, mail, DNS, FTP) NEVER fails to back up properly. But the rcl client on Linux seems troubled when there is any activity, even though top typically shows 97 to 99% idle CPU.
wmconlon, posted May 28, 2004:

This is a continuation of the same problems. Once the backups became incremental (only a few hundred to a thousand files), Linux backup has been reliable. For about two months, I haven't had any issues with the Linux client.

This week, the backup server started reporting 505 errors (client reserved). Yet:

# /etc/init.d/rcl status
Server "white":
Version 6.5.108
reserved by xxxxxxxxxx for firewire backup
back up according to normal schedule
currently on
readonly is off
exclude is off
1 connections, 1 authenticated

xxxx above is the name of the backup server.

The retroclient.log continues to be uninformative (especially without real timestamps):

1085426088: ConnStartListen: starting thread ConnStartListen for 127.0.0.1:0
1085426088: iplud: bound to address 0.0.0.0
1085426088: ipludAddMembership: adding membership for 0.0.0.0
1085426094: IPNSRegister(0): registered: "white"/"7093841c7b286ba5"
1085426094: ConnStartListen: starting thread ConnStartListen for 192.168.181.240:0
1085426100: IPNSRegister(0): registered: "white"/"7093841c7b286ba5"
1085498094: Connection established by 192.168.181.220:49198
1085539755: Connection established by 192.168.181.220:49156
1085539755: ConnReadData: Connection with 192.168.181.220:49156 was reset
1085539755: ServicePurge: service not found
1085550204: Connection established by 192.168.181.220:49161
1085550204: ConnReadData: Connection with 192.168.181.220:49161 was reset
1085550204: ServicePurge: service not found
1085636522: Connection established by 192.168.181.220:49165
1085636522: ConnReadData: Connection with 192.168.181.220:49165 was reset
1085636522: ServicePurge: service not found
1085722959: Connection established by 192.168.181.220:49169
1085722959: ConnReadData: Connection with 192.168.181.220:49169 was reset
1085722959: ServicePurge: service not found

192.168.181.220 is the address of the backup server.

Interestingly, this trouble began AFTER rebooting the Linux machine. Sure would be nice to have real logging to debug this. Anyone at Dantz listening?
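One way to get more out of the existing log is to pair each "Connection established by" line with the later "was reset" or "closed" line for the same address and port, and see how long each connection lasted and which ones never ended. The sketch below assumes only the line formats visible in the excerpts in this thread; the function name pair_connections is made up for illustration and nothing here is part of the Retrospect client.

```python
import re

# Line formats copied from the retroclient.log excerpts in this thread.
ESTABLISHED = re.compile(r"^(\d+): Connection established by (\S+)$")
ENDED = re.compile(
    r"^(\d+): ConnReadData: Connection with (\S+) (?:was reset|closed)$"
)


def pair_connections(lines):
    """Pair 'established' lines with their later 'was reset' / 'closed' lines.

    Returns (closed, still_open):
      closed     -- list of (peer, seconds_open) in the order connections ended
      still_open -- dict of peer -> epoch second it was established,
                    for connections with no end line in the input
    """
    still_open = {}
    closed = []
    for raw in lines:
        line = raw.strip()
        match = ESTABLISHED.match(line)
        if match:
            still_open[match.group(2)] = int(match.group(1))
            continue
        match = ENDED.match(line)
        if match and match.group(2) in still_open:
            started = still_open.pop(match.group(2))
            closed.append((match.group(2), int(match.group(1)) - started))
    return closed, still_open


if __name__ == "__main__":
    sample = [
        "1079629941: Connection established by 192.168.181.252:50287",
        "1079632487: ConnReadData: Connection with 192.168.181.252:50287 was reset",
        "1079632487: ConnWriteData: send() failed with error 9",
        "1085498094: Connection established by 192.168.181.220:49198",
    ]
    closed, still_open = pair_connections(sample)
    print(closed)       # [('192.168.181.252:50287', 2546)]
    print(still_open)   # {'192.168.181.220:49198': 1085498094}
```

Applied to the excerpts above, the connection from port 50287 in the first post lasted 2546 seconds (about 42 minutes) before it was reset, while the later connections were reset within the same second. In the May log, the connection from port 49198 has no matching reset or closed line in the excerpt. That might be related to the client still reporting itself as reserved with one authenticated connection, but this is only a guess from the log.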
natew, posted June 1, 2004:

Hi,

You can turn up client logging with retroclient -log x, where x is the log level and 9 is the highest.

I would disable virtual memory on the backup machine and allocate 50 or 60 MB to Retrospect.

Any chance you can try running this backup on an OS X machine?

Thanks,
Nate
wmconlon, posted June 29, 2004:

Thanks, I'll try the logging feature when this next crops up. Regarding OS X, I can certainly run a backup with that version, but I've posted about similar problems with OS X as the server, and the OS X version will not let me back up my AppleShare IP server, while the OS 9 version does.