
Retro Client 6.0.110 keeps turning itself off


Recommended Posts

Let me start the diagnosis by saying that the error 9 on the close() call indicates a bad file descriptor. So, either a file descriptor is being closed twice or memory is being "stepped on", causing a corrupt file descriptor. It would help to have a version of pitond that outputs more debug info, such as the value of the file descriptor in the failed close() call.
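
A minimal sketch of what error 9 means in practice (plain Python, nothing Retrospect-specific; the temp file path is arbitrary): closing the same descriptor twice, or closing a number that was never a valid descriptor, both fail with errno 9 (EBADF).

Code:

import errno
import os

# Case 1: double close -- the second close() fails with EBADF (errno 9).
fd = os.open("/tmp/ebadf_demo", os.O_CREAT | os.O_WRONLY, 0o600)
os.close(fd)
try:
    os.close(fd)
except OSError as e:
    print("double close:", e.errno, errno.errorcode[e.errno])   # 9 EBADF

# Case 2: closing a descriptor that was never opened (or a corrupted value).
try:
    os.close(12345)
except OSError as e:
    print("never opened:", e.errno, errno.errorcode[e.errno])   # 9 EBADF

os.unlink("/tmp/ebadf_demo")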


Quote:

It would help to have a version of pitond that outputs more debug info

 


 

We do.

 

In a shell, type:

 

$ /Applications/Retrospect\ Client.app/Contents/Resources/pitond --help

 

to get:

 

--help              Display this message
-setpw <password>   Set the first access password.
                    Once the password is set, it can only
                    be changed from Retrospect
-testpw             Test if the first access password has been set
-log n              Set the logging level to n
                    Logs are saved in /var/log/retroclient.log

 

 

The log level ranges from 1 to 9, with 9 spewing log entries fast and furious.

 

You can modify the /Library/StartupItems/RetroClient/RetroClient script with this flag (at whatever value you find valuable) to keep pitond spewing across restarts.
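
While that logging is running, a throwaway watcher along these lines can flag a crash the moment it is written. This is only a sketch in Python; it assumes the /var/log/retroclient.log path above and the "Assertion failure" / "close(socket) failed" text quoted in the log excerpts in this thread, and it needs read access to the log.

Code:

import time

LOG_PATH = "/var/log/retroclient.log"   # path reported by pitond --help
CRASH_MARKERS = ("Assertion failure", "close(socket) failed")

# Follow the log like `tail -f` and print any line that looks like the crash.
# (Does not handle the log file being rotated or recreated.)
with open(LOG_PATH, "r", errors="replace") as log:
    log.seek(0, 2)                      # start at the end of the file
    while True:
        line = log.readline()
        if not line:
            time.sleep(1)
            continue
        if any(marker in line for marker in CRASH_MARKERS):
            print("possible pitond crash:", line.rstrip())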

 

Dave


Many thanks. I'll post back with any findings. I'm a Unix software developer by trade, so I enjoy a good code mystery but only if I have the tools to find the clues.

 

 

 

[update] I ran pitond with a log level of 9 and discovered that the problem appears to be the closing of a file handle that was never opened (at least according to the debug info). The crash appears to happen in a function called NetCancelSockets when the machine starts going to sleep. Here's the last part of the log:

 

 

 

1132287973: NetCancelSockets: wait cancelling sockets
1132287973: NetCancelSockets: cancelling socket 9
1132287973: connAccept: Handle 9 closed
1132287973: connAccept: close(socket) failed with error 9
1132287973: Assertion failure at pitond/object.c-477
1132287973: LogFlush: program exit(-1) called, flushing log file to disk

 

 

 

There are a number of "connAccept: Handle <x> closed" messages in the log, each of which has a corresponding "TransStart: Handle <x> open" message preceding it at some point. Handle 9 is an exception; there is no corresponding TransStart message.
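
For anyone who wants to run the same check on their own log, a rough script in this spirit works. This is a sketch only; it keys off the "Handle N open" / "Handle N closed" strings exactly as they appear in the excerpts here, and handle numbers do get reused, so treat the output as a starting point.

Code:

import re
import sys

# Flag any "Handle N closed" line whose handle number never appeared in a
# "Handle N open" line earlier in the log -- the by-eye check described above.
open_re = re.compile(r"Handle (\d+) open")
close_re = re.compile(r"Handle (\d+) closed")

log_path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/retroclient.log"
opened = set()

with open(log_path, "r", errors="replace") as log:
    for lineno, line in enumerate(log, 1):
        m = open_re.search(line)
        if m:
            opened.add(m.group(1))
            continue
        m = close_re.search(line)
        if m and m.group(1) not in opened:
            print(f"line {lineno}: handle {m.group(1)} closed but never opened:")
            print("   ", line.rstrip())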

 

 

 

Hope this helps with the diagnosis. I'll keep checking to see if future crashes involve the same file handle.

 

 

 

[Yet another update] The next time it happened was with a different file handle. The only consistencies are that it's always happening inside NetCancelSockets and that it always involves a file handle being closed that was apparently never opened.
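
To make that pattern concrete, here's a small illustration (again just Python, not Retrospect's code) of two teardown paths sharing one socket descriptor: whichever path closes second gets errno 9, and from its point of view it is closing a handle it never opened. Worse, if the descriptor number has been reused by then, the late close would silently kill an unrelated descriptor instead of failing.

Code:

import errno
import os
import socket
import threading

# Two components that each believe they own the close() of one socket --
# roughly the shape of a connAccept / NetCancelSockets collision.
a, b = socket.socketpair()
fd = a.detach()                  # work with the raw descriptor, C-style
first_close_done = threading.Event()

def teardown_path_one():
    os.close(fd)                 # first close succeeds
    first_close_done.set()

def teardown_path_two():
    first_close_done.wait()
    try:
        os.close(fd)             # second close of the same number: EBADF
    except OSError as e:
        print("second close failed:", errno.errorcode[e.errno])

t1 = threading.Thread(target=teardown_path_one)
t2 = threading.Thread(target=teardown_path_two)
t1.start(); t2.start()
t1.join(); t2.join()
b.close()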


I was able to reproduce the pitond crash (once) by putting my PowerBook to sleep while it was being scanned by the server. Here is the verbose output from the log, which is consistent with what Alan reports. I haven't checked yet, but Alan, is the crash 100% reproducible for you (by sleeping during scan)?

 

 

 

[update: I was only able to reproduce the crash 2 times in about 7 or 8 attempts.]

 

 

 

Here's the log ... the first line looks interesting to me as well. Now that I think about it, I'm being backed up over my wired Ethernet connection, but I sometimes have a wireless connection active as well. I wonder if that is related? Alan, is your setup similar? I know in the past I've had problems under these circumstances where the client couldn't even complete a scan without getting a network error. Turning off AirPort fixed that. (Sorry, I'm just thinking out loud.)

 

 

 

1132319735: netDelInterface: deleting interface 2
1132319735: iplud: interface for socket 4 deleted
1132319735: connListener: starting thread connAccept
1132319735: connListener: Handle 4 open
1132319735: SThreadSpawn: starting thread 25212416
1132319735: NetConnAdd: adding socket 4
1132319735: connAccept: Handle 4 closed
1132319735: NetConnDel: removing socket 4
1132319735: sThreadExit: exiting thread 25212416
1132319735: connListener: interface for socket 5 deleted
1132319735: sThreadExit: exiting thread 25169408
1132319735: sThreadExit: exiting thread 25168384
1132319735: connTCPConnection: conn = 3151360, code = 120, tid = 288, count = 24
1132319735: TransStart: Handle 4 open
1132319735: TransStart: Handle 5 open
1132319735: TransStart: starting 'GHst' on Builtin
1132319735: transSpawn: starting thread transSpawnTop
1132319735: SThreadSpawn: starting thread 25169408
1132319735: Builtin: startup for GHst (288)
1132319735: ServDone_bi (288): result 0
1132319735: ServClear: Handle 4 closed
1132319735: transSpawnTop: Handle 5 closed
1132319735: transSpawnTop: Handle 4 closed
1132319735: sThreadExit: exiting thread 25169408
1132319736: connTCPConnection: conn = 3151360, code = 120, tid = 289, count = 24
1132319736: TransStart: Handle 4 open
1132319736: TransStart: Handle 5 open
1132319736: TransStart: starting 'SNte' on Builtin
1132319736: transSpawn: starting thread transSpawnTop
1132319736: SThreadSpawn: starting thread 25168384
1132319736: Builtin: startup for SNte (289)
1132319736: ServDone_bi (289): result 0
1132319736: ServClear: Handle 4 closed
1132319736: transSpawnTop: Handle 5 closed
1132319736: transSpawnTop: Handle 4 closed
1132319736: sThreadExit: exiting thread 25168384
1132319736: connTCPConnection: conn = 3151360, code = 101, tid = 0, count = 0
1132319736: connTCPConnection: conn = 3151360, code = 120, tid = 290, count = 24
1132319736: TransStart: Handle 4 open
1132319736: TransStart: Handle 5 open
1132319736: TransStart: starting 'GDef' on Builtin
1132319736: transSpawn: starting thread transSpawnTop
1132319736: SThreadSpawn: starting thread 25169408
1132319736: Builtin: startup for GDef (290)
1132319736: ServDone_bi (290): result 0
1132319736: ServClear: Handle 4 closed
1132319736: transSpawnTop: Handle 5 closed
1132319736: transSpawnTop: Handle 4 closed
1132319736: sThreadExit: exiting thread 25169408
1132319736: connTCPConnection: conn = 3151360, code = 120, tid = 291, count = 24
1132319736: TransStart: Handle 4 open
1132319736: TransStart: Handle 5 open
1132319736: TransStart: starting 'GNst' on Builtin
1132319736: transSpawn: starting thread transSpawnTop
1132319736: SThreadSpawn: starting thread 25168384
1132319736: Builtin: startup for GNst (291)
1132319736: ServDone_bi (291): result 0
1132319736: ServClear: Handle 4 closed
1132319736: transSpawnTop: Handle 5 closed
1132319736: transSpawnTop: Handle 4 closed
1132319736: sThreadExit: exiting thread 25168384
1132319736: NetCancelSockets: wait cancelling sockets
1132319736: NetCancelSockets: cancelling socket 7
1132319736: connAccept: Handle 7 closed
1132319736: connAccept: close(socket) failed with error 9
1132319736: Assertion failure at pitond/object.c-477
1132319736: LogFlush: program exit(-1) called, flushing log file to disk


I'm having similar problems. I'm just starting to investigate, but here's what I can share:

I run a network of 40 OS X machines, mostly 10.3 but a few are Tiger. I've found that a few of them have been having their Retrospect Clients "turned off". I think it's only the Tiger machines but I'm not sure yet. It seems to be happening almost once a day for some of my Tiger clients. I haven't seen anything in the logs yet though.

 

Just starting to investigate.

 

-dan


Quote:

I did not check whether a scan was in progress. Were you checking that on the server side or the client side?

 


 

I had a VNC connection to the server, so I was looking at that. But I was also watching the client log file in the Console on the client. There is plenty of activity there during a scan too. On the other hand, even before/after a scan, there is activity in the client log when the server is just polling the various clients.

 

From my experience, I would be surprised if the bug were only triggered by sleeping during a scan. I don't sleep the machine during scans that often; it just so happens that the two times I was able to make it happen, that's what it was doing. My guess is that sleeping during any connection to the server has the potential to trigger it.


  • 3 weeks later...

Does anyone know whether Dantz has acknowledged this client turn-off problem?

 

Since upgrading our system to all Tiger machines (most of the clients are PowerBooks connected via Wi-Fi), the client app turns itself off. A sleep/scan conflict seems likely.

 

I'm also having a problem where the backup server gets caught in a loop with "net retry". Eventually, I have to stop the execution and then resume the server. Has anyone seen this occur?

 

Have any solutions been found for the pitond shutdown besides the AppleScript?


Nate,

I am currently running version 6.1.107, which I believe to be the most current. I ran some further tests and spoke to Dantz regarding this issue. What we've concluded is that the client turns itself off when the AirPort connection is turned off (or put to sleep). I call this a BUG, as one does not keep a laptop running 24/7. Dantz is supposed to get back to me regarding whether it's a bug or not.

Regarding the NetRetry failure, that might have been an anomaly. Time will tell.

Thank you,

Ben


Dantz Knowledgebase...

 

TITLE: Net Retry error after upgrading to Retrospect 6.1

 

Discussion

 

Some users have reported NetRetry error messages when connecting to 6.1 or updating to the 6.1 version of the client software for Macintosh.

The 6.1 version of the Retrospect Client installer released prior to October 12, 2005 may experience a problem deleting the old retropds.log files on the client system, resulting in NetRetry errors. This issue has been fixed with the latest version of the client software available at http://www.dantz.com/updates

 

http://kb.dantz.com/display/2n/articleDirect/index.asp?aid=8119&r=0.2452661


  • 1 month later...

Any solutions to the client disabling itself? I've tried the AppleScript mentioned above and it does not work. Also, some of the workstations I'm seeing this on have no AirPort card or wireless network, so it can't be related to wireless networking. Problem machines range from PowerBooks to G5s running 10.3.9 and 10.4.4.

 

And regarding the Net Retry errors, this issue is still not resolved under the latest client (Oct 2005 release). I started a separate thread regarding it here: http://forums.dantz.com/ubbthreads/showflat.php?Cat=0&Number=66743&page=0&view=collapsed&sb=5&o=&fpart=1

 

In my opinion, this isn't a client issue, since where I've seen it crop up, it's been a user leaving the network and the Retrospect backup server not wanting to give up looking for it. Shouldn't Retrospect give up and move on if a client is not available after, say, 5 minutes? It used to under older versions.


Quote:

Have you ever moved the client application from its default location? That can cause this problem.

 


 

No, in all cases the client is left in its installed location /Applications. Any other suggestions?

 

I also set up the other suggestion, the speed threshold in preferences. That makes sense and I will see how it works.

 

Thanks for your help,

 

Shawn


Quote:

Also, some of the workstations I'm seeing this on have no AirPort card or wireless network, so it can't be related to wireless networking. Problem machines range from PowerBooks to G5s running 10.3.9 and 10.4.4.

 


 

By "this" do you mean the same logged crash that Alan reported in the first post of this thread:

 

1116545396: connAccept: close(socket) failed with error 9
1116545396: ServicePurge: service not found
1116545396: Assertion failure at pitond/object.c-477
1116545396: LogFlush: program exit(-1) called, flushing log file to disk

 

- Does pitond start correctly after a system restart?


Quote:

By "this" do you mean the same logged crash that Alan reported in the first post of this thread:

 


 

I'm not onsite with any of the clients where this happened (I'm a consultant who supports a number of different studios and workgroups), so I'll need to investigate further. However, I just noticed this morning that my client was turned off. I turned it back on, but that appears to wipe retroclient.log, so the next time it happens I will check the log before re-enabling the client.

 

BTW, in some of the cases where I've seen this happen, AirPort could not have been a factor because there is no card installed in the machine or no wireless network on site. In my case, my PowerBook has an AirPort card, but I need to check the log the next time my client turns itself off.

 

Thanks,

Shawn
