
Troubleshooting unattended backup


arnstein


I'm troubleshooting a problem where my computer runs out of memory and becomes unusable. This happens during backup jobs.

 

Last night I had three backup jobs scheduled:

1. Backup to tape, backup set 1.

2. Backup to (disk) file, backup set 2.

3. Backup to tape, backup set 3.

 

No one was logged in during these backup jobs.

 

Backup job 1 took a long time. As a result, all three backup jobs executed "back to back," with no dead time between executions.

 

Backup jobs 1 and 2 were successful. The tape drive did NOT eject the tape after backup job 1 completed. No surprise here.

 

Backup job 3 appears to have hung. Rather, the whole computer hung, with messages about exhausted memory in the Windows 2000 event log file. When I finally power cycled the computer in the morning, a Windows 2000 event log message was written that said that Retrospect.exe had prevented the computer from entering suspend state.

 

My specific question is what backup job 3 was doing. Consider that the wrong tape was in the tape drive when it began execution. No one was logged in at the time. In this scenario, would the Retrospect software attempt to present a dialog box on the computer screen, asking that the correct tape be inserted in the drive? Or would the software take some other action?

 

The environment is Windows 2000 service pack 4. 512 MBytes of DRAM. Pentium 4 Dell PC. Latest version of Retrospect 6.5 and device drivers downloaded from Dantz web site.

 

Thanks for any info on the behavior of Retrospect software in this scenario. This is a recurring problem for me and I'm keen to resolve it.


Hi David,

 

Retrospect will not put a media request window on the logon screen. You have to log in for that, and you have to have that particular execution unit highlighted in the activity monitor. Unless you set it to do otherwise, Retrospect just sits and waits for media if the correct tape is not present. You can set up email notification for media requests to avoid this.

 

What does the Retrospect log say about this third execution? What are the sources and destinations involved with the third script? How are they different from the first two?

 

Thanks

 

Nate

 

 


Quote:

natew said:

What does the Retrospect log say about this third execution? What are the sources and destinations involved with the third script? How are they different from the first two?

 


OK, here's a simpler example. Last night, I scheduled the following two backup jobs:

 

Backup job 1:

Scheduled at 3:00 AM. Source was several local disk partitions, plus system state. Destination was a (disk) file.

 

Backup job 2:

Scheduled at 4:00 AM. Source was same as backup job 1. Destination was an Ecrix VXA-1 SCSI tape drive. The tape drive was powered up, but no tape cartridge was placed in it. This is important.

 

Both backup jobs were Retrospect "normal" backups onto an existing backup set.

 

Backup job 1 was successful, and took about 30 minutes to complete.

 

I had some monitoring scripts running, so I was able to observe what happened next, at least partially:

 

The program launcher.exe ran from 3:00 to 6:00 AM continuously (same PID). At no time did its memory usage exceed 1.1 MBytes. No other process on the computer used any more memory than this, either.

 

The entire computer hung at around 6:00 AM. There was an event log message stating that system memory was getting dangerously low. I guess that my monitoring scripts failed to display the process that was hogging all that memory. Maybe it was a large number of separate processes, each using a small amount of memory? I know my scripts would have missed that scenario. I only recorded the three biggest users of memory, and only every 10 minutes.
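A minimal sketch of a snapshot summary that would not miss the "many small processes" case: alongside the top three memory users, it records the process count and the total across all processes. The function name, the `name:pid` labels, and the sample numbers are all hypothetical, and how the per-process figures are collected on the machine is not shown here.

```python
from typing import Dict


def summarize_snapshot(procs: Dict[str, int], top_n: int = 3) -> dict:
    """Summarize one monitoring sample.

    procs maps a process label (for example "name:pid") to its memory
    use in KB. Collecting those figures is outside this sketch.
    """
    # Largest consumers first; ties keep their original order.
    ranked = sorted(procs.items(), key=lambda kv: kv[1], reverse=True)
    return {
        "process_count": len(procs),      # catches a swarm of small processes
        "total_kb": sum(procs.values()),  # catches growth spread across many PIDs
        "top": ranked[:top_n],            # what the original scripts recorded
    }


# Hypothetical sample: 300 small processes plus one slightly larger one.
snapshot = {f"worker:{pid}": 900 for pid in range(100, 400)}
snapshot["launcher.exe:42"] = 1100
summary = summarize_snapshot(snapshot)
```

In this made-up sample no single process stands out, yet the total and the process count make the overall consumption visible. Sampling more often than every 10 minutes would also narrow down when the growth starts.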

 

Retrospect was unable to write a log file for backup job 2. However, the Retrospect activity log "knew" that the job ran.

 

Perhaps two or more Retrospect programs tried to communicate with each other via TCP/IP or UDP/IP and failed? But I checked my software firewall log (Norton Internet Security 2003) and there were no complaints therein.

 

The above two backup jobs are scheduled every night and if a tape is in the Ecrix drive, everything works fine. Also, if the tape drive is powered off, Retrospect detects this problem and exits immediately, rather than hanging around as in the above case.

 

The computer is a new Dell with a Pentium 4, 512 MBytes of DRAM, 1 GByte of swap space. Windows 2000 service pack 4. The latest version of Retrospect Professional, and the device drivers, both downloaded from the Dantz web site.

 

Any other ideas?


Hi

 

 

 

Thanks for the details.

 

 

 

I suspect we are running into a "SCSI voodoo" issue. Can you test and see if the problem only happens when trying to back up to the tape? To do this, set up another disk backup set but don't give it any members to back up to. Retrospect should sit there in the same media request state.

 

 

 

You can also set the media request timeout to 20 minutes so that Retrospect is forced to move on to the next job.

 

 

 

One thing: make sure you have disabled the tape drive in Device Manager. Retrospect has its own drivers.

 

 

 

Nate


Quote:

natew said:

One thing: make sure you have disabled the tape drive in Device Manager. Retrospect has its own drivers.

 


I did that. In Device Manager, the tape drive is labeled with a red "X" icon now.

 

Here's another data point. Last night, I did a normal backup onto my tape drive. Once again, my computer was hung in the morning (no one was logged in during the night).

 

This much I know:

 

1. The backup job reached the physical end of the tape cartridge. The tape cartridge was ejected from the drive.

 

2. After I rebooted the computer, I launched Retrospect. I looked at the properties of the (tape) backup set. I was able to browse the backup session that failed, and I could see that the last partition in the source set was only partially backed up.

 

3. The "history" tab of the Retrospect activity monitor listed the failed backup job. However, there was no log file available for the backup job.

 

4. The backup job completed at about 4:30 AM. The computer hung at around 6:10 AM.

 

Perhaps the Retrospect software has a problem on my machine when it is checking the tape drive? I expect that it would be doing this periodically, to see if someone had inserted a blank tape cartridge.

 

Perhaps the Retrospect software has a problem on my machine when it is checking to see if a user has logged on? I expect that it would be doing this periodically, so that it could ask the user to insert a blank tape cartridge.

 

I'll work on your suggestion regarding a non-tape backup job soon.

 

Any other suggestions from Retrospect? Thanks for your help.


Hi

 

So to narrow this down, it looks like the machine hangs after a tape gets full and Retrospect waits a while for media. Does that sound right?

 

I don't think there is a lot of back and forth communication between the drive and Retrospect in that situation. Retrospect should just sit and wait until it gets a command from the tape drive that a tape has been inserted. There are ways to prove this but I think you should try the disk backup set idea first.

 

Nate


Quote:

natew said:
Can you test and see if the problem only happens when trying to back up to the tape? To do this, set up another disk backup set but don't give it any members to back up to. Retrospect should sit there in the same media request state.

 

One thing: make sure you have disabled the tape drive in Device Manager. Retrospect has its own drivers.

 


I set up a backup job whose destination is CD/DVD. I did not have any media in any drive. I scheduled the backup job to run while no one was logged in. Eight hours later, the computer was in such a state that I couldn't log in to it. I had to reboot it.

 

Is this the test that you wanted me to run?


Hi

 

Yes, that is exactly it. So we can rule out device issues; this is purely a memory management problem (as if we didn't know that already).

 

There are two files that, when corrupt, can affect memory usage in Retrospect. They are both in Documents and Settings/All Users/Application Data (hidden)/Retrospect/: config6x.dat and config6x.bak. I would move those files somewhere else on the machine and try the backup again. You will need to set up your script again, but you can use your existing backup sets if you like.
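If you would rather script the move than do it by hand, here is a minimal sketch. The function name and both directory arguments are hypothetical; only the two file names come from this thread. It assumes Retrospect is not running while the files are moved, which is an assumption rather than documented behavior.

```python
import shutil
from pathlib import Path
from typing import List

# The two configuration files named in this thread.
CONFIG_NAMES = ("config6x.dat", "config6x.bak")


def set_aside_config(config_dir: str, holding_dir: str) -> List[str]:
    """Move the Retrospect config files out of config_dir.

    Returns the names of the files that were actually moved. Files that
    do not exist are skipped, so running this twice is harmless.
    """
    src = Path(config_dir)
    dst = Path(holding_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the holding folder if needed
    moved = []
    for name in CONFIG_NAMES:
        candidate = src / name
        if candidate.exists():
            shutil.move(str(candidate), str(dst / name))
            moved.append(name)
    return moved
```

Call it with the full path of the Retrospect folder described above and any folder elsewhere on the machine. Moving the files back into place undoes the change.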

 

The other possibility is that you have a flaky RAM chip. Retrospect can be very memory intensive, so it could be filling up your RAM more than you normally would, exposing a section of bad RAM that is not accessed very often. I would download a RAM test and see.

 

Nate


Quote:

natew said:
There are two files that, when corrupt, can affect memory usage in Retrospect. They are both in Documents and Settings/All Users/Application Data (hidden)/Retrospect/: config6x.dat and config6x.bak. I would move those files somewhere else on the machine and try the backup again. You will need to set up your script again, but you can use your existing backup sets if you like.

 

The other possibility is that you have a flaky RAM chip. Retrospect can be very memory intensive, so it could be filling up your RAM more than you normally would, exposing a section of bad RAM that is not accessed very often. I would download a RAM test and see.

 


 

OK, I'll work on this. In fact, I already have RAM test software to use.

 

I must say, this is looking more like a Retrospect bug than either of the two possibilities you mention. Yes, my config6x.dat file might be corrupted. Yes, my DRAM might be broken. But I'm having this problem when Retrospect is NOT using a lot of memory. It is not backing up or restoring data, after all.

 

I think that the most likely explanation is that during the failure mode, the Retrospect software is repeating an operation every few seconds (or minutes?). It is either testing to see if someone has logged in, or it is simply trying (and failing) to present a dialog box.

 

As soon as someone logs in to the computer, the dialog box appears instantly. So I know that the software is doing this.

 

In either case, I believe that the Retrospect software is failing to release some resource back to the operating system. Most likely this is a Windows "handle" of some sort. In fact, this resource leak may be due to a flaw in a Windows API, rather than any of your programming.
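One way to test the handle-leak idea is to record a process's handle count at intervals while the media request is pending and check whether it only ever climbs. The sketch below covers just that check. The function name, the threshold, and the sample numbers are hypothetical, and gathering the counts (for example by reading them off a process monitoring tool) is not shown.

```python
from typing import Sequence


def looks_like_leak(samples: Sequence[int], min_growth: int = 100) -> bool:
    """Return True if the counts never decrease and grow by at least
    min_growth between the first and last sample.

    A steady climb with no releases is consistent with a leak, although
    it does not prove one on its own.
    """
    if len(samples) < 2:
        return False  # not enough data to see a trend
    never_drops = all(later >= earlier
                      for earlier, later in zip(samples, samples[1:]))
    return never_drops and (samples[-1] - samples[0]) >= min_growth


# Hypothetical readings taken a few minutes apart.
climbing = [500, 620, 750, 900]
steady = [500, 480, 510, 495]
```

With these made-up readings, `looks_like_leak(climbing)` is true and `looks_like_leak(steady)` is false. A climbing count that resets when someone logs in would point at the dialog-box retry described above.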

 

Consider this a bug report. I request that Dantz investigate this. If there is a more formal way for me to submit a bug report, please advise.

 

Thank you!


Hi

 

 

 

I look forward to hearing your results. The only problem I have with the "Retrospect bug theory" is the fact that no one else seems to be having the same problem. Why would it fail in just your environment?

 

 

 

I just thought of something... Have you adjusted the Retrospect security settings to run as a specified user or is it still running as the "logged in user"? It is a long shot but try telling Retrospect to run as the local Admin account.

 

 

 

Thanks

 

Nate


  • 2 months later...

I tested 6.0 Professional on a Windows 2000 Professional machine about a year ago, ran across this same memory leak, and was unable to resolve it. It would crash the computer after a while even when no scripts were running, with just the Retrospect process running. I have an IDE hard drive and an HP SCSI Travan tape drive. Recently I came into the office on Monday morning and found the computer out of memory again, with a message that my trial had expired. There were no scripts running and no tapes in the drive.


Archived

This topic is now archived and is closed to further replies.
