
Better Instant Scan Implementation for Windows



The Instant Scan FAQ (http://www.retrospect.com/en/support/kb/retrospect-instascan-faq) does a good job of explaining how Instant Scan is implemented and describes some of its problems and limitations. 

 

The main problems described are:

1. RetroISA can impose a high CPU load.  Even after the initial scan, the service's CPU use can spike, degrading the performance of other applications.

 

2. Recently modified files may be missed from a backup if the service hasn't gotten around to scanning them. 

 

Additionally, I've had the RetroISA service block a chkdsk /f because it had an open handle on the volume.  I wouldn't be surprised if it also interfered with "safe removal" of an external drive.

 

I don't know about the Macintosh platform, but I think this can be done better on Windows. 

 

The FAQ indicates that Instant Scan only works on NTFS formatted drives in Windows because it makes use of the USN journal.  Since the USN journal is a requirement, doing all this background scanning is largely unnecessary.

 

Background: The USN journal maintains a list of changed, added, deleted and moved files.  The old way of doing an incremental backup (in general for any backup application) was to scan the entire disk and compare against the backup catalog looking for changes.  The USN journal removed this need.  Instead, a backup application only needs to check the timestamp of the last backup, then read the USN journal from that time on.  Only if the USN journal wrapped does a full scan need to be done. 
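
To illustrate the background above, here is a minimal sketch of the incremental logic against a simulated journal. It is not Windows API code: `Journal`, `Checkpoint` and `changed_since` are hypothetical names, and the sketch tracks position by USN number plus a journal ID rather than by timestamp, which is an assumption on my part. On Windows the corresponding values would come from FSCTL_QUERY_USN_JOURNAL and the records from FSCTL_READ_USN_JOURNAL.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class UsnRecord:
    usn: int    # position of this record in the journal
    path: str   # file the change applies to


@dataclass
class Journal:
    """Simulated change journal; a stand-in, not the Windows API."""
    journal_id: int           # changes if the journal is deleted and recreated
    first_usn: int            # oldest USN still available (older ones wrapped)
    next_usn: int             # USN the next record will receive
    records: List[UsnRecord]


@dataclass
class Checkpoint:
    journal_id: int
    next_usn: int             # where the previous backup stopped reading


def changed_since(journal: Journal,
                  cp: Optional[Checkpoint]) -> Tuple[Optional[Set[str]], Checkpoint]:
    """Return (changed paths, new checkpoint).

    The paths value is None when a full scan is required: first backup,
    journal recreated, or journal wrapped past the old checkpoint.
    """
    new_cp = Checkpoint(journal.journal_id, journal.next_usn)
    if (cp is None
            or cp.journal_id != journal.journal_id
            or cp.next_usn < journal.first_usn):
        return None, new_cp
    paths = {r.path for r in journal.records if r.usn >= cp.next_usn}
    return paths, new_cp
```

The first backup passes `None` as the checkpoint and does a full scan; each later backup passes the checkpoint saved by the previous one and only looks at the returned paths.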

 

I've done tests with a different backup product on a 6 TB disk that had millions of files. 

Time to scan without USN journal: 2 hours

Time to scan with USN journal: 5 minutes

 

This other product does not do pre-scanning like Retrospect's Instant Scan.  My point is that just using the USN journal as it was designed offers a huge benefit and you avoid the problems that Instant Scan currently imposes. 

 

However, I can see some benefit to doing a pre-scan.  In the case of the very first backup, a full scan would likely be required without Instant Scan.

 

So, my recommendations for Instant Scan improvement on the Windows platform are as follows:

 

Keep the RetroISA service, but offer different levels of Instant Scan performance options.

 

1. Normal: After detection of a new fixed disk, RetroISA should do a full scan, saving the cache data as it does now.  In this mode, RetroISA should not do any further background scanning of the volume after the initial scan.  This immediately eliminates the problem of RetroISA degrading system performance.  It also reduces the time during which RetroISA can interfere with chkdsk /f or device removal (but see below for a discussion of how to fix this completely).  Then, when a backup runs, the main Retrospect application should activate RetroISA, asking it to perform a refresh (and wait for it to complete).  This would still be very quick in most circumstances, and it would avoid the problem of newly changed files not being included in the backup.  This mode would provide a nice balance.

 

2. Minimal: RetroISA does no scanning on its own.  Scanning only occurs when a backup runs.  This would be closest to the basic implementation of backing up with the USN journal.  When the first full backup runs, a full scan should be done on the volume.  For all subsequent backups, the backup should ask RetroISA to refresh the cache using the USN journal and wait for the results.  There is no background scanning in this mode.  There is no pre-scan in this mode.

 

3. Disabled: For those users that don't want the Instant Scan cache files on their systems.

 

4. Aggressive: Instant Scan operates as it does now, with periodic background scanning.  This would eliminate even the 5 minute wait in my previous example.  One caveat is that Instant Scan should be a better citizen when doing background scanning.  (It should also behave this way during the initial background scan in Normal mode.)
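
The four modes above could be summarized roughly as follows. This is only a sketch of the proposal, not anything Retrospect ships: `ScanMode`, the two functions and the action names are hypothetical, and falling back to a full scan when no cache exists yet is my assumption.

```python
from enum import Enum
from typing import List


class ScanMode(Enum):
    NORMAL = "normal"
    MINIMAL = "minimal"
    DISABLED = "disabled"
    AGGRESSIVE = "aggressive"


def on_new_volume(mode: ScanMode) -> List[str]:
    """Work the service would start by itself when a new fixed disk appears."""
    if mode is ScanMode.NORMAL:
        return ["full_scan"]                       # one pre-scan, then stay idle
    if mode is ScanMode.AGGRESSIVE:
        return ["full_scan", "periodic_background_scan"]
    return []                                      # MINIMAL, DISABLED: nothing


def on_backup_start(mode: ScanMode, cache_exists: bool) -> List[str]:
    """Work the main application would request when a backup begins."""
    if mode is ScanMode.DISABLED:
        return ["full_scan_without_cache"]         # no cache files are kept
    if not cache_exists:
        return ["full_scan"]                       # assumed fallback
    if mode is ScanMode.AGGRESSIVE:
        return ["use_cache"]                       # current behavior, no wait
    # NORMAL and MINIMAL: refresh from the USN journal, wait, then use the cache
    return ["refresh_from_usn_journal", "use_cache"]
```

For example, `on_backup_start(ScanMode.MINIMAL, cache_exists=False)` gives the full scan of the first backup, and every later backup in that mode only refreshes from the journal.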

 

What do I mean by being a good citizen?  If the user wants to run a chkdsk /f, get out of the way.  If the user wants to remove an external device, get out of the way.  The WM_DEVICECHANGE message can be used to know when one of these events is occurring.  An application calls RegisterDeviceNotification in order to receive it; a service that registers with its service status handle receives the same events as SERVICE_CONTROL_DEVICEEVENT in its HandlerEx function.  Microsoft provides an example of using WM_DEVICECHANGE to handle the case of an external device being removed.

 

https://msdn.microsoft.com/en-us/library/windows/desktop/aa363427%28v=vs.85%29.aspx

 

For the chkdsk /f case, things are slightly different.  WM_DEVICECHANGE is still used, but the event that is sent is DBT_CUSTOMEVENT with an event GUID of GUID_IO_VOLUME_LOCK.  I couldn't find an online sample of this, but I did write sample code myself, as a test, before writing this post.  I would be happy to provide this code to you.
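
As an illustration only (this is not the sample code mentioned above), the decision a handler has to make for these events can be sketched as below. The DBT_* values are the ones defined in dbt.h. The GUIDs are represented by their symbolic names as plain strings; real code would compare the dbch_eventguid field of the DEV_BROADCAST_HANDLE structure against the constants from ioevent.h. `Action` and `decide` are hypothetical names, and resuming on an unlock or a failed lock is my assumption about sensible behavior.

```python
from enum import Enum
from typing import Optional

# Event types from dbt.h
DBT_DEVICEQUERYREMOVE = 0x8001
DBT_DEVICEQUERYREMOVEFAILED = 0x8002
DBT_DEVICEREMOVECOMPLETE = 0x8004
DBT_CUSTOMEVENT = 0x8006

# Symbolic stand-ins for the GUID constants declared in ioevent.h
GUID_IO_VOLUME_LOCK = "GUID_IO_VOLUME_LOCK"
GUID_IO_VOLUME_LOCK_FAILED = "GUID_IO_VOLUME_LOCK_FAILED"
GUID_IO_VOLUME_UNLOCK = "GUID_IO_VOLUME_UNLOCK"


class Action(Enum):
    RELEASE_HANDLES = "release"   # stop scanning, close every handle on the volume
    RESUME = "resume"             # reopen the volume and continue scanning
    IGNORE = "ignore"


def decide(event_type: int, event_guid: Optional[str] = None) -> Action:
    """Map a device event to what the scanning service should do."""
    if event_type in (DBT_DEVICEQUERYREMOVE, DBT_DEVICEREMOVECOMPLETE):
        return Action.RELEASE_HANDLES             # safe removal or unplug
    if event_type == DBT_DEVICEQUERYREMOVEFAILED:
        return Action.RESUME                      # removal was cancelled
    if event_type == DBT_CUSTOMEVENT:
        if event_guid == GUID_IO_VOLUME_LOCK:
            return Action.RELEASE_HANDLES         # e.g. chkdsk /f wants the volume
        if event_guid in (GUID_IO_VOLUME_LOCK_FAILED, GUID_IO_VOLUME_UNLOCK):
            return Action.RESUME
    return Action.IGNORE
```

The point is that the service reacts to the lock or removal request by closing its handles first, and only reopens the volume once it has been told the operation finished or was abandoned.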

 

I know I'm not a Retrospect product manager, but I hope you'll give consideration to making these improvements to Instant Scan.  Right now, Instant Scan is a step toward a good goal, but there's no reason why the problems it currently has need to exist.  If you don't want to fully implement all suggested modes, then please just consider implementing either the "Normal" or "Minimal" mode suggestions.

 

Thanks.

 

Link to comment
Share on other sites

The USN journal removed this need.  Instead, a backup application only needs to check the timestamp of the last backup, then read the USN journal from that time on.

 

What if you have two backup sets (that inevitably have different backup dates)? Don't you miss a lot of files for the second backup (to the other backup set)?

Link to comment
Share on other sites

Thanks very much for your thoughtful suggestions! They're spot on, both in terms of the current issues with Instant Scan and how it can be improved. We'll include this in our discussions on the next steps for the feature. Thanks again!


 


@Thelander Good point. We've discussed storing a USN journal timestamp per set as a way around this.


Link to comment
Share on other sites

What if you have two backup sets (that inevitably have different backup dates)? Don't you miss a lot of files for the second backup (to the other backup set)?

 

No that wouldn't be a problem.  My original example was simplified to one backup set, but the USN journal works fine with multiple backup sets. 

 

The USN journal is just a timestamped list of file changes.  While it does roll over when it reaches its maximum size, it never gets cleared by a backup.  So, if you run a backup today to one backup set, the USN journal will still have the necessary information to allow an incremental to be run on an older backup set without missing anything.
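
A small sketch of why multiple backup sets work, assuming each set remembers its own position in the journal (the per-set idea mentioned earlier in the thread). The journal is simulated as a list of (usn, path) pairs, `incremental_for_set` is a hypothetical helper, and rollover handling is left out to keep it short.

```python
from typing import Dict, List, Set, Tuple


def incremental_for_set(records: List[Tuple[int, str]],
                        checkpoints: Dict[str, int],
                        set_name: str,
                        next_usn: int) -> Set[str]:
    """Return the paths changed since this set's last backup.

    Only this set's checkpoint is advanced; reading the journal never
    modifies it, so other sets are unaffected.
    """
    start = checkpoints.get(set_name, 0)
    changed = {path for usn, path in records if usn >= start}
    checkpoints[set_name] = next_usn
    return changed


# Both sets were last backed up when the journal was at USN 100.
journal = [(100, "report.doc"), (200, "photo.jpg")]
checkpoints = {"Set A": 100, "Set B": 100}

changed_a = incremental_for_set(journal, checkpoints, "Set A", next_usn=300)
changed_b = incremental_for_set(journal, checkpoints, "Set B", next_usn=300)
```

Backing up to Set A first does not take anything away from Set B: both calls see `report.doc` and `photo.jpg`, because the records stay in the journal and each set keeps its own starting point.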

Link to comment
Share on other sites

The USN journal is just a timestamped list of file changes.

For systems that dual boot with a *nix OS, does anyone know to what extent the USN Journal is supported by the *nix OS? I have only done a quick search on this but could not find anything definitive.

 

I have two systems that dual boot, one to Linux Mint 17.1 and one to PCBSD 10.1, where the backup is done under Windows but some data processing is done under the other OS. If the *nix is not updating the USN Journal then reliance on this could be problematic.

Link to comment
Share on other sites

I have two systems that dual boot, one to Linux Mint 17.1 and one to PCBSD 10.1, where the backup is done under Windows but some data processing is done under the other OS. If the *nix is not updating the USN Journal then reliance on this could be problematic.

 

The USN Journal is already required for Instant Scan on Windows.  If the *nix operating systems are not playing nice with it, I'd expect problems to occur with the existing implementation.

Link to comment
Share on other sites
