Subtleties of Snapshots, Sessions & Selectors

I recently had Retrospect Support clarify some subtleties of how Retrospect works under the skin with respect to partial backups that exclude data using selectors.

In particular, this concerns Retrospect's apparent behaviour of restoring material that was not backed up.


Support's comments are highlighted in red italics.


Snapshot, Session, Selector Restore issue.



    You have a full backup of a volume in BUSet-Full.

    You take a backup of the same volume, excluding some data with a
    selector, to BUSet-Partial.



    For BUSet-Partial, will the snapshot reflect the complete volume or
    the partial backup according to the selector? {I suspect it will be
    the full volume}

    [color:red]The snapshot for the partial backup will show all of the folders on the drive, even the ones that have been excluded (there will be no data in the folders), and if you do a full volume restore it will restore those folders.[/color]


    Confirm that the session will contain only the partial data. {I
    suspect it will}

[color:red]The session data will show only what was copied for a given date.[/color]


    If the snapshot reflects the full volume, and the session contains
    only the partial data, and I do a restore based on the snapshot in
    BUSet-Partial, will Retrospect attempt to find the files missing
    from the session in BUSet-Partial by looking in BUSet-Full?
    {I suspect it does}

[color:red]It will not try to restore the missing data from BUSet-Full; it will restore the empty folders from the snapshot (this only happens if you do a full volume restore).[/color]
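The distinction Support draws here can be pictured in a few lines: the snapshot records the full folder tree, the session records only the files actually copied, and a full-volume restore recreates every folder from the snapshot but can only restore files whose data is in the set. A minimal Python sketch of that behaviour (all structures and names are hypothetical illustrations, not Retrospect internals):

```python
# Hypothetical model of the snapshot/session behaviour described above.

def full_volume_restore(snapshot_folders, session_files):
    """Recreate every folder listed in the snapshot, but only restore
    files whose data was actually copied into the session."""
    restored_folders = set(snapshot_folders)          # all folders, even excluded ones
    restored_files = dict(session_files)              # partial data only
    return restored_folders, restored_files

# The snapshot taken during the partial backup still lists every folder:
snapshot_folders = {"/Docs", "/Docs/Work", "/Media", "/Media/Video"}
# ...but the session only holds the files the selector allowed through:
session_files = {"/Docs/Work/report.txt": b"report data"}

folders, files = full_volume_restore(snapshot_folders, session_files)
# /Media/Video comes back as an empty folder; its files are not in the set.
print("/Media/Video" in folders)   # True
print(len(files))                  # 1
```

The excluded folders reappear on restore purely because the snapshot, not the session, drives a full-volume restore.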


    When I run the partial backup, it appears that Retrospect scans the
    entire volume first, THEN applies the selector to copy the partial
    data to the session. Is this the case? {I suspect it is}

[color:red]This is the expected behavior; it has to do a full scan of the drive first to see what is on the drive before it can apply the includes or excludes of the selector.[/color]
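The consequence of this two-phase approach is that scan time depends only on the total number of files on the volume, not on what the selector keeps. A rough Python sketch (hypothetical; `os.walk` stands in for Retrospect's scanner):

```python
import os
import tempfile

def scan_then_select(root, exclude):
    """Phase 1: enumerate every file on the volume (the slow part; its
    cost grows with total file count, regardless of the selector).
    Phase 2: apply the selector to the complete list (cheap by comparison)."""
    all_files = []
    for dirpath, _dirs, filenames in os.walk(root):   # touches everything
        for name in filenames:
            all_files.append(os.path.join(dirpath, name))
    selected = [f for f in all_files if not exclude(f)]
    return all_files, selected

# Demo: even when the selector excludes almost everything,
# the scan still visits every file.
root = tempfile.mkdtemp()
for i in range(100):
    open(os.path.join(root, "file%d.log" % i), "w").close()
open(os.path.join(root, "keep.txt"), "w").close()

scanned, kept = scan_then_select(root, lambda f: f.endswith(".log"))
print(len(scanned), len(kept))   # 101 1
```

This is why excluding 95% of a 2-million-file server with a selector does nothing to shorten the 24-hour scan.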


    If so, is there a way of preventing this?
    It is particularly a problem in the following scenario:
    I have a poorly performing server with 2 million files; a full scan
    takes over 24 hours.
    I want to back up a small portion of the data, so I set up a selector
    to exclude most of the data. When I run the backup, the scan STILL
    takes 24 hours, and then the actual file copy is quick for the small
    amount of data selected.

[color:red]Unfortunately there is no way of preventing this. I can put in a feature request to improve the selector and scanning process. I cannot guarantee anything, however, as that would require a major change in how the program scans the drive and determines what to exclude or include.[/color]



We've asked before for a change to stop these full scans from being necessary, e.g. an indexing service which keeps tabs on folder names.


We have the same issue with our medium-powered laptops. We do software development, so quite often there are lots of small files and folders on the machine, yet NOT within a selector.


A proactive backup jumps on a laptop, hammers it with a full scan, applies the selectors, backs up the data and then builds a snapshot (which also hammers the laptop). The backup priority can be controlled by the slider in the client, yet it doesn't seem to apply during the scan/snapshot phase, which can make the laptop unusable.


We also regularly see proactive backups of just 10MB in size taking an hour or so: 2 minutes backing up, 58 minutes scanning/thrashing.



17/03/2010 08:03:27: Copying SW_Preload (C:) on T**y ****

17/03/2010 09:10:15: Snapshot stored, 336.8 MB

17/03/2010 09:11:15: Execution completed successfully

Completed: 3041 files, 55.5 MB, with 56% compression

Performance: 26.8 MB/minute

Duration: 01:07:47 (01:05:42 idle/loading/preparing)


I've never heard of the term "partial backup". Is this a normal backup in RS terms? If so, the way I visualise it working is this:


- Machine is added to RS

- Machine is backed up with all content chosen in the selector

- On the next backup only changed data is copied; unfortunately, that includes large PST files that may have only one additional email

- The next backup is also just changed data, and so on


When you come to restore the data, you choose the snapshot of the data you wish to restore, remembering you may have multiple snapshots per client (to go back through file version history). RS will then scan the catalogue file and rebuild a list of all the files that need to be restored to take the volume back to the point in time when the snapshot was taken.
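That catalogue walk can be pictured as picking, for every file, the newest version at or before the chosen snapshot's time. A hypothetical Python sketch (the catalogue entries and version labels are invented for illustration):

```python
def build_restore_list(catalogue, snapshot_time):
    """From a catalogue of (path, backup_time, version) entries, pick the
    newest version of each file at or before snapshot_time."""
    latest = {}
    for path, when, version in catalogue:
        if when <= snapshot_time:
            # Keep this entry only if it is newer than what we have so far.
            if path not in latest or when > latest[path][0]:
                latest[path] = (when, version)
    return {path: version for path, (when, version) in latest.items()}

catalogue = [
    ("report.doc", 1, "v1"),
    ("report.doc", 2, "v2"),   # changed, backed up again
    ("report.doc", 3, "v3"),   # changed after our chosen snapshot
    ("notes.txt",  1, "v1"),
]
# Restoring from the snapshot taken at time 2:
print(build_restore_list(catalogue, snapshot_time=2))
# {'report.doc': 'v2', 'notes.txt': 'v1'}
```

Each file comes back exactly as it stood at snapshot time, even though the set also holds older and newer versions.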


Hit restore and all the data is copied back in place (or just choose files/folders, or even a different destination).


My apologies if this doesn't make much sense... my brain's hurting a bit today...




"Partial backup" was just a phrase I coined to refer to a backup of a volume with selectors excluding part(s) of the volume, leaving a partial backup of the volume.



You mentioned the issue of backing up large PSTs with only a few changed mails. Have you considered the Retrospect Continuous Backup add-on? It uses a block-level approach which avoids that problem.


See this post http://forums.dantz.com/showtopic.php?tid/27252/pid/106677/post/last/m/1/


Are you sure it uses a block-level backup approach? We've been told the Retrospect guys are still working on this for normal backups, so I'd find it unlikely that Continuous Backup does.


i.e. those specific words aren't used in the linked post, so when terms like 'only changed data' are used, I would assume it means only changed files are copied.


Also, Continuous Backup doesn't seem to be listed as an add-on here: http://www.retrospect.com/products/software/retroforwin/addons/


Maybe it's been dropped by Retrospect, or possibly taken over by the NetWorker backup solution?


Can anyone confirm this?




Edit: from reading this, I may well be wrong...




Strange how common block-level or delta backup terms aren't used, and how, if they have this technology, they haven't managed to integrate it with Retrospect Backup yet (i.e. as a paid add-on outside of Continuous).


Reading more into it, it looks like a separate product they've integrated with, as there is a client and server component (i.e. not the Retrospect client). Funnily, it's not listed as a 7.7 download, only a 7.6 one.




Yes, it is pretty much a separate product, and it seems to have its own version stream.


I have deployed it at a customer site specifically to address a PST backup problem, and it works fine.

(They use an external POP mail service, so their only local copy of mail is in the local PSTs on their laptops and desktops.)

It is actually a very cute product.

With the Workstation version, each user manages his own setup. The Server version implements a centralised management and policy service, where the configs are specified and deployed centrally, and user interaction can be controlled from nothing to full.




Further, not only is it block level, it is continuous. This means that each changed block is backed up as soon as it changes. There is no schedule, no "scan", no "hammering the drive", and thus no performance impact (well, very little).
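The advantage over a file-level approach can be sketched as: split the file into fixed-size blocks, hash each one, and copy only blocks whose hash changed. Appending one email to a large PST then touches one or two blocks rather than forcing a copy of the whole file. A hypothetical Python sketch (the block size and scheme are assumptions for illustration, not Retrospect's documented design):

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for the sketch

def changed_blocks(data, known_hashes):
    """Return (index, block) for every block whose hash differs from the
    previously recorded hash; only these blocks need copying."""
    changed = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        idx = i // BLOCK_SIZE
        if known_hashes.get(idx) != digest:
            changed.append((idx, block))
            known_hashes[idx] = digest
    return changed

# A 1 MB "PST": the first pass copies every block...
pst = bytearray(1024 * 1024)
hashes = {}
first = changed_blocks(bytes(pst), hashes)
print(len(first))    # 256 blocks copied

# ...appending a small "email" touches only the tail block.
pst += b"new mail"
second = changed_blocks(bytes(pst), hashes)
print(len(second))   # 1 block copied
```

The same append under file-level incremental backup would recopy the entire PST, which is exactly the complaint earlier in the thread.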


Yes, the documentation is a little sketchy; try searching the Retrospect KB for "continuous", as there are a couple of white papers in there. Better yet, download the Workstation version and give it a go; there's a free trial period of 30 days.


It's just a shame RS couldn't integrate this into the backup tool; it sounds like it could be a useful thing to have and would bring them back into the 21st century.




When you say that a full scan of your 2 million files is taking 24 hours to complete, is that the actual scanning, or the scanning plus the "matching" which RS does? When I do a normal backup of my backup set that has 1.9 million files, the scanning and matching take a little over an hour if I have "match only files in the same location" turned OFF (which is the default). If I run that normal backup on the same backup set with "match only files in the same location" turned ON, the backup can take up to 7 hours to complete. I would say 90% of that time is spent "matching".
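The semantic difference behind that option can be illustrated simply: without location matching, a file matches an already-backed-up copy on attributes like name, size and date, regardless of path; with it on, the path must match too, so a moved file counts as new. A simplified Python sketch (the matching criteria here are assumptions for illustration, not Retrospect's exact rules):

```python
def already_backed_up(candidate, backed_up, match_location=False):
    """Decide whether a file still needs copying. With match_location=True
    the path must also match, so a moved file is treated as new."""
    for old in backed_up:
        same = (old["name"] == candidate["name"]
                and old["size"] == candidate["size"]
                and old["mtime"] == candidate["mtime"])
        if match_location:
            same = same and old["path"] == candidate["path"]
        if same:
            return True
    return False

backed_up = [{"name": "a.txt", "size": 10, "mtime": 100, "path": "/old/a.txt"}]
moved = {"name": "a.txt", "size": 10, "mtime": 100, "path": "/new/a.txt"}

print(already_backed_up(moved, backed_up))                       # True
print(already_backed_up(moved, backed_up, match_location=True))  # False
```

The extra path comparison (and the recopying it triggers for moved files) is one plausible reason the ON setting costs so much more time over millions of files.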


So I'm asking how you have that "match only files in the same location" parameter set.


Other questions would be: why is that server's performance so bad? Hard drive speed? Network bandwidth? CPU? Fragmentation?


That backup scanning seems abnormally slow.


This was back in December last year, so I cannot remember exactly whether it was the scanning or the matching. It was probably the scanning.


The file system in question is pretty dense (a large number of files on a small volume), which is a challenge for any backup. But yes, I know the performance was awful, and we eventually discovered that the problem was the motherboard's onboard RAID controller. Once we {eventually, after 5 days} got a full backup of the server, we rebuilt it with an Adaptec RAID card, which improved the RAID performance by about 7 times. Since then we have divided the volume up into 6 subvolumes, and with the improved performance backups are now going fine.


That said, at the time the issues highlighted the way Retrospect works under the skin, and understanding that allows you to design and tune your config in a better way. So I thought I would share my findings.

