Problems with 1000's of small files

I have Retrospect 6.5 on my Win 2000 Server and I am using a Sony DDS4 8-tape autoloader to back up. I have a few directories that have 5,000 to 15,000 plain text files of about 1-2k each. I noticed that when Retrospect hits these directories, the tape stops writing for a few seconds, then starts, then stops, then starts, and so on.


Since these are plain text files, they compress down to almost nothing, so I imagine the tape keeps stopping because it is waiting until enough of the thousands of files have been compressed to add up to a size that triggers a write.


Unfortunately the stopping and starting of the tape unit is causing the unit to ask for a cleaning after every backup (45 gigs) and may also cause the unit to wear out quickly. I have seen many units wear out quickly when they are continuously stopping and starting like this and I don't want to damage this new expensive unit.


Is there some way to handle situations like this where the tape is stopping and starting a lot during the backup because of the thousands of small files?

The slowdown and the tape positioning happen because it takes a lot of time to read each of those tiny files. Compression isn't really a factor, especially if you are using hardware compression on the tape drive. You will find that large files back up much faster.


Reading the files from disk is the biggest bottleneck, so there isn't much you can do as a workaround. You could exclude those files from your regular backup and then back them up into a file backup set. From there you can transfer them to tape at a later time. I'm not sure if that will be much faster, though.
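If you want to see the per-file overhead for yourself, here is a rough sketch in Python (nothing to do with Retrospect itself). It reads a batch of tiny files and then one large file of the same total size and prints the throughput for each. The file names, counts, and sizes are made up for the test, and because the files have just been written they will mostly come out of the OS cache, so treat the numbers as a relative comparison only, not as what a backup would see on a real volume.

```python
import os
import tempfile
import time


def read_throughput(paths):
    """Read every file in full and return (total_bytes, seconds_elapsed)."""
    total = 0
    start = time.perf_counter()
    for path in paths:
        with open(path, "rb") as f:
            total += len(f.read())
    return total, time.perf_counter() - start


def make_small_files(directory, count, size):
    """Create `count` files of `size` bytes each and return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"note_{i:05d}.txt")  # hypothetical names
        with open(path, "wb") as f:
            f.write(b"x" * size)
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # 5000 files of 2 KB each, similar to the directories described above.
        small = make_small_files(tmp, 5000, 2048)

        # One file holding the same number of bytes.
        big = os.path.join(tmp, "big.bin")
        with open(big, "wb") as f:
            f.write(b"x" * (5000 * 2048))

        for label, paths in (("small files", small), ("one large file", [big])):
            nbytes, secs = read_throughput(paths)
            mb_per_min = nbytes / 1e6 / max(secs, 1e-9) * 60
            print(f"{label}: {mb_per_min:.0f} MB/min")
```

To test against real data, pass the paths of an existing directory's files to read_throughput instead of the generated ones.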



I'm experiencing similar behavior when transferring individual clients from a Disk Set onto a Tape Set. There are no bandwidth issues here, as the Disk Set is on a local RAID10 disk array. Even the random writes are above the 6MB/s threshold.


The tape stops and goes. Retrospect keeps cycling through the Please Wait, Updating destination, Updating Catalog File, Copying routine. And the transfer speed hovers at around 28.0MB/min, as compared to 400MB/min if the whole set is transferred at once.


I guess I could create intermediate Disk Set with just those clients, but why?


Dantz has to really optimize this...please.







Just a suggestion, as we do this ourselves:


I've set up a schedule to WinZip an entire directory of small text files into one zip file every night. Then, when the backup runs, I've set it to not back up that text file directory, but it does back up the zip file, which resides outside of the text file directory. Voila, no slowdown.


- Al
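For anyone without WinZip, the nightly zip step described above could be sketched in Python along these lines. This is only an illustration under assumptions, not the setup from the post: the function name and the command-line arguments are hypothetical, and you would still need to schedule it yourself (for example with the Windows task scheduler) so it runs before the backup. Keep the zip file outside the source directory, as in the post, so the archive does not end up containing itself.

```python
import os
import sys
import zipfile


def zip_directory(src_dir, zip_path):
    """Pack every file under src_dir into one compressed archive at zip_path.

    Returns the number of files added.
    """
    count = 0
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to src_dir so the archive unpacks cleanly.
                zf.write(full, arcname=os.path.relpath(full, src_dir))
                count += 1
    return count


if __name__ == "__main__":
    # Usage (hypothetical script name): python zip_small_files.py <source_dir> <zip_path>
    if len(sys.argv) == 3:
        added = zip_directory(sys.argv[1], sys.argv[2])
        print(f"Archived {added} files into {sys.argv[2]}")
    else:
        print("usage: zip_small_files.py <source_dir> <zip_path>")
```

The backup then only has to read one large file instead of thousands of tiny ones, at the cost of backing up the whole archive again whenever any text file changes.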


But that does not solve the problem.


I've seen it too. I've seen it slow from 387MB/min to 28-30 on a ton of small text files.


I also have a RAID disk device.


I also tried this scenario with Arcserve 11 (shudder), and the throughput on this directory stayed at 350MB/min.


So there is a real optimization problem with these small files.

