Re: [ProgSoc] algorithm question
On Wed, Jun 27, 2001 at 09:27:01PM +1000, jedd wrote:
> I have a collection of files that live under a single directory,
> with very many sub-directories, containing a dynamic collection
> (say around 100,000 files). Some are very big (2 GB), some are
> very small.
> I want to archive this collection, to a remote machine, via an
> smbclient or nfs mount. I want to do this archiving regularly.
> The box is a Solaris machine with 8 CPUs. My network connection
> is the bottleneck, but I'm also keen on ensuring the archive (as
> kept) is compressed at the other end.
> Therefore, I want to go through every file, pipe it through gzip, and
> output it straight onto the NFS mount (so I effectively end up with
> an exact replica, except everything on the target has a .gz
> extension). How I want to do it is by going through every directory,
> recursively, and forking the gzip-and-pipe processes, but only up to
> 8 processes running concurrently... and that's the bit that I have
> no idea how to control.
> Is there a way of doing this kind of thing in bash? If not, how
> would you do it in any [other] language anyway? I don't really
> want to spend a huge amount of time fiddling with this thing, but
> it seems to me that it can't be *that* uncommon a requirement - to
> launch N tasks on an N-way SMP box, and maintain that number
> of tasks.
Why not use rsync? It does recursion, rolling-checksum deltas (only
the changed parts of each file cross the wire), and archiving, and you
can get it to connect via ssh with compression turned on - or over NFS
if you so wish; see the sketch below.
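A minimal sketch, assuming the source tree is /data and the remote box
is called archive-host (both names hypothetical stand-ins):

    # Mirror /data to archive-host over ssh, compressing in transit.
    # -a = archive mode (recursion, permissions, times), -z = compress
    # on the wire, --delete = drop files that vanished at the source.
    rsync -az --delete -e ssh /data/ archive-host:/backup/data/

One caveat: -z only compresses in transit; the copy at the far end is
stored uncompressed, so this alone doesn't give you the
compressed-at-rest replica you described.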
As for running multiple copies at once, you could use xargs in
conjunction with find set at some appropriate depth, heck, or even
bash itself - sketches of both below.
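A rough sketch, under some loud assumptions: GNU findutils is
installed (-print0, -0 and -P are GNU extensions - Solaris' stock
xargs has no -P), and /data and /mnt/archive are hypothetical
stand-ins for the source tree and the mounted target:

    #!/bin/sh
    # Sketch only: mirror $SRC onto $DST with every file gzipped,
    # running at most 8 gzips at once.
    SRC=/data
    DST=/mnt/archive
    export SRC DST

    # Recreate the directory tree on the mount first
    # (assumes no newlines in directory names).
    (cd "$SRC" && find . -type d) | while read -r d; do
        mkdir -p "$DST/$d"
    done

    # -n 1 hands each filename to its own shell; -P 8 caps concurrency.
    (cd "$SRC" && find . -type f -print0) |
        xargs -0 -n 1 -P 8 sh -c 'gzip -c "$SRC/$1" > "$DST/$1.gz"' sh

And if you'd rather stay in plain bash, a cruder batch version: start
8 gzips in the background, then wait for the whole batch before
launching more (wait with no arguments blocks until every background
job finishes, so it's coarser than xargs -P, which refills slots as
they free up):

    i=0
    (cd "$SRC" && find . -type f) | while read -r f; do
        gzip -c "$SRC/$f" > "$DST/$f.gz" &
        i=$((i+1))
        if [ "$i" -ge 8 ]; then wait; i=0; fi
    done
    wait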