Reputation: 1
I have about 15GB of assorted files in a directory that need to be compressed into a batch of zipped files, each of which has to be smaller than 500MB. I've got a pretty simple Bash script that takes a maximum number of files per batch and compresses each batch into an archive; it works fine, but since the individual files vary widely in size, I'm ending up with far more archives than I'd like.
Is there a relatively efficient Bash script to calculate the number of files to be compressed into each archive, based on a set maximum archive size?
A script that totals the sizes of the files in the "latest batch" and sends that block of files off for compression once a preset total is reached would also work, if you've got one; but now I'm curious whether there's an optimal solution.
Here's the script I've got, with comments:
#!/bin/bash
# Set the directory path to the folder containing the files to be compressed
DIR_PATH="/path"
# Set the name prefix of the output archive files
ARCHIVE_PREFIX="archive"
# Set the maximum number of files per batch
MAX_FILES=1000
# Change directory to the specified path
cd "$DIR_PATH"
# Get a list of all files in the directory
files=( * )
# Calculate the number of batches of files
num_batches=$(( (${#files[@]} + $MAX_FILES - 1) / $MAX_FILES ))
# Loop through each batch of files
for (( i = 0; i < num_batches; i++ )); do
    # Index of the first file in this batch; the array slice below
    # stops at the end of the array by itself, so no end-index
    # clamping is needed
    start=$(( i * MAX_FILES ))
    # Create a compressed archive file for the batch of files
    # (tar -czf produces a gzipped tarball, not a true zip, so
    # .tar.gz is the accurate extension)
    archive_name="${ARCHIVE_PREFIX}_${i}.tar.gz"
    tar -cvzf "$archive_name" "${files[@]:start:MAX_FILES}"
done
Upvotes: 0
Views: 55
Reputation: 19395
A script that calculates the total size of all files in the "latest batch" and assigns that block of files for compression when a preset total is reached would also work
That's rather easy, but requires lucky guessing of that preset total.
… # set needed variables as in your script
i=0 sum=0 start=0 end=0
archive()
{
    # Create a compressed archive file for the batch of files
    # (the echo below only prints the tar command so the batching
    # can be checked first; drop it to actually create the archives)
    archive_name="${ARCHIVE_PREFIX}_$((i++)).tar.gz"
    echo tar -cvzf "$archive_name" "${files[@]:start:end-start}"
    start=$end
}
# Loop through each of the files
for file in "${files[@]}"
do  (( sum += $(stat -c%s "$file") ))   # accumulate the (uncompressed) file sizes
    (( ++end ))
    if (( sum >= 1000000000 )); then    # compare with the preset total
        sum=0
        archive
    fi
done
if (( end - start )); then archive; fi  # final batch
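Note that the threshold is compared against uncompressed sizes, so whether the resulting archives actually land under 500MB depends on guessing the compression ratio correctly when choosing the preset total; that is the lucky guessing mentioned above.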
Upvotes: 0
Reputation: 112502
You would have to define "relatively efficient", but the only way to know what the compressed size will be is to do the compression. There is no "calculation" that can be done.
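As an illustration of that point, here is a minimal sketch of the compress-and-check approach, assuming Info-ZIP's zip and GNU stat are available; the MAX_BYTES name and the skip pattern are illustrative, not part of the original script. It adds one file at a time to the current archive and measures the real compressed size after each addition:
#!/bin/bash
DIR_PATH="/path"
ARCHIVE_PREFIX="archive"
MAX_BYTES=$((500 * 1024 * 1024))   # 500MB cap per archive (illustrative)
cd "$DIR_PATH" || exit 1
i=0
archive="${ARCHIVE_PREFIX}_${i}.zip"
for file in *; do
    [ -f "$file" ] || continue                                 # skip subdirectories
    case "$file" in "$ARCHIVE_PREFIX"_*.zip) continue;; esac   # skip our own output on re-runs
    zip -q "$archive" "$file"              # compress the file into the current archive
    if (( $(stat -c%s "$archive") > MAX_BYTES )); then
        zip -q -d "$archive" "$file"       # over the cap: pull the file back out
        (( i++ ))
        archive="${ARCHIVE_PREFIX}_${i}.zip"
        zip -q "$archive" "$file"          # and start a new archive with it
    fi
done
The price is that every file which crosses the cap gets compressed twice (once into the old archive, once into the new one), and a single file whose compressed size alone exceeds the cap still ends up in an oversized archive of its own; there is no way around paying for the compression to learn its result.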
Upvotes: 2