Rocky

Reputation: 129

How to get file names and file size from tar files using shell script

I am working on file validation in a shell script. I need to read the file names and file sizes from a tar file, then verify that every file is larger than a given minimum size and that the tar file contains every file in a list of mandatory files.
Below is the code I have written. Is there a way to check the file name and file size together in a single loop?

Any input is appreciated. Thanks in advance.

#!/bin/bash
minimumsize=90000
mandatoryFiles=(party.dat test1.dat test2.dat) 


ALL_FILE_NAMES=`tar -tvf testData/test_daily.tgz | awk '{print $6}' `  #get file names in tar file
ALL_FILE_SIZES=`tar -tvf testData/test_daily.tgz | awk '{print $3}' `  #get file sizes in bytes

echo "File names :::::::::::"$ALL_FILE_NAMES
echo "File sizes :::::::::::"$ALL_FILE_SIZES



#condition to check file size is greater than minimum size
for actualsize in $ALL_FILE_SIZES; do

   if [ $actualsize -ge $minimumsize ]; then
          echo size is over $minimumsize bytes
   else
         echo size is under $minimumsize bytes
         exit 0
   fi
done


#condition to check all the mandatory files are included in the tar file.
for afile in $ALL_FILE_NAMES; do

    if [[ ${mandatoryFiles[*]} =~ $afile ]]; then
        echo $afile is present
    else   
        echo $afile not present so exiting the script
            exit 0
    fi 

done

Upvotes: 0

Views: 853

Answers (1)

David C. Rankin

Reputation: 84662

Your approach is a bit awkward. You only need to capture the filenames so you can compare them against the mandatory files, but you cannot do the check the way you are doing it: any additional file in the archive (beyond the mandatory files) will cause your test to fail.

A cleaner approach is to use process substitution to feed the size and filename to a loop allowing you to test each of the file sizes (any file less than minimumsize will cause the archive to fail), while you fill your array of all_names. You are done with the read loop at that point.
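
To illustrate why process substitution is used rather than a plain pipeline, here is a minimal sketch. The helper names (list_entries, count_with_pipe, count_with_procsub) and the sample listing are made up for illustration; list_entries stands in for the tar | awk command. With bash's default settings, the loop on the right side of a pipeline runs in a subshell, so an array filled there is lost once the loop ends.

```shell
#!/bin/bash
# Sketch only: all function names and sample data below are hypothetical.

# Stand-in for: tar -tzvf "$fname" | awk '{print $3, $6}'
list_entries() {
    printf '%s\n' '95000 data/party.dat' '120000 data/test1.dat'
}

# Pipeline version: the while loop runs in a subshell (default bash settings),
# so the additions to names are not visible after the loop.
count_with_pipe() {
    local -a names=()
    local size name
    list_entries | while read -r size name; do
        names+=( "${name##*/}" )
    done
    echo "${#names[@]}"
}

# Process substitution version: the while loop runs in the current shell,
# so names keeps the entries added inside the loop.
count_with_procsub() {
    local -a names=()
    local size name
    while read -r size name; do
        names+=( "${name##*/}" )
    done < <(list_entries)
    echo "${#names[@]}"
}

echo "pipe: $(count_with_pipe)"                      # pipe: 0
echo "process substitution: $(count_with_procsub)"   # process substitution: 2
```

The same reasoning applies to any variable set inside the loop, which is why the read loop takes its input from `< <(...)`.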

A final loop over all_names checking if they exist in mandatoryFiles and incrementing a counter will allow you to check if there was a match against each of the mandatoryFiles.

One approach would be:

#!/bin/bash

fname="${1:-testData/test_daily.tgz}"           ## filename to read
minimumsize=90000                               ## min size
mandatoryFiles=(party.dat test1.dat test2.dat)  ## mandatory files
declare -a all_names                            ## all_names array
declare -i mandatory_count=0;                   ## mandatory count

while read -r size name; do         ## read/compare sizes, fill array

    all_names+=( "${name##*/}" );   ## store each file name in array w/o path

    # check that the file size is at least the minimum size
    if [ "$size" -ge "$minimumsize" ]; then
        echo "$size is at least $minimumsize bytes"
    else
        echo "$size is under $minimumsize bytes"
        exit 1    ## non-zero status signals the failed validation
    fi

done < <(tar -tzvf "$fname" | awk '{print $3, $6}')

#condition to check all the mandatory files are included in the tar file.
for afile in "${all_names[@]}"; do

    if [[ ${mandatoryFiles[@]} =~ "$afile" ]]; then
        ((mandatory_count++))   ## increment mandatory_count
    fi 

done

## test if mandatory_count less than number of mandatory files
if [ "$mandatory_count" -lt "${#mandatoryFiles[@]}" ]; then
    echo "mandatoryFiles not present - exiting"
    exit 1
fi

echo "all files good"

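(note: for a .tgz (gzip-compressed tar archive), the 'z' option is added above to decompress it explicitly. Recent versions of GNU tar detect the compression automatically when reading an archive, so the listing often works without it, but passing 'z' for a gzip-compressed archive does no harm.)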
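
One caveat with the counter: the =~ test is a substring match against the joined list, and the count goes up once per archive entry. If the same base name appears twice (for example a/test1.dat and b/test1.dat), the count can reach the number of mandatory files even though one of them is missing. A stricter variant, sketched below, loops over mandatoryFiles instead and compares each name exactly. The helper contains_exact, the missing array, and the sample all_names contents are assumptions made for illustration, not part of the script above.

```shell
#!/bin/bash
# Sketch only: contains_exact and the sample data are hypothetical.

mandatoryFiles=(party.dat test1.dat test2.dat)   # mandatory files, as above
all_names=(party.dat test1.dat test1.dat)        # sample archive contents: test2.dat is absent

# Succeeds only if "$1" is exactly equal to one of the remaining arguments.
contains_exact() {
    local wanted=$1 candidate
    shift
    for candidate in "$@"; do
        [[ $candidate == "$wanted" ]] && return 0   # quoted right side: literal comparison
    done
    return 1
}

# Collect every mandatory file that has no exact match in the archive.
missing=()
for required in "${mandatoryFiles[@]}"; do
    contains_exact "$required" "${all_names[@]}" || missing+=( "$required" )
done

if (( ${#missing[@]} > 0 )); then
    echo "missing mandatory files: ${missing[*]}"   # missing mandatory files: test2.dat
else
    echo "all mandatory files present"
fi
```

In a real script the failing branch would end with exit 1. Listing the missing names also makes the error message more useful than a bare count.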

Look things over and let me know if you have further questions.

Upvotes: 2
