gmatteson

Reputation: 79

Linux Shell Scripting - Regex to Filter Filename with Date in it

I have thousands of files in this naming format:

cdr_ABSHCECLUSTER_02_201709072214_987392

I am using the shell script below, but I have found that it relocates files based on their modification date, not on when they were actually created. How can I modify it to pull the year and month from the file name?

Since files can be moved around, they can end up in the wrong directories when they are sorted by modification date instead of creation date.

stat only shows these timestamps: access, modify, and change.

 for dir in /sftphome/*;
 do
    echo "Entering parent directory: " $dir
    cd $dir;
             if [  -d "CDR" ]; then
                    dirpath="$(pwd)/CDR"
                    cd $dirpath

                    echo "Searching CDR directory for files " $dirpath
                    find . -maxdepth 2 -type f |
                            while read file ; do
                                    #Check to see if object is a file or directory. Only copy files.
                                    if [[ ! -d $file ]]; then
                                            year="$(date -d "$(stat -c %y "$file")" +%Y)"
                                            month="$(date -d "$(stat -c %y "$file")" +%b)"

                                            #Create the directories if they don't exist. The -p flag makes 'mkdir' create the parent directories as needed
                                            if [ ! -d "$dirpath/$year/$month" ]; then
                                                    echo "Creating directory structure $dirpath/$year/$month..."
                                                    mkdir -p "$dirpath/$year/$month";
                                                    echo "Directory $dirpath/$year/$month created."
                                            fi

                                            echo "Relocating $dirpath/$file to $dirpath/$year/$month"
                                            cp -p $file "$dirpath/$year/$month"
                                            rm -f $file
                                    fi
                            done
                            echo "Relocation of all files in $dirpath is complete."
             el

I would appreciate any insight. Thank you!

Upvotes: 0

Views: 1107

Answers (2)

Wilfredo Pomier

Reputation: 1121

This script is all you need.

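find /sftphome/*/CDR -maxdepth 2 -type f |
    while IFS= read -r file
    do
        # the 4th underscore-separated field of the file name is the timestamp, e.g. 201709072214
        date=$(basename "$file" | cut -d_ -f4)
        # the first 4 path fields give /sftphome/<dir>/CDR; append YYYY/MM taken from the timestamp
        newdir="$(cut -d/ -f-4 <<< "$file")/${date:0:4}/${date:4:2}"
        mkdir -p "$newdir"
        mv -f "$file" "$newdir"
    done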

Edit:

I just noticed the %b date format. If that is a MUST (which I wouldn't recommend as it's hard to sort), then replace the newdir=... line with:

newdir="$(cut -d/ -f-4 <<< "$file")/$(date -d "${date:0:4}-${date:4:2}-01" +%Y/%b)"
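
Since the question title asks for a regex, here is a minimal sketch of how a file name could be checked before anything is moved, assuming bash and the naming format shown in the question. The function name cdr_subdir and the exact pattern are assumptions made for illustration, not part of the script above; adjust the pattern if your names vary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "YYYY/MM" for names such as
# cdr_ABSHCECLUSTER_02_201709072214_987392; return 1 for anything else.
cdr_subdir() {
    local name=${1##*/}    # strip any leading directories
    # prefix, 12-digit timestamp (YYYY MM + 6 more digits), numeric suffix
    local re='^cdr_.+_([0-9]{4})([0-9]{2})[0-9]{6}_[0-9]+$'
    [[ $name =~ $re ]] || return 1
    local year=${BASH_REMATCH[1]} month=${BASH_REMATCH[2]}
    # reject impossible months; 10# forces base 10 so "08" and "09"
    # are not treated as invalid octal numbers
    (( 10#$month >= 1 && 10#$month <= 12 )) || return 1
    printf '%s/%s\n' "$year" "$month"
}

cdr_subdir cdr_ABSHCECLUSTER_02_201709072214_987392    # prints 2017/09
```

Inside the loop you could then use something like subdir=$(cdr_subdir "$file") || continue, so that files which do not match the pattern are left where they are.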

Upvotes: 1

markp-fuso

Reputation: 33994

Here's one method for populating the year and month variables from the date stamp in the file name ...

Start with our filename in the variable file ...

file=cdr_ABSHCECLUSTER_02_201709072214_987392

Using the underscore (_) as a delimiter, break file into separate strings and place them into an array named ar; we'll loop through the array just to show the components ...

IFS='_' read -ra ar <<< "${file}"
for i in "${!ar[@]}"
do
    echo "ar[${i}] = ${ar[${i}]}"
done

# output from for loop:

ar[0] = cdr
ar[1] = ABSHCECLUSTER
ar[2] = 02
ar[3] = 201709072214
ar[4] = 987392

We'll parse ar[3] to get our year and month values ...

year=${ar[3]:0:4}     # 4-digit year  = substring from position 0 for 4 characters
mo=${ar[3]:4:2}       # 2-digit month = substring from position 4 for 2 characters
echo "year=${year} , mo=${mo}"

# output from echo command:

year=2017 , mo=09

But your script wants month in Mmm format (date +%b), so a small adjustment ...

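# convert our 2-digit month to a 3-character 'Mon'th; pass date a full
# YYYY-MM-DD string, because GNU date reads a bare "09" as an hour (09:00 today)

month=$(date -d "${year}-${mo}-01" +%b)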

# confirm our variables:

echo "year=${year} ; month=${month}"

# output from echo command:

year=2017 ; month=Sep

At this point we've populated the year and month variables from the date stamp in the file name, and now you can continue with the rest of your script.

Putting it all together:

# once the 'file' variable is populated:

IFS='_' read -ra ar <<< "${file}"
year=${ar[3]:0:4}
mo=${ar[3]:4:2}
month=$(date -d "${year}-${mo}-01" +%b)
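
If you would rather not start a date process for every one of thousands of files, the month abbreviation could also come from a lookup array. This is only a sketch: month_abbr is a hypothetical helper name, and it always returns English abbreviations, whereas date +%b follows the current locale.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a month number ("01".."12") to a 3-letter
# English abbreviation; return 1 for anything that is not a valid month.
month_abbr() {
    local months=(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)
    local re='^[0-9]{1,2}$'
    [[ $1 =~ $re ]] || return 1
    # 10# forces base 10; without it "08" and "09" are invalid octal numbers
    local n=$(( 10#$1 ))
    (( n >= 1 && n <= 12 )) || return 1
    printf '%s\n' "${months[n-1]}"
}

month=$(month_abbr 09)
echo "month=${month}"    # prints month=Sep
```

Note also that inside the original loop, file holds a path such as ./cdr_..., so it may be safer to split "${file##*/}" rather than "${file}" in case a directory name contains an underscore.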

Upvotes: 1
