Lars

Reputation: 357

bash script to find all files where date in filename is newer than current date and time

I'm new to bash, so I'm having a bit of trouble coming up with the correct syntax for this. The gist is that I need to loop through all of the files in a certain directory and print the ones where the date and time in the filename are in the future.

Filename syntax is as follows:

2016-04-27_19EST-KST.txt

2016-04-28_02EST-MSK.txt

2016-04-28_03EST-CET.txt

2016-04-28_09EST-EST.txt

2016-04-28_10EST-CST.txt

2016-04-28_12EST-PST.txt

The hour of the day is in 24-hour (military) format, and I don't care about the trailing "EST-CST" portion.

This is what I've got so far:

#!/bin/bash

curdate=$(date +%Y-%m-%d)
curtime=$(date +%H)

for fn in *.txt;do
        [ "${curdate}_${curtime}*.txt" "<" "$fn" ] && continue
        echo "$fn"
done

This just returns all files in the directory. What am I doing wrong here?

Upvotes: 3

Views: 4699

Answers (3)

John1024

Reputation: 113814

This script will print all files whose names represent dates newer than the present:

#!/bin/bash
d="$(date +%Y-%m-%d_%HZZZ.tmp.txt)"
touch "$d"
printf "%s\n" *.txt | LC_ALL=C sort | awk -v d="$d" 'f{print} d==$0{f=1}'
rm "$d"

How it works

  • d="$(date +%Y-%m-%d_%HZZZ.tmp.txt)"

    This creates a name that matches the current time.

  • touch "$d"

    This creates a file with name matching the current time.

  • printf "%s\n" *.txt

    This prints all *.txt filenames, one per line, to stdout.

    (Your files all have sensible names. If any were to have newlines in their names, then we would want to replace this newline-separated list with a null-separated list and make appropriate changes to the code below.)

  • LC_ALL=C sort

    This sorts the file names into order. I specify LC_ALL=C so that names are compared byte by byte and the order is what one expects, regardless of the user's locale.

    At this point, note the ZZZ that we used in the temporary file name. This ensures that the temporary file sorts after any other files with the current hour, because the real names continue with EST after the hour and E sorts before Z. If we wanted to include files with the current hour in the output, we might use AAA instead.

  • awk -v d="$d" 'f{print} d==$0{f=1}'

    This prints every file name that sorts after our temporary file: the flag f is set once a line equals d, and each later line is printed.

  • rm "$d"

    This deletes the no-longer-needed temporary file.
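
To see how the ZZZ marker behaves without touching the directory, the same sort-and-awk pipeline can be fed a list of names on stdin. This is only a sketch: the function name newer_than_marker and the fixed marker are made up for illustration, and the names are assumed to contain no newlines.

```shell
#!/bin/bash
# newer_than_marker MARKER: read file names on stdin (one per line) and
# print those that sort after MARKER. Hypothetical helper, not part of
# the script above; it adds the marker to the list instead of creating a file.
newer_than_marker() {
    local marker=$1
    { cat; printf '%s\n' "$marker"; } |
        LC_ALL=C sort |
        awk -v d="$marker" 'f{print} d==$0{f=1}'
}

# Fixed marker standing in for "$(date +%Y-%m-%d_%HZZZ.tmp.txt)".
printf '%s\n' 2016-04-28_02EST-MSK.txt 2016-04-28_09EST-EST.txt 2016-04-28_10EST-CST.txt |
    newer_than_marker 2016-04-28_09ZZZ.tmp.txt
```

With the marker set to hour 09, only 2016-04-28_10EST-CST.txt is printed. Swapping ZZZ for AAA in the marker would also print 2016-04-28_09EST-EST.txt.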

Improvement

If the above script were interrupted while running, for example by pressing Ctrl-C, it would leave a stray file in the directory. To make sure that the file is always removed, we can use a trap:

#!/bin/bash
d="$(date +%Y-%m-%d_%HZZZ.tmp.txt)"
trap 'rm "$d"' EXIT
touch "$d"
printf "%s\n" *.txt | LC_ALL=C sort | awk -v d="$d" 'f{print} d==$0{f=1}'

Upvotes: 2

drewyupdrew

Reputation: 1609

At some point you might need to parse the date using awk or sed or something. You can try something like this:

#!/bin/bash

curdate=$(date +%Y-%m-%d)
curtime=$(date +%H)

for fn in *.txt;do
    # Strip everything from "EST" through the ".txt" extension.
    date=$(echo "$fn" | sed 's/EST.*\.txt$//')
    if [[ "${curdate}_${curtime}" < "$date" ]]; then
        echo "$fn"
    fi
done

NOTE: Please keep in mind that the sed command is quite specific. It will only work if the filenames always follow the format you posted. You can adjust it to suit your needs, though.
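
To check the comparison against the sample names from the question, the loop body can be wrapped in a function that takes the current stamp as an argument. This is a rough sketch: strip_suffix, is_future, and the fixed stamp are illustrative names and values, and the sed expression assumes every name contains EST and ends in .txt.

```shell
#!/bin/bash
# strip_suffix FILE: "2016-04-28_09EST-EST.txt" -> "2016-04-28_09"
strip_suffix() {
    printf '%s\n' "$1" | sed 's/EST.*\.txt$//'
}

# is_future NOW FILE: succeed if FILE's stamp sorts strictly after NOW.
is_future() {
    local now=$1 stamp
    stamp=$(strip_suffix "$2")
    [[ "$now" < "$stamp" ]]
}

now=2016-04-28_09    # fixed stand-in for "$(date +%Y-%m-%d_%H)"
for fn in 2016-04-27_19EST-KST.txt 2016-04-28_09EST-EST.txt 2016-04-28_10EST-CST.txt; do
    if is_future "$now" "$fn"; then
        echo "$fn"
    fi
done
```

Only 2016-04-28_10EST-CST.txt is printed. The file for the current hour is skipped because the comparison is strict.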

Upvotes: 1

jil

Reputation: 2691

Your solution is close, but it is better to compare the dates numerically and to normalize the date representation using date, e.g.:

#!/bin/bash
curdate=$(date +%Y%m%d%H)

for fn in *.txt; do
    tmp=${fn%%EST*}  # remove timezone info, ignored
    fdate=$(date -d "${tmp/_/ }" +%Y%m%d%H)
    (( curdate < fdate )) && echo "$fn"
done
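
Parsing a free-form date string with date -d is a GNU extension. Where it is not available, the same number can be built with bash parameter expansion alone, as in this sketch. The helper name stamp_to_number and the fixed curdate are assumptions for illustration, and the names are assumed to start with YYYY-MM-DD_HH followed by EST.

```shell
#!/bin/bash
# stamp_to_number FILE: "2016-04-28_09EST-EST.txt" -> 2016042809
stamp_to_number() {
    local tmp=${1%%EST*}              # drop "EST-XXX.txt"
    printf '%s\n' "${tmp//[-_]/}"     # drop every "-" and "_"
}

curdate=2016042809    # fixed stand-in for "$(date +%Y%m%d%H)"
for fn in 2016-04-27_19EST-KST.txt 2016-04-28_09EST-EST.txt 2016-04-28_12EST-PST.txt; do
    fdate=$(stamp_to_number "$fn")
    if (( curdate < fdate )); then
        echo "$fn"
    fi
done
```

With the fixed curdate above, only 2016-04-28_12EST-PST.txt is printed. Unlike the date -d version, this does not validate that the stamp is a real date.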

Upvotes: 2
