zain ul abidin

Reputation: 195

How to set date range in shell script

I am writing a shell script to load data for a specific date range, but the loop does not stop at the date I want and instead runs past it. Below is my shell script.

j=20180329
while [ $j -le 20180404 ]
do

My problem is that after 20180331 the loop keeps counting up to 20180399 before it reaches 20180401. I want it to go straight from 20180331 to 20180401, not through 20180332 and so on.

Upvotes: 2

Views: 2526

Answers (2)

F. Hauri  - Give Up GitHub

Reputation: 70752

One simple question, 3+ not-so-short answers...

Since your question is about shell in general:

1. Compatible answer first

j=20180329
while [ "$j" != "20180405" ] ;do
    echo $j
    j=`date -d "$j +1 day" +%Y%m%d`
done

Note: the stop value is one day after the last wanted date, because the while condition tests for equality.

Note 2: mind the timezone and set TZ=UTC (see the timezone issue further down).

Of course, comparing YYYYMMDD dates as integers works too:

j=20180329
while [ $j -le 20180404 ] ;do
    echo $j
    j=`TZ=UTC date -d "$j +1 day" +%Y%m%d`
done

But I don't like this, because it breaks as soon as the date format changes.

Tested under dash and busybox, using date (GNU coreutils) 8.26.
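The first loop above needs the day after the last wanted date as its stop value. Here is a minimal sketch that computes it and rejects a reversed range, which the equality test would otherwise never leave. The function name days_inclusive is hypothetical, and GNU date is assumed for -d.

```shell
#!/bin/sh
# days_inclusive START END: print every day from START to END (YYYYMMDD).
# Hypothetical helper name; GNU date is assumed for the -d option.
days_inclusive() {
    # Normalize both arguments; date fails here on an invalid date.
    j=$(TZ=UTC date -d "$1" +%Y%m%d) || return 1
    end=$(TZ=UTC date -d "$2" +%Y%m%d) || return 1
    # With a start after the end, the != test below would never become false.
    [ "$j" -le "$end" ] || return 1
    # The loop stops on equality, so the stop value is the day after END.
    stop=$(TZ=UTC date -d "$end +1 day" +%Y%m%d) || return 1
    while [ "$j" != "$stop" ]; do
        echo "$j"
        j=$(TZ=UTC date -d "$j +1 day" +%Y%m%d) || return 1
    done
}
```

Calling days_inclusive 20180330 20180402 should print the four days from 20180330 to 20180402, one per line.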

1.2 Minimize forks under POSIX shell

Before using bashisms, here is a way of doing this under any POSIX shell.

The idea is to run a single date process in the background as a simple converter, talk to it through two FIFOs, and test its answers in the loop (stdbuf and date -f come from GNU coreutils):

#!/usr/bin/env sh

tempdir=$(mktemp -d)
datein="$tempdir/datein"
dateout="$tempdir/dateout"
mkfifo "$datein" "$dateout"

exec 5<>"$datein"
exec 6<>"$dateout"
stdbuf -i0 -o0 date -f - <"$datein" >"$dateout" +'%Y%m%d' &
datepid=$!

echo "$2" >&5
read -r end <&6
echo "$1" >&5
read -r crt <&6

while [ "$crt" -le "$end" ];do
    echo $crt
    echo "$crt +1 day" >&5
    read -r crt <&6
done

exec 5>&-
exec 6<&-
kill "$datepid"
rm -fR "$tempdir"

Then

daterange.sh 20180329 20180404
20180329
20180330
20180331
20180401
20180402
20180403
20180404

2. date via printf

Under bash, you could use so-called bashisms:

Convert the dates to integer epoch seconds (Unix time), handling both dates with a single fork:

{
    read start;
    read end
} < <(date -f - +%s <<eof
20180329
20180404
eof
)

or

start=20180329
end=20180404
{ read start;read end;} < <(date -f - +%s <<<$start$'\n'$end)

Then use the builtin printf command (note: there are $((24*60*60)) = 86400 seconds in a regular day):

for (( i=start ; i<=end ; i+=86400 )) ;do
    printf "%(%Y%m%d)T\n" $i
done

3. Timezone issue!!

Warning: there is an issue around the switch between summer and winter time.

To show it, here is the loop wrapped in a function:

dayRange() { 
    local dR_Start dR_End dR_Crt
    { 
        read dR_Start
        read dR_End
    } < <(date -f - +%s <<<${1:-yesterday}$'\n'${2:-tomorrow})
    for ((dR_Crt=dR_Start ; dR_Crt<=dR_End ; dR_Crt+=86400 )) ;do
        printf "%(%Y%m%d)T\n" $dR_Crt
    done
}

Showing the issue:

TZ=CET dayRange 20181026 20181030
20181026
20181027
20181028
20181028
20181029

Replacing printf "%(%Y%m%d)T\n" $dR_Crt with printf "%(%Y%m%dT%H%M)T\n" $dR_Crt shows what happens:

20181026T0000
20181027T0000
20181028T0000
20181028T2300
20181029T2300
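The repeated 20181028 comes from a local day that lasts 25 hours, so adding 86400 seconds does not reach the next midnight. Here is a small sketch that measures the length of a local day, assuming GNU date. The helper name day_length is hypothetical, and the zone is written as a POSIX TZ string for Central European rules so the example does not depend on an installed time zone database.

```shell
#!/bin/sh
# day_length YYYYMMDD [TZ]: seconds between local midnight of that day and
# local midnight of the following day. Hypothetical helper; GNU date assumed.
day_length() {
    zone=${2:-UTC}
    s=$(TZ=$zone date -d "$1" +%s) || return 1
    e=$(TZ=$zone date -d "$1 +1 day" +%s) || return 1
    echo $((e - s))
}

# POSIX TZ string spelling out Central European daylight saving rules.
cet='CET-1CEST,M3.5.0,M10.5.0/3'

day_length 20181027 "$cet"   # an ordinary day
day_length 20181028 "$cet"   # clocks go back: one extra hour
day_length 20181028 UTC      # UTC has no daylight saving time
```

Only the middle call should differ from 86400, which is why the fixed step is safe only under UTC.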

To avoid this issue, you just have to set TZ=UTC locally at the beginning of the function:

    local dR_Start dR_End dR_Crt TZ=UTC

Final step for the function: avoiding useless forks

To improve performance, I try to reduce forks by avoiding syntax like:

    for day in $(dayRange 20180329 20180404);do ...
    # or
    mapfile range < <(dayRange 20180329 20180404)

Instead, I let the function directly set a variable whose name is passed as an argument.

Here is my proposal:

dayRange() { # <start> <end> <result varname>
    local dR_Start dR_End dR_Crt dR_Day TZ=UTC
    declare -a dR_Var='()'
    { 
        read dR_Start
        read dR_End
    } < <(date -f - +%s <<<${1:-yesterday}$'\n'${2:-tomorrow})
    for ((dR_Crt=dR_Start ; dR_Crt<=dR_End ; dR_Crt+=86400 )) ;do
        printf -v dR_Day "%(%Y%m%d)T\n" $dR_Crt
        dR_Var+=($dR_Day)
    done
    printf -v ${3:-dRange} "%s" "${dR_Var[*]}"
}

Then a quick little bug test:

TZ=CET dayRange 20181026 20181030 bugTest
printf "%s\n" $bugTest 
20181026
20181027
20181028
20181029
20181030

Seems fine. This could be used like:

dayRange 20180329 20180405 myrange
for day in $myrange ;do
    echo "Doing something with string: '$day'."
done

2.2 Alternative using shell-connector

There is a shell function for running a command in the background and talking to it, in order to reduce forks.

wget https://f-hauri.ch/vrac/shell_connector.sh
. shell_connector.sh

Start a background date +%Y%m%d and test it: @0 must answer 19700101

newConnector /bin/date '-f - +%Y%m%d' @0 19700101

Then

j=20190329
while [ $j -le 20190404 ] ;do
    echo $j; myDate "$j +1 day" j
done

3.++ Little benchmark

Let's try a little 3-year range:

j=20160329
time while [ $j -le 20190328 ] ;do
    echo $j;j=`TZ=UTC date -d "$j +1 day" +%Y%m%d`
done | wc
1095    1095    9855

real    0m1.887s
user    0m0.076s
sys     0m0.208s

More than 1 second on my system... Of course, there are 1095 forks, one per day!

time { dayRange 20160329 20190328 foo && printf "%s\n" $foo | wc ;}
1095    1095    9855

real    0m0.061s
user    0m0.024s
sys     0m0.012s

Only 1 fork for date, then bash builtins: less than 0.1 seconds...

And with newConnector function:

j=20160329
time while [ $j -le 20190328 ] ;do echo $j
    myDate "$j +1 day" j
  done | wc
   1095    1095    9855

real    0m0.109s
user    0m0.084s
sys     0m0.008s

Not as quick as using builtin integer arithmetic, but very quick anyway.

Upvotes: 2

KamilCuk

Reputation: 140960

Store the start and end dates as seconds since the epoch. Don't use formatted dates: they are ambiguous (GMT? UTC? etc.). Then increment your variable by the number of seconds in a day, i.e. 24 * 60 * 60 seconds. In your loop, you can convert the seconds since the epoch back to a human-readable date using date --date=@<number>. The following works with a POSIX shell and GNU's date utility:

from=$(date --date='2018/04/04 00:00:00' +%s)
until=$(date --date='2018/04/07 00:00:00' +%s)

counter="$from"
while [ "$counter" -le "$until" ]; do
    j=$(date --date=@"$counter" +%Y%m%d)

    # do somth with j
    echo $j

    counter=$((counter + 24 * 60 * 60))
done
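As the other answer shows, a fixed step of 24 * 60 * 60 seconds can repeat or skip a day when the local timezone switches between summer and winter time. Here is a sketch of the same loop pinned to UTC, wrapped in a function whose name utc_days is hypothetical. GNU date is assumed.

```shell
#!/bin/sh
# utc_days FROM LAST: print every day from FROM to LAST as YYYYMMDD.
# Hypothetical wrapper around the loop above; all date calls run under UTC,
# where every day is exactly 86400 seconds long.
utc_days() {
    from=$(TZ=UTC date --date="$1" +%s) || return 1
    last=$(TZ=UTC date --date="$2" +%s) || return 1
    counter=$from
    while [ "$counter" -le "$last" ]; do
        TZ=UTC date --date="@$counter" +%Y%m%d
        counter=$((counter + 24 * 60 * 60))
    done
}
```

For example, utc_days '2018/10/26' '2018/10/30' should list each of the five days once, including the day on which Central European clocks go back.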

GNU's date is a little strange when parsing its --date=STRING argument. I suggest always feeding it a string in %Y/%m/%d %H:%M:%S form, so that it always knows how to parse it.
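Since the question uses YYYYMMDD values, here is a small sketch that checks such a value and rewrites it in the explicit form used above, using only POSIX parameter expansion. The helper name to_date_arg is hypothetical. It checks the shape of the input only, not whether the date exists.

```shell
#!/bin/sh
# to_date_arg YYYYMMDD: print "YYYY/MM/DD 00:00:00", or fail if the input
# is not exactly eight digits. Hypothetical helper, no external commands.
to_date_arg() {
    case $1 in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) return 1 ;;
    esac
    rest=${1#????}    # drop the year, leaving MMDD
    echo "${1%????}/${rest%??}/${rest#??} 00:00:00"
}
```

The result can then be passed on, for example as from=$(date --date="$(to_date_arg 20180404)" +%s).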

Upvotes: 0
