jkids
jkids

Reputation: 3

Editing datetime in csv to adjust time zone

We routinely get csv files containing datetime values in certain columns that are always GMT.
We are looking for a way to change the datetime values from GMT to a time zone of our choice - the desired target time zone may differ from one csv to the next. The adjustment would need to account for DST also.

Datetime Format from csv:

YYYY/MM/DD HH:MM:SS

Sample data:

col1,col2,col3,col4
aaa,bbb,2016/01/15 22:01:16,ccc
ddd,eee,,fff
hhh,iii,2014/09/19 00:53:37,jjj
kkk,lll,2015/11/15 22:01:16,mmm
nnn,ooo,2015/10/12 19:54:59,ppp

For example, if we want to adjust the sample data above from GMT to Pacific time (GMT -8 standard), our desired result would be a csv with values as follows:

col1,col2,col3,col4
aaa,bbb,2016/01/15 14:01:16,ccc
ddd,eee,,fff
hhh,iii,2014/09/18 17:53:37,jjj
kkk,lll,2015/11/15 14:01:16,mmm
nnn,ooo,2015/10/12 12:54:59,ppp

Note re DST: for the 5 lines of sample data above, DST was active for dates on lines 3 and 5 only. The adjustment may be different from line to line, but is consistent in terms of the target time zone equivalent (Pacific).

The datetime format itself can stay as is - we need only to adjust the datetime value for time zone and ultimately store the equivalent datetime in Eastern, Pacific, (or any other time zone) instead of GMT.

If possible we would like to leverage editing tools available natively in Ubuntu like awk as we already have cleanup routines that utilize it. Will consider other solutions if an awk or similar solution isn't possible.

Other notes:

Any insight is appreciated - thanks!

EDIT:

In researching this, I've found that a statement like this:

echo "1/15/2016  10:01:16 GMT" | awk -v q='"' '{cmd="TZ=America/Los_Angeles date -d"q$0 q" +"q"%F %H:%M:%S %Z"q; cmd|getline x; close(cmd);print x}'

...is somewhat a proof of concept, but 1) I had to add "GMT" to the string myself, and 2) the output date format is slightly different.
I'm hoping to find a solution I can apply to csv's that may have thousands of rows.

Upvotes: 0

Views: 2298

Answers (2)

Ed Morton
Ed Morton

Reputation: 203368

With GNU awk for time functions:

$ cat tst.awk
function dt2utcSecs(dateTime,   cmd,line,ret) {
    cmd = "TZ=UTC gawk -v dt='" dateTime "' 'BEGIN{print mktime(dt)}'"
    ret = ( (cmd | getline line) > 0 ? line : -1 )
    close(cmd)
    return ret
}
BEGIN{
    FS=OFS=","
    split(cols,f)
}
{
    for (i in f) {
        dateTime = gensub(/[\/:]/," ","g",$(f[i]))
        utcSecs = dt2utcSecs(dateTime)
        if (utcSecs >= 0) {
            $(f[i]) = strftime("%Y/%m/%d %T",utcSecs)
        }
    }
    print
}

$ TZ='US/Pacific' gawk -v cols=3 -f tst.awk file
col1,col2,col3,col4
aaa,bbb,2016/01/15 14:01:16,ccc
ddd,eee,,fff
hhh,iii,2014/09/18 17:53:37,jjj
kkk,lll,2015/11/15 14:01:16,mmm
nnn,ooo,2015/10/12 12:54:59,ppp

Just set cols=3,5,9 to do the conversion on those fields. See /usr/share/zoneinfo for a list of valid time zones.

Note that @webb's answer should be more efficient than the above since the above calls shell to call gawk once per input dateTime field while @webbs just calls gawk twice.

Upvotes: 1

webb
webb

Reputation: 4340

here's an amusing way:

first, convert date strings to numeric, utc timestamps, then, convert the numeric timestamps to local date strings:

TZ=UTC awk -F, '
  BEGIN{OFS=","}
  { if(NR>1&&$3){
      gsub("[/:]"," ",$3);
      $3=mktime($3" GMT")};
    print $0
  }' infile.csv | awk  -F, '
  BEGIN{OFS=","}
  { if(NR>1&&$3){
      $3=strftime("%Y/%m/%d %H:%M:%S %Z", $3, 0)};
    print $0
  }' > outfile.csv

output:

col1,col2,col3,col4
aaa,bbb,2016/01/15 14:01:16 PST,ccc
ddd,eee,,fff
hhh,iii,2014/09/18 17:53:37 PDT,jjj
kkk,lll,2015/11/15 14:01:16 PST,mmm
nnn,ooo,2015/10/12 12:54:59 PDT,ppp

note 1: you can remove the timezone from the output by removing the %Z in the second awk, but if you do that, you're only hurting your future self.

note 2: this may or may not work depending on your version of awk. if your system has gawk, try that in place of awk. if not, it should be easy to install gawk.

Upvotes: 1

Related Questions