Felipe

Reputation: 7583

How to convert column with millisecond timestamp to date in a file using bash?

I am trying to convert every value in the first column of a file from a millisecond timestamp to a date in the format %Y-%m-%d %H:%M:%S. Instead, the output looks like Di 26. Mai 15:04:00 CEST 2020000. I am using this command:

cat throughput-vs-latency-40K-16.csv | sed 's/^/echo "/; s/\([0-9]\{10\}\)/`date -d @\1`/; s/$/"/' | bash

and this is the file

"Time","pre_aggregate[0]-IN","pre_aggregate[10]-IN",
1590491460000,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1590491475000,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1590491490000,0,0,0,0,0,0,0,0,
1590491505000,290.51666666666665,290.53333333333336,
1590491535000,1027.15,1027.15,1028.85,1028.8666666666666,
1590491550000,1394.1166666666666,1394.15,1394.15,1394.1333333333334,
1590491565000,1475.5333333333333,1473.3666666666666,

I would like to get the output below (please consider only the format and ignore whether the converted times are correct):

"Time","pre_aggregate[0]-IN","pre_aggregate[10]-IN",
"2020-05-07 08:05:45",0,0,0,0,0,0,0,0,
"2020-05-07 08:06:45",290.51666666666665,290.53333333333336,
"2020-05-07 08:07:45",1027.15,1027.15,1028.85,1028.8666666666666,
"2020-05-07 08:05:45",1394.1166666666666,1394.15,1394.15,1394.1333333333334,
"2020-05-07 08:08:45",1475.5333333333333,1473.3666666666666,

What am I missing in the sed command to have a date in this format 2020-05-07 08:09:45?

Upvotes: 2

Views: 1361

Answers (3)

KamilCuk

Reputation: 141473

With awk and strftime (a GNU awk extension, not part of POSIX awk), replace the first field with the formatted time:

awk 'BEGIN{ FS=OFS=","; }  NR>1{ $1=strftime("\"%Y-%m-%d %H:%M:%S\"", $1/1000) } 1'

What am I missing in the sed command to have a date in this format 2020-05-07 08:09:45?

It is not practical to do this in plain sed. It is theoretically possible to do arithmetic in sed, but a script that handles values as large as 1590491490000 would be very, very long. sed is meant for simple regex replacement; it cannot compute or convert values. Use other tools for such jobs.

sed 's/^/echo "/; s/\([0-9]\{10\}\)/`date -d @\1`/;

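sed doesn't treat backticks ` as command substitution the way the shell does; backticks are taken literally in sed. In your command they only take effect because the output is piped to bash, and the result is still wrong because \{10\} leaves the trailing 000 in place and date is given no output format. GNU sed has an extension that executes the pattern space as a shell command after the substitution when you add the e flag to the end of the s command. Because integer division by 1000 is equivalent to removing the last 3 digits, in GNU sed you could do: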

sed '1!s/^\([0-9]*\)[0-9]\{3\},/date -d@\1 +\\""%Y-%m-%d %T\\"",/e'
#                                                                ^ execute the expression
#                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ gets executed
#                   ^^^^^^^^^^ effectively divides by 1000
#    ^^ execute for all except first line

Using awk with strftime will be orders of magnitude faster, because it does not start a separate date process for every line.
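The "drop the last three digits" trick is not specific to sed. As a minimal sketch, assuming GNU date (for -d @N) and an all-digit millisecond value of at least four digits, the same idea can be written as a small shell function. The name ms_to_date is hypothetical, not something from the question:

```shell
#!/bin/sh
# ms_to_date (hypothetical helper): print a millisecond epoch timestamp
# as "YYYY-mm-dd HH:MM:SS" in the current time zone.
# Assumes GNU date, which accepts "-d @SECONDS".
ms_to_date() {
    ms=$1
    secs=${ms%???}    # strip the last three digits = integer division by 1000
    date -d "@$secs" '+%Y-%m-%d %H:%M:%S'
}
```

With TZ=UTC set in the environment, ms_to_date 1590491490000 prints 2020-05-26 11:11:30; other time zones shift the result accordingly.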

Upvotes: 3

anubhava

Reputation: 785471

You may use this awk:

awk 'BEGIN {
   FS=OFS=","
}
NR > 1 {
   n = $1/1000
   cmd = "date -d @" n " +\"%Y-%m-%d %T\""
   $1 = "\"" ( (cmd | getline out) > 0 ? out : $1 ) "\""
   close(cmd)
} 1' file

"Time","pre_aggregate[0]-IN","pre_aggregate[10]-IN",
"2020-05-26 07:11:00",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
"2020-05-26 07:11:15",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
"2020-05-26 07:11:30",0,0,0,0,0,0,0,0,
"2020-05-26 07:11:45",290.51666666666665,290.53333333333336,
"2020-05-26 07:12:15",1027.15,1027.15,1028.85,1028.8666666666666,
"2020-05-26 07:12:30",1394.1166666666666,1394.15,1394.15,1394.1333333333334,
"2020-05-26 07:12:45",1475.5333333333333,1473.3666666666666,

n = $1/1000 converts the millisecond value to seconds before the date command is called to format it as Y-m-d H:M:S. The time shown depends on the local time zone of the machine running the command.
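For comparison, the same per-line date call can be sketched as a plain shell loop. This assumes GNU date and that every data line starts with an all-digit millisecond value followed by a comma; convert_csv is a hypothetical name. Like the awk version above, it starts one date process per line, so it is only reasonable for small files:

```shell
#!/bin/sh
# convert_csv (hypothetical helper): read CSV on stdin, copy the header
# line unchanged, and replace the first field of every other line with a
# quoted "YYYY-mm-dd HH:MM:SS" timestamp.
# Assumes GNU date and a millisecond epoch value in the first column.
convert_csv() {
    # Pass the header line through untouched.
    IFS= read -r header && printf '%s\n' "$header"
    while IFS= read -r line; do
        ms=${line%%,*}      # first field: everything before the first comma
        rest=${line#*,}     # remaining fields: everything after the first comma
        ts=$(date -d "@${ms%???}" '+%Y-%m-%d %H:%M:%S')   # ms -> s, then format
        printf '"%s",%s\n' "$ts" "$rest"
    done
}
```

It would be called as convert_csv < throughput-vs-latency-40K-16.csv.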

Upvotes: 3

RavinderSingh13

Reputation: 133600

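Could you please try the following in pure awk. Note that %D prints the date as mm/dd/yy, so after the slashes are replaced the first column looks like mm-dd-yy HH:MM:SS.000.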

awk '
BEGIN{
  FS=OFS=","
}
FNR==1{
  print
  next
}
{
  $1 = strftime("%D %T.000",substr($1,1,10))
  gsub("/","-",$1)
}
1
'  Input_file

2nd solution: this variant wraps the date column in double quotes.

awk '
BEGIN{
  FS=OFS=","
}
FNR==1{
  print
  next
}
{
  $1 = strftime("\"%D %T.000\"",substr($1,1,10))
  gsub("/","-",$1)
}
1
' Input_file

Upvotes: 1
