Reputation: 29
I have a file containing this:
2014-11-10 02:00:03,LOGIN SUCCESS,AUTH,user2,192.168.203.63,10.146.124.73,59996,22
2014-11-10 02:00:07,LOGIN SUCCESS,AUTH,user1,172.24.31.10,172.32.1.1,48191,22
2014-11-10 02:00:11,LOGIN FAILED,AUTH,root,172.24.166.153,10.146.124.73,52506,22
I want to convert the timestamp to a date in this format:
1415602803,LOGIN SUCCESS,AUTH,user2,192.168.203.63,10.146.124.73,59996,22
1415602807,LOGIN SUCCESS,AUTH,user1,172.24.31.10,172.32.1.1,48191,22
1415602811,LOGIN FAILED,AUTH,root,172.24.166.153,10.146.124.73,52506,22
How can I achieve this using awk or any other bash tool.
Background: I need to grab files from hdfs for a particular time range. Initially I was using hdfs dfs -stat to bring epoch time and then programming a filter
cat <fileWithFilePaths> | xargs -I ^ -P 15 sh -c 'printf "%s\n" "$(hdfs dfs -stat "%n,%Y,%b" ^)"'>output1
awk 'BEGIN{beginDE=ENVIRON["beginDataEpoch"]; endDE=ENVIRON["endDataEpoch"]; FS="@"} {if ($1 >= beginDE && $1 <= endDE) print $0}' output1
But later realized its using hdfs stat is expensive function. So now I am using hdfs dfs -ls to bring the file time but its in yyyy-mm-dd hh:mm format, thus I am seeking help in regards to change this in epoch time for calculations down the line.
Upvotes: 0
Views: 1910
Reputation: 51
I had the same issue, when I needed to convert the date to unix time in some historical stock data. Used this sed
and date
command. Please note, that this only works on Linux (GNU implementation of sed
and date
).
simon@debian:~/Downloads$ cat inputFile
2014-11-10 02:00:03,LOGIN SUCCESS,AUTH,user2,192.168.203.63,10.146.124.73,59996,22
2014-11-10 02:00:07,LOGIN SUCCESS,AUTH,user1,172.24.31.10,172.32.1.1,48191,22
2014-11-10 02:00:11,LOGIN FAILED,AUTH,root,172.24.166.153,10.146.124.73,52506,22
simon@debian:~/Downloads$
simon@debian:~/Downloads$
simon@debian:~/Downloads$
simon@debian:~/Downloads$ sed 's/20[0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]/echo `date --date="&" +"%s"`/e' inputFile > output
simon@debian:~/Downloads$
simon@debian:~/Downloads$
simon@debian:~/Downloads$
simon@debian:~/Downloads$ cat output
1415602803,LOGIN SUCCESS,AUTH,user2,192.168.203.63,10.146.124.73,59996,22
1415602807,LOGIN SUCCESS,AUTH,user1,172.24.31.10,172.32.1.1,48191,22
1415602811,LOGIN FAILED,AUTH,root,172.24.166.153,10.146.124.73,52506,22
Upvotes: 0
Reputation: 84551
You can use awk
along with its builtin gensub
, mktime
and sub
to accomplish the same thing, e.g.
$ awk -F, '{ str=$0; sub($1, mktime(gensub(/[-:]/, " ", "g", $1)), str); print str }' file
1415606403,LOGIN SUCCESS,AUTH,user2,192.168.203.63,10.146.124.73,59996,22
1415606407,LOGIN SUCCESS,AUTH,user1,172.24.31.10,172.32.1.1,48191,22
1415606411,LOGIN FAILED,AUTH,root,172.24.166.153,10.146.124.73,52506,22
Explanation:
gensub(/[-:]/, " ", "g", $1)
- replace all '-'
and ':'
in the first field with spaces;mktime(...)
- turn the resulting date into a timestamp;str=$0; sub($1, ..., str);
- replace the first field with the timestamp, result in str
; and finallyprint str
- print it. (note: Your timezone and my are 3-hours different. Your output should give you your desired results on your system. You can also adjust mktime
for Daylight Savings (if needed) and you can use the builtin strftime
with the utc-flag
to adjust for UTC.)
Upvotes: 1
Reputation: 6134
If your log file is large, this GNU awk
command will be much faster than a bash
loop calling the external command date
for each line:
gawk -vFS=, -vOFS=, '{ $1 = mktime(gensub("[-:]", " ", "g", $1)) } 1'
mktime
, a function (a GNU extension) that returns its parameter as a number of seconds since EPOCH.gensub
(a GNU extension) is used to turn "YYYY-MM-DD hh:mm:ss
" into "YYYY MM DD hh mm ss
" which is the format expected by mktime()
.1
is a mere TRUE condition that outputs the line.Upvotes: 2
Reputation: 140970
I guess while read loop would be the simplest. The following:
while IFS=, read -r date rest; do
printf "%s,%s\n" "$(date --date="$date" +%s)" "$rest"
done <<EOF
2014-11-10 02:00:03,LOGIN SUCCESS,AUTH,user2,192.168.203.63,10.146.124.73,59996,22
2014-11-10 02:00:07,LOGIN SUCCESS,AUTH,user1,172.24.31.10,172.32.1.1,48191,22
2014-11-10 02:00:11,LOGIN FAILED,AUTH,root,172.24.166.153,10.146.124.73,52506,22
EOF
Will output:
1415581203,LOGIN SUCCESS,AUTH,user2,192.168.203.63,10.146.124.73,59996,22
1415581207,LOGIN SUCCESS,AUTH,user1,172.24.31.10,172.32.1.1,48191,22
1415581211,LOGIN FAILED,AUTH,root,172.24.166.153,10.146.124.73,52506,22
Upvotes: 0