Reputation: 157
I have a file containing about 1000 lines that are pretty much like this:
0,23423423,7ds5dsfdf,2008-08-03,19:00:01,101,hJ890
1,54645645,f9g8f9gd7,2008-08-03,19:00:20,113,Lg78s
1,54645645,f9g8f9gd7,2008-08-03,19:00:09,108,Lg78s
0,54645645,f9g8f9gd7,2008-08-03,19:00:01,130,dsf98
1,54645645,f9g8f9gd7,2008-08-03,19:00:20,105,Lg78s
The column after the time represents a number of seconds. How can I compute the total number of seconds for each date in the file, sorted from the earliest date to the latest? For example, I should get something like:
The date Sun Aug 3 19:00:01 EEST 2008 has 231 seconds
The date Sun Aug 3 19:00:09 EEST 2008 has 108 seconds
The date Sun Aug 3 19:00:20 EEST 2008 has 218 seconds
I tried something like this:
while read line
do
date=awk -F "," '{print $4","$5}'
var=grep "$date"
done
After I find an instance of a given date, how can I select the number of seconds corresponding to it?
Thanks!
Upvotes: 1
Views: 198
Reputation: 133518
Could you please try the following awk command and let me know if it helps? I will add a non-one-liner form of it shortly.
awk -F, '{s=$4 " " $5; gsub(/[:-]/, " ", s); t=mktime(s); dt=strftime("%c", t); a[t]=dt; b[t]+=$6} END{for(i in a) print a[i] " has " b[i] " seconds"}' Input_file
Thanks to Anubhava for correcting my code.
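A multi-line form of the same idea might look like the sketch below. It assumes an awk that provides mktime() and strftime() (these are GNU awk extensions, not POSIX awk), and it pipes the result through sort and cut so the dates come out in ascending order. The temporary file and the names label, total, and report are only illustrative.

```shell
# Sample data copied from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
0,23423423,7ds5dsfdf,2008-08-03,19:00:01,101,hJ890
1,54645645,f9g8f9gd7,2008-08-03,19:00:20,113,Lg78s
1,54645645,f9g8f9gd7,2008-08-03,19:00:09,108,Lg78s
0,54645645,f9g8f9gd7,2008-08-03,19:00:01,130,dsf98
1,54645645,f9g8f9gd7,2008-08-03,19:00:20,105,Lg78s
EOF

report=$(awk -F, '
{
    s = $4 " " $5             # e.g. "2008-08-03 19:00:01"
    gsub(/[:-]/, " ", s)      # mktime() expects "YYYY MM DD HH MM SS"
    t = mktime(s)             # seconds since the epoch, in the local time zone
    label[t] = strftime("%c", t)
    total[t] += $6            # sum the seconds column per timestamp
}
END {
    # Prefix each line with the epoch value so sort can order it numerically.
    for (t in total)
        print t ";" label[t] " has " total[t] " seconds"
}' "$input" | sort -t ';' -k1,1n | cut -d ';' -f2-)

printf '%s\n' "$report"
rm -f "$input"
```

The exact date text depends on the locale and time zone of the machine, but the totals should come out as 231, 108, and 218 seconds, in that order.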
Upvotes: 2
Reputation: 785156
You can use this awk:
awk -F, '{cmd="date -d \"" $4 " " $5 "\""; cmd | getline dt; close(cmd); a[dt] += $6}
END{for (i in a) print i " has " a[i] " seconds"}' file
Sun Aug 3 19:00:09 EDT 2008 has 108 seconds
Sun Aug 3 19:00:20 EDT 2008 has 218 seconds
Sun Aug 3 19:00:01 EDT 2008 has 231 seconds
This awk command:
- uses a comma as the input field separator.
- constructs a date string from the 4th and 5th columns.
- uses an associative array whose key is the date and whose value is the sum of the seconds column.
Reference: Effective AWK Programming
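To see the associative-array step from the list above in isolation, here is a minimal sketch that skips the call to date and uses the raw date and time columns as the key. It relies on the fact that timestamps in YYYY-MM-DD HH:MM:SS form sort chronologically as plain text, so an ordinary sort is enough. The temporary file and the names total and summary are only illustrative.

```shell
# Sample data copied from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
0,23423423,7ds5dsfdf,2008-08-03,19:00:01,101,hJ890
1,54645645,f9g8f9gd7,2008-08-03,19:00:20,113,Lg78s
1,54645645,f9g8f9gd7,2008-08-03,19:00:09,108,Lg78s
0,54645645,f9g8f9gd7,2008-08-03,19:00:01,130,dsf98
1,54645645,f9g8f9gd7,2008-08-03,19:00:20,105,Lg78s
EOF

summary=$(awk -F, '
# Key: "date time" from columns 4 and 5. Value: running sum of column 6.
{ total[$4 " " $5] += $6 }
END {
    # Iteration order over an awk array is unspecified, hence the sort below.
    for (k in total)
        print k " has " total[k] " seconds"
}' "$sample" | sort)

printf '%s\n' "$summary"
rm -f "$sample"
```

With the five sample lines this prints:

2008-08-03 19:00:01 has 231 seconds
2008-08-03 19:00:09 has 108 seconds
2008-08-03 19:00:20 has 218 seconds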
If you want the dates sorted, combine awk with sort and cut, like this:
awk -F, '{s=$4 " " $5; cmd="date -d \"" s "\""; cmd | getline dt; close(cmd);
a[dt] += $6; b[dt]=s} END{for (i in a) print b[i] ";" i " has " a[i] " seconds"}' file |
sort -t ';' -k1,1 |
cut -d ';' -f2-
Sun Aug 3 19:00:01 EDT 2008 has 231 seconds
Sun Aug 3 19:00:09 EDT 2008 has 108 seconds
Sun Aug 3 19:00:20 EDT 2008 has 218 seconds
Upvotes: 4