Reputation: 585
I have a CSV data file with two timestamp fields, start_time and end_time. They are strings in the form "2014-02-01 00:06:22". Each line of the file is a record with multiple fields. The file is pretty small.
I want to calculate the average duration (end_time minus start_time) across all records. Other than writing a shell script, is there a one-liner I could use for this kind of simple calculation, possibly with awk?
I'm very new to awk. Here's what I have, but it does not work ($6 and $7 are the start_time and end_time fields):
awk -F, 'BEGIN { count=0; total=0 }
{ sec1=date +%s -d $6; sec2=date +%s -d $7; total+=sec2-sec1; count++ }
END { print "avg trip time: ", total/count }' dataset.csv
Sample of the CSV file:
"start_time","stop_time","start station name","end station name","bike_id"
"2014-02-01 00:00:00","2014-02-01 00:06:22","Washington Square E","Stanton St & Chrystie St","21101"
Upvotes: 0
Views: 354
Reputation: 203334
Using GNU awk for mktime() and gensub():
$ cat tst.awk
BEGIN { FS="^\"|\",\"" }
function t2s(time) { return mktime(gensub(/[-:]/," ","g",time)) }
NR>1 { totDurs += (t2s($3) - t2s($2)) }
END { print totDurs / (NR-1) }
$ gawk -f tst.awk file
382
With other awks you need to call the shell date command instead:
$ cat tst2.awk
BEGIN { FS="^\"|\",\"" }
function t2s(time,   cmd,secs) {
    cmd = "date +%s -d \"" time "\""
    if ( (cmd | getline secs) <= 0 ) {
        secs = -1
    }
    close(cmd)
    return secs
}
NR>1 { totDurs += (t2s($3) - t2s($2)) }
END { print totDurs / (NR-1) }
$ awk -f tst2.awk file
382
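The cmd | getline construct used in t2s() is plain POSIX awk, so it works in any awk; a minimal standalone sketch of the pattern (assuming GNU date with -d):

```shell
# Minimal sketch of the cmd | getline pattern (assumes GNU date supporting -d).
awk 'BEGIN {
    cmd = "date +%s -d \"2014-02-01 00:06:22\""
    if ( (cmd | getline secs) <= 0 ) secs = -1   # getline returns <= 0 on failure
    close(cmd)                                   # close so the command can be rerun
    print secs                                   # epoch seconds (timezone-dependent)
}'
```

Note that this spawns one date process per timestamp, which is why the gawk mktime() version is preferable for anything but small files.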
Upvotes: 1