Reputation: 123
I have a data frame with information from a broker, the data show the distance traveled, time spent and the date of the race.
DISTANCE TIME DATE
9.8 51:50.92 09/08/18
11.1 55:59.70 24/08/18
11.1 49:25.20 29/08/18
I would like to perform an arithmetic operation between the records in the "TIME" column as differences between the times, calculate the average speed of each race.
When transforming the text string to pandas datetime it remains with the following format:
1900-01-01 00:51:50
1900-01-01 00:55:59
1900-01-01 00:49:25
What I need is the time so I do the following
df['TIME'] = pd.to_datetime(df['TIME'], format = '%M:%S')
Which leads me to the next problem, the column is now of type "object" and I can't perform operations of any kind on it.
How can I transform a column to time without losing the datetime format? and perform operations with it.
I would like, for example, to calculate speed, differences between times and frequencies between races.
Upvotes: 0
Views: 1460
Reputation: 141
Supposing that Distance is km, use this to get your speed in km/h:
import pandas as pd
df=pd.DataFrame({'DISTANCE':[9.5,11.1,11.1],'TIME':['51:50.92','55:59.70','49:25.20'],'DATE':['09/08/18','24/08/18','29/08/18']})
df['TIME'] = '00:' + df['TIME']
df['TIME'] = pd.to_timedelta(df['TIME'])
df['SPEED (km/h)']=[df['DISTANCE'][i]/(df['TIME'][i].total_seconds()/3600) for i in df.index]#total_seconds() won't work with a pandas column so you need to apply it to each TIME cell
Upvotes: 1
Reputation: 374
Instead of to_datetime
, you could use to_timedelta
. See below:
df['TIME'] = '00:' + df['TIME']
df['TIME'] = pd.to_timedelta(df['TIME'])
The result of this is:
DISTANCE TIME DATE
0 9.8 00:51:50.920000 09/08/18
1 11.1 00:55:59.700000 24/08/18
2 11.1 00:49:25.200000 29/08/18
You can now use mathematical operations on TIME
. However, as mentioned in Phung Duy Phong's comment above, it's hard to say whether this will solve your issues without seeing the code you'd like to run.
Upvotes: 2