Reputation: 37
I've just started my first internship and have already been given a task that is giving me trouble. I have a dataframe with a "complete_time" column and a "cycle_time" column, where cycle_time is the run time in %H:%M:%S.%f format.
I need to calculate the start time (with date) by subtracting the run time from the end time. I tried the datetime library, but that was unsuccessful because it wants both columns in datetime format, and the run time has no date, nor should it.
The cycle column was converted from a decimal string that originally looked like 25.2 (in seconds) using:
df['cycle_time'] = df['Cycle Time'].astype('float64')
df['cycle_time'] = pd.to_datetime(df['cycle_time'],unit='s')
df['cycle_time'] = pd.Series([val.time() for val in df['cycle_time']])
Here is the dataframe:
complete_time cycle_time
0 2018-05-07 17:12:34.220 00:00:25.200000
1 2018-05-07 17:12:37.807 00:00:00
2 2018-05-07 17:12:43.453 00:00:25.200000
3 2018-05-07 17:12:51.193 00:00:25.100000
4 2018-05-07 17:12:52.223 00:00:25.300000
5 2018-05-07 17:12:54.297 00:00:00
6 2018-05-07 17:12:59.430 00:00:25.200000
7 2018-05-07 17:13:03.047 00:00:00
8 2018-05-07 17:13:08.697 00:00:25.200000
9 2018-05-07 17:13:16.417 00:00:25.200000
I want to add the start_time as a new column in the dataframe.
Thanks in advance for any direction.
Upvotes: 1
Views: 481
Reputation: 150735
Use pd.to_timedelta() instead of pd.to_datetime:
df['cycle_time'] = pd.to_timedelta(df['Cycle Time'].astype(float), unit='s')
df['start_time'] = df['complete_time'] - df['cycle_time']
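To see the two lines above work end to end, here is a minimal sketch on a few rows copied from the question. The raw 'Cycle Time' strings are an assumption: the question only shows "25.2", so "0" is a guess at how the zero-length cycles were stored. The cycle_time_obj column and the from_time_objects name are hypothetical, added only to show one way to recover a duration if the column has already been turned into datetime.time objects.

```python
import pandas as pd

# Three rows copied from the question. The raw 'Cycle Time' strings are
# assumed: only "25.2" is shown in the question, "0" is a guess.
df = pd.DataFrame({
    "complete_time": pd.to_datetime([
        "2018-05-07 17:12:34.220",
        "2018-05-07 17:12:37.807",
        "2018-05-07 17:12:43.453",
    ]),
    "Cycle Time": ["25.2", "0", "25.2"],
})

# A run time is a duration, so store it as a timedelta, not a time of day.
df["cycle_time"] = pd.to_timedelta(df["Cycle Time"].astype(float), unit="s")

# datetime64 minus timedelta64 gives datetime64, so the date is kept.
df["start_time"] = df["complete_time"] - df["cycle_time"]

# Hypothetical fallback: if the column already holds datetime.time objects
# (as produced by the .time() conversion in the question), their string form
# "HH:MM:SS.ffffff" can be parsed back into a timedelta.
df["cycle_time_obj"] = [(pd.Timestamp(0) + td).time() for td in df["cycle_time"]]
from_time_objects = pd.to_timedelta(df["cycle_time_obj"].map(str))

print(df[["complete_time", "cycle_time", "start_time"]])
```

Keeping cycle_time as a timedelta also means later arithmetic (sums, means, comparisons) works directly on the column, which is not the case for datetime.time objects.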
Upvotes: 1