Reputation: 124
I'm calculating time differences, in seconds, between buses' expected and actual stop times.
My problem looks like this:
import pandas as pd
# creating data
d = {
'time_A': ['2022-08-30 06:21:00', '2022-08-30 16:41:00'],
'time_B': ['2022-08-30 06:21:09', '2022-08-30 16:40:16'],
}
# creating DataFrame
my_df = pd.DataFrame(d)
my_df['time_A'] = pd.to_datetime(my_df['time_A'])
my_df['time_B'] = pd.to_datetime(my_df['time_B'])
# subtracting times
my_df['difference'] = my_df['time_B'] - my_df['time_A']
my_df
result:
time_A time_B difference
0 2022-08-30 06:21:00 2022-08-30 06:21:09 0 days 00:00:09
1 2022-08-30 16:41:00 2022-08-30 16:40:16 -1 days +23:59:16
I don't understand why the difference between today 16:40:16 and today 16:41:00 is -1 days +23:59:16.
If I do this:
my_df['difference'] = (my_df['time_B'] - my_df['time_A']).dt.seconds
Then I get
time_A time_B difference
0 2022-08-30 06:21:00 2022-08-30 06:21:09 9
1 2022-08-30 16:41:00 2022-08-30 16:40:16 86356
I would like the "difference" cell on row 0 to display something like "+9", and the one below to display "-44". How do I do this? Thanks!
Upvotes: 0
Views: 36
Reputation: 36450
Subtracting datetime.datetime objects gives datetime.timedelta objects, which are represented that way. Use .total_seconds() to get the numeric value in seconds. Consider the following simple example:
import datetime
import pandas as pd
df = pd.DataFrame({"schedule":pd.to_datetime(["2000-01-01 12:00:00"]),"actual":pd.to_datetime(["2000-01-01 12:00:05"])})
df['difference_sec'] = (df['schedule'] - df['actual']).apply(datetime.timedelta.total_seconds)
print(df)
output
schedule actual difference_sec
0 2000-01-01 12:00:00 2000-01-01 12:00:05 -5.0
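Applied to the data from the question, the same idea could look roughly like the sketch below. It uses the vectorized .dt.total_seconds() accessor instead of .apply; the difference_signed column name and the "+9" / "-44" string formatting are only my assumptions about the desired display, not something pandas does for you.

```python
import pandas as pd

# Data copied from the question
my_df = pd.DataFrame({
    "time_A": pd.to_datetime(["2022-08-30 06:21:00", "2022-08-30 16:41:00"]),
    "time_B": pd.to_datetime(["2022-08-30 06:21:09", "2022-08-30 16:40:16"]),
})

# .dt.total_seconds() gives the whole signed duration in seconds (as floats),
# whereas .dt.seconds gives only the non-negative seconds component
diff_sec = (my_df["time_B"] - my_df["time_A"]).dt.total_seconds()
my_df["difference"] = diff_sec.astype(int)

# Hypothetical extra column: explicitly signed strings such as "+9" and "-44"
my_df["difference_signed"] = my_df["difference"].map(lambda s: f"{int(s):+d}")

print(my_df[["difference", "difference_signed"]])
```

Keeping the numeric column next to the string column means you can still sort or aggregate on the real values.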
Note that this is a feature of datetime.timedelta; it is not specific to pandas.
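To see this without pandas, here is a small standard-library sketch using the second row of the question's data (the variable names are mine). A timedelta is normalized so that only the days part can be negative, which is why -44 seconds shows up as -1 day plus 86356 seconds.

```python
from datetime import datetime

expected = datetime(2022, 8, 30, 16, 41, 0)
actual = datetime(2022, 8, 30, 16, 40, 16)
delta = actual - expected

# Only the days attribute may be negative; seconds is always in [0, 86400)
print(delta.days, delta.seconds)  # -1 86356
print(delta)                      # -1 day, 23:59:16

# total_seconds() collapses all components into one signed number
print(delta.total_seconds())      # -44.0
```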
Upvotes: 2