I.M.
I.M.

Reputation: 344

Manipulating values not matching HH:MM:SS format - python

I have a dataframe that looks like the following:

    arrival     departure
0   23:55:00    23:57:00
1   23:57:00    23:59:00
2   23:59:00    24:01:00
3   24:01:00    24:03:00
4   24:03:00    24:05:00

I am working with data that cover a whole day and part of the day after. Data are (most of the time) in the HH:MM:SS format. However some time values are higher than 23:59:59 and go up to 27:00:00.

I would like to get the time difference between departure and arrival columns.

I tried using datetime to do that but I guess something went wrong:

FMT = '%H:%M:%S'

delta = datetime.strptime(df['departure'], FMT) - datetime.strptime(df['arrival'], FMT)

Which raises the following error:

ValueError: time data '24:01:00' does not match format '%H:%M:%S'

Is there a way to get the time difference between these two columns even though their format do not always match the HH:MM:SS format?

Upvotes: 1

Views: 213

Answers (1)

wiesson
wiesson

Reputation: 6822

You could use timedelta from datetime

import datetime

delta1 = datetime.timedelta(hours=23, minutes=59, seconds=0)
delta2 = datetime.timedelta(hours=24, minutes=01, seconds=0)
timedelta = delta2 - delta1
>>> timedelta # or timedelta.to_seconds()
datetime.timedelta(seconds=120)

Give you the delta in seconds. Full example:

import datetime

arrival = "24:01:00"
departure = "24:03:00"


def get_time_from_string(t):
    return dict(
        zip(["hours", "minutes", "seconds"], list(map(lambda x: int(x), t.split(":"))),)
    )


delta1 = datetime.timedelta(**get_time_from_string(arrival))
delta2 = datetime.timedelta(**get_time_from_string(departure))

delta = delta2 - delta1
print(delta.total_seconds())

Upvotes: 2

Related Questions