Reputation: 151
I have a dataframe df that contains (among others) three columns: day, timeTo and goesOverDateBoundary.
day is a date variable, while timeTo is a string in the format %H:%M (e.g. 18:00). goesOverDateBoundary is a boolean.
I'd like to "add" timeTo to day where goesOverDateBoundary is False, and I'd like to add timeTo to day + 1 where goesOverDateBoundary is True.
By "add" I mean that day 31-12-2020 and timeTo 18:00 become the datetime object 31-12-2020 18:00. Likewise, if goesOverDateBoundary is True, the resulting object should be 01-01-2021 18:00.
I considered iterating over the rows in my dataframe but it says here that you should "never" do that.
Example data:
+------------+--------+----------------------+------------------+
| day | timeTo | goesOverDateBoundary | resultIdLike |
+------------+--------+----------------------+------------------+
| 31-12-2020 | 18:00 | 0 | 31-12-2020 18:00 |
| 31-12-2020 | 18:00 | 1 | 01-01-2021 18:00 |
| 10-09-2020 | 03:00 | 0 | 10-09-2020 03:00 |
| 10-09-2020 | 03:00 | 1 | 11-09-2020 03:00 |
+------------+--------+----------------------+------------------+
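For anyone who wants to reproduce this, the table above could be built roughly as follows. This is only a sketch: it assumes the dates start out as day-first strings and that the flag is stored as 0/1, as shown in the table; the resultIdLike column is left out because it is the desired output.

```python
import pandas as pd

# Example rows copied from the table above
df = pd.DataFrame({
    "day": ["31-12-2020", "31-12-2020", "10-09-2020", "10-09-2020"],
    "timeTo": ["18:00", "18:00", "03:00", "03:00"],
    "goesOverDateBoundary": [0, 1, 0, 1],
})

# Assumption: day is parsed from DD-MM-YYYY strings into datetime64 values
df["day"] = pd.to_datetime(df["day"], format="%d-%m-%Y")
```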
Upvotes: 3
Views: 61
Reputation: 862581
Convert the day column to datetimes with to_datetime and the times to timedeltas with to_timedelta, then sum them. Use Series.mask to add 1 day where goesOverDateBoundary is set, converting that column to boolean first (1 -> True, 0 -> False):
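import pandas as pd
# Parse day-first date strings (e.g. 31-12-2020) into datetimes
df['day'] = pd.to_datetime(df['day'], dayfirst=True)
# Append seconds so that HH:MM strings parse as timedeltas
df['timeTo'] = pd.to_timedelta(df['timeTo'] + ':00')
s = df['day'] + df['timeTo']
# Where the flag is truthy, use the value shifted by one day instead
df['new'] = s.mask(df['goesOverDateBoundary'].astype(bool), s + pd.Timedelta(1, unit='d'))
print(df)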
day timeTo goesOverDateBoundary resultIdLike \
0 2020-12-31 0 days 18:00:00 0 31-12-2020 18:00
1 2020-12-31 0 days 18:00:00 1 01-01-2021 18:00
2 2020-09-10 0 days 03:00:00 0 10-09-2020 03:00
3 2020-09-10 0 days 03:00:00 1 11-09-2020 03:00
new
0 2020-12-31 18:00:00
1 2021-01-01 18:00:00
2 2020-09-10 03:00:00
3 2020-09-11 03:00:00
Alternative if you need to keep the original timeTo column:
df['day'] = pd.to_datetime(df['day'], dayfirst=True)
# Keep the timedeltas in a separate Series so df['timeTo'] stays a string
td = pd.to_timedelta(df['timeTo'] + ':00')
s = df['day'] + td
df['new'] = s.mask(df['goesOverDateBoundary'].astype(bool), s + pd.Timedelta(1, unit='d'))
print(df)
day timeTo goesOverDateBoundary resultIdLike \
0 2020-12-31 18:00 0 31-12-2020 18:00
1 2020-12-31 18:00 1 01-01-2021 18:00
2 2020-09-10 03:00 0 10-09-2020 03:00
3 2020-09-10 03:00 1 11-09-2020 03:00
new
0 2020-12-31 18:00:00
1 2021-01-01 18:00:00
2 2020-09-10 03:00:00
3 2020-09-11 03:00:00
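A possible variation, not part of the approach above: since the flag is 0 or 1, it could itself be turned into a timedelta of that many days and added directly, which avoids the mask step. This is a sketch under the assumption that goesOverDateBoundary only ever holds 0/1 or False/True; the names time_to and extra_day are illustrative only.

```python
import pandas as pd

# Example rows from the question
df = pd.DataFrame({
    "day": ["31-12-2020", "31-12-2020", "10-09-2020", "10-09-2020"],
    "timeTo": ["18:00", "18:00", "03:00", "03:00"],
    "goesOverDateBoundary": [0, 1, 0, 1],
})

day = pd.to_datetime(df["day"], format="%d-%m-%Y")
# Append seconds so that HH:MM strings parse as timedeltas
time_to = pd.to_timedelta(df["timeTo"] + ":00")
# Assumption: the flag is 0/1 (or bool), so it maps to 0 or 1 extra days
extra_day = pd.to_timedelta(df["goesOverDateBoundary"].astype(int), unit="D")

df["new"] = day + time_to + extra_day
print(df)
```

Both versions are vectorized, so neither needs to iterate over rows.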
Upvotes: 2