kiesel

Reputation: 151

Add time from column to date-column on condition from third column

I have a dataframe df that contains (among others) three columns: day, timeTo and goesOverDateBoundary.

day is a date variable, while timeTo is a string in the format %H:%M (e.g. 18:00). goesOverDateBoundary is a boolean.

I'd like to "add" timeTo to day where goesOverDateBoundary is False and I'd like to add timeTo to day + 1 where goesOverDateBoundary is True.

By "add" I mean that day 31-12-2020 and timeTo 18:00 become the datetime object 31-12-2020 18:00. Likewise, if goesOverDateBoundary is True, the resulting object should be 01-01-2021 18:00.

I considered iterating over the rows in my dataframe but it says here that you should "never" do that.

Example data:

+------------+--------+----------------------+------------------+
|    day     | timeTo | goesOverDateBoundary |   resultIdLike   |
+------------+--------+----------------------+------------------+
| 31-12-2020 | 18:00  |                    0 | 31-12-2020 18:00 |
| 31-12-2020 | 18:00  |                    1 | 01-01-2021 18:00 |
| 10-09-2020 | 03:00  |                    0 | 10-09-2020 03:00 |
| 10-09-2020 | 03:00  |                    1 | 11-09-2020 03:00 |
+------------+--------+----------------------+------------------+
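
For anyone who wants to reproduce this, the table above could be built roughly as follows. This is only a minimal sketch: the values are copied from the table, and parsing day with the explicit format %d-%m-%Y is an assumption about how the dates are written.

```python
import pandas as pd

# Values copied from the example table in the question.
df = pd.DataFrame({
    "day": ["31-12-2020", "31-12-2020", "10-09-2020", "10-09-2020"],
    "timeTo": ["18:00", "18:00", "03:00", "03:00"],
    "goesOverDateBoundary": [0, 1, 0, 1],
    "resultIdLike": ["31-12-2020 18:00", "01-01-2021 18:00",
                     "10-09-2020 03:00", "11-09-2020 03:00"],
})

# day is described as a date variable, so parse it (day first, then month).
df["day"] = pd.to_datetime(df["day"], format="%d-%m-%Y")
```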

Upvotes: 3

Views: 61

Answers (1)

jezrael

Reputation: 862581

Convert the day column to datetimes with to_datetime and the times to timedeltas with to_timedelta, then sum them. Use Series.mask to add 1 day where goesOverDateBoundary is true; casting that column to boolean maps 1 to True and 0 to False:

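import pandas as pd

# Parse day-first dates; append seconds so HH:MM parses as a timedelta
df['day'] = pd.to_datetime(df['day'], dayfirst=True)
df['timeTo'] = pd.to_timedelta(df['timeTo'] + ':00')

# Where the flag is True, replace the sum with the sum plus one day
s = df['day'] + df['timeTo']
df['new'] = s.mask(df['goesOverDateBoundary'].astype(bool), s + pd.Timedelta(days=1))

print(df)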
         day          timeTo  goesOverDateBoundary      resultIdLike  \
0 2020-12-31 0 days 18:00:00                     0  31-12-2020 18:00   
1 2020-12-31 0 days 18:00:00                     1  01-01-2021 18:00   
2 2020-09-10 0 days 03:00:00                     0  10-09-2020 03:00   
3 2020-09-10 0 days 03:00:00                     1  11-09-2020 03:00   

                  new  
0 2020-12-31 18:00:00  
1 2021-01-01 18:00:00  
2 2020-09-10 03:00:00  
3 2020-09-11 03:00:00  

Alternatively, if you need to keep the original timeTo column:

df['day'] = pd.to_datetime(df['day'], dayfirst=True)
# Keep the timedeltas in a separate Series so timeTo stays a string
td = pd.to_timedelta(df['timeTo'] + ':00')

s = df['day'] + td
df['new'] = s.mask(df['goesOverDateBoundary'].astype(bool), s + pd.Timedelta(days=1))

print(df)
         day timeTo  goesOverDateBoundary      resultIdLike  \
0 2020-12-31  18:00                     0  31-12-2020 18:00   
1 2020-12-31  18:00                     1  01-01-2021 18:00   
2 2020-09-10  03:00                     0  10-09-2020 03:00   
3 2020-09-10  03:00                     1  11-09-2020 03:00   

                  new  
0 2020-12-31 18:00:00  
1 2021-01-01 18:00:00  
2 2020-09-10 03:00:00  
3 2020-09-11 03:00:00  
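
A variation on the same idea, offered as a sketch rather than as part of the original answer: since goesOverDateBoundary only holds 0 or 1, the flag itself can be converted to a timedelta of 0 or 1 days and added directly, without Series.mask. The sample frame and the names day, time_of_day and extra_day are assumptions made for illustration.

```python
import pandas as pd

# Sample data copied from the question's table (assumed layout).
df = pd.DataFrame({
    "day": ["31-12-2020", "31-12-2020", "10-09-2020", "10-09-2020"],
    "timeTo": ["18:00", "18:00", "03:00", "03:00"],
    "goesOverDateBoundary": [0, 1, 0, 1],
})

day = pd.to_datetime(df["day"], format="%d-%m-%Y")
# Append seconds so the HH:MM strings parse as timedeltas.
time_of_day = pd.to_timedelta(df["timeTo"] + ":00")
# Flag 0 becomes a 0-day offset, flag 1 becomes a 1-day offset.
extra_day = pd.to_timedelta(df["goesOverDateBoundary"].astype(int), unit="D")

df["new"] = day + time_of_day + extra_day
print(df)
```

This should give the same new column as the Series.mask version; which one reads better is largely a matter of taste.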

Upvotes: 2
