Reputation: 854
I have a DataFrame with several date columns:
import pandas as pd

df = pd.DataFrame({
'Date1': ['2017-12-14', '2017-12-14', '2017-12-14', '2017-12-15', '2017-12-14', '2017-12-14', '2017-12-14'],
'Date2': ['2018-1-17', "NaT","NaT","NaT","NaT","NaT","NaT"],
'Date3': ['2018-2-15',"NaT","NaT",'2018-4-1','NaT','NaT','2018-4-1'],
'Date4': ['2018-3-11','2018-4-1','2018-4-1',"NaT",'2018-4-1','2018-4-2',"NaT"]})
df
Date1 | Date2 | Date3 | Date4 |
---|---|---|---|
2017-12-14 | 2018-1-17 | 2018-2-15 | 2018-3-11 |
2017-12-14 | NaT | NaT | 2018-4-1 |
2017-12-14 | NaT | NaT | 2018-4-1 |
2017-12-15 | NaT | 2018-4-1 | NaT |
2017-12-14 | NaT | NaT | 2018-4-1 |
2017-12-14 | NaT | NaT | 2018-4-2 |
2017-12-14 | NaT | 2018-4-1 | NaT |
As you can see, many date values are missing (NaT). I need to fill each one with the next available date to its right in the same row.
Expected Output:
Date1 | Date2 | Date3 | Date4 |
---|---|---|---|
2017-12-14 | 2018-1-17 | 2018-2-15 | 2018-3-11 |
2017-12-14 | 2018-4-1 | 2018-4-1 | 2018-4-1 |
2017-12-14 | 2018-4-1 | 2018-4-1 | 2018-4-1 |
2017-12-15 | 2018-4-1 | 2018-4-1 | NaT |
2017-12-14 | 2018-4-1 | 2018-4-1 | 2018-4-1 |
2017-12-14 | 2018-4-2 | 2018-4-2 | 2018-4-2 |
2017-12-14 | 2018-4-1 | 2018-4-1 | NaT |
Note: the last column can remain NaT.
I tried the bfill method, without success:
df.bfill(axis=1)
Upvotes: 1
Views: 625
Reputation: 862406
Convert the values to datetimes if necessary, then back fill the missing values (NaT) along each row:
df = df.apply(pd.to_datetime).bfill(axis=1)
print(df)
Date1 Date2 Date3 Date4
0 2017-12-14 2018-01-17 2018-02-15 2018-03-11
1 2017-12-14 2018-04-01 2018-04-01 2018-04-01
2 2017-12-14 2018-04-01 2018-04-01 2018-04-01
3 2017-12-15 2018-04-01 2018-04-01 NaT
4 2017-12-14 2018-04-01 2018-04-01 2018-04-01
5 2017-12-14 2018-04-02 2018-04-02 2018-04-02
6 2017-12-14 2018-04-01 2018-04-01 NaT
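A likely reason the plain `df.bfill(axis=1)` in the question changed nothing is that the frame holds the text "NaT" rather than real missing values, so there was nothing for bfill to fill. The sketch below uses a shortened, zero-padded copy of the question's data (the variable names are only illustrative) to show the count of missing cells before and after conversion:

```python
import pandas as pd

# Shortened copy of the question's data; "NaT" here is a plain string.
raw = pd.DataFrame({
    "Date1": ["2017-12-14", "2017-12-14", "2017-12-15"],
    "Date2": ["2018-01-17", "NaT", "NaT"],
    "Date3": ["2018-02-15", "NaT", "2018-04-01"],
    "Date4": ["2018-03-11", "2018-04-01", "NaT"],
})

# Strings are not counted as missing, so bfill has nothing to fill.
missing_before = int(raw.isna().sum().sum())

# pd.to_datetime parses the "NaT" strings into real NaT values.
converted = raw.apply(pd.to_datetime)
missing_after = int(converted.isna().sum().sum())

# Now back filling along each row works as intended.
filled = converted.bfill(axis=1)

print(missing_before, missing_after)
print(filled)
```

The last column stays NaT wherever there is no later column to take a value from, which matches the expected output in the question.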
If the DataFrame has other columns as well and only some need this, specify them in a list:
cols = ['Date1', 'Date2', 'Date3', 'Date4']
# or select the columns whose names contain 'Date'
#cols = df.filter(like='Date').columns
df[cols] = df[cols].apply(pd.to_datetime).bfill(axis=1)
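As a small illustration of the column-subset variant, the sketch below adds a hypothetical non-date column `Name` (not part of the question) and restricts the conversion and back fill to the columns whose names contain "Date", so the other column is left untouched:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["a", "b"],  # hypothetical non-date column, for illustration only
    "Date1": ["2017-12-14", "2017-12-15"],
    "Date2": ["2018-01-17", "NaT"],
    "Date3": ["NaT", "2018-04-01"],
})

# Pick only the columns whose names contain "Date".
cols = df.filter(like="Date").columns

# Convert and back fill just those columns; "Name" is not involved.
df[cols] = df[cols].apply(pd.to_datetime).bfill(axis=1)

print(df)
```

Limiting the operation this way avoids passing non-date text to `pd.to_datetime` and keeps values from unrelated columns out of the back fill.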
Upvotes: 1