Debayan

Reputation: 854

Replace date fields in pandas with date values from next columns

I have a DataFrame with multiple date columns, as follows:

import pandas as pd

df = pd.DataFrame({
    'Date1': ['2017-12-14', '2017-12-14', '2017-12-14', '2017-12-15', '2017-12-14', '2017-12-14', '2017-12-14'],
    'Date2': ['2018-1-17', "NaT","NaT","NaT","NaT","NaT","NaT"],
    'Date3': ['2018-2-15',"NaT","NaT",'2018-4-1','NaT','NaT','2018-4-1'],
    'Date4': ['2018-3-11','2018-4-1','2018-4-1',"NaT",'2018-4-1','2018-4-2',"NaT"]})
df


Date1       Date2       Date3       Date4
2017-12-14  2018-1-17   2018-2-15   2018-3-11
2017-12-14  NaT         NaT         2018-4-1
2017-12-14  NaT         NaT         2018-4-1
2017-12-15  NaT         2018-4-1    NaT
2017-12-14  NaT         NaT         2018-4-1
2017-12-14  NaT         NaT         2018-4-2
2017-12-14  NaT         2018-4-1    NaT

As you can see, there are many empty date values, which I need to fill with the date from the next column to the right that has one.

Expected Output:

Date1       Date2       Date3       Date4
2017-12-14  2018-1-17   2018-2-15   2018-3-11
2017-12-14  2018-4-1    2018-4-1    2018-4-1
2017-12-14  2018-4-1    2018-4-1    2018-4-1
2017-12-15  2018-4-1    2018-4-1    NaT
2017-12-14  2018-4-1    2018-4-1    2018-4-1
2017-12-14  2018-4-2    2018-4-2    2018-4-2
2017-12-14  2018-4-1    2018-4-1    NaT

Please note: the last column can remain NaT.

I have tried the bfill method, without success:

df.bfill(axis=1)

Upvotes: 1

Views: 625

Answers (1)

jezrael

Reputation: 862406

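In your sample the columns hold strings, so "NaT" is ordinary text rather than a missing value, and bfill has nothing to fill. Convert the values to datetimes if necessary, then back fill the missing values (NaT) along each row: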

df = df.apply(pd.to_datetime).bfill(axis=1)
print(df)
       Date1      Date2      Date3      Date4
0 2017-12-14 2018-01-17 2018-02-15 2018-03-11
1 2017-12-14 2018-04-01 2018-04-01 2018-04-01
2 2017-12-14 2018-04-01 2018-04-01 2018-04-01
3 2017-12-15 2018-04-01 2018-04-01        NaT
4 2017-12-14 2018-04-01 2018-04-01 2018-04-01
5 2017-12-14 2018-04-02 2018-04-02 2018-04-02
6 2017-12-14 2018-04-01 2018-04-01        NaT
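
To see why the conversion matters, here is a minimal sketch on a smaller frame of the same shape. The two-row data and the names unchanged, parsed and filled are only for illustration; the dates are zero-padded here to keep parsing simple.

```python
import pandas as pd

# Illustrative data: missing dates are the *string* "NaT", as in the question.
df = pd.DataFrame({
    'Date1': ['2017-12-14', '2017-12-15'],
    'Date2': ['NaT', 'NaT'],
    'Date3': ['NaT', '2018-04-01'],
    'Date4': ['2018-04-01', 'NaT'],
})

# On string columns "NaT" is just text, so bfill leaves the frame as it was.
unchanged = df.bfill(axis=1)

# pd.to_datetime turns the "NaT" strings into real missing values ...
parsed = df.apply(pd.to_datetime)

# ... which bfill can then fill from the next non-missing column in the row.
filled = parsed.bfill(axis=1)

print(unchanged.equals(df))
print(filled)
```

The last column has nothing to its right, so a missing value there stays NaT, which matches what you asked for.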

If the DataFrame has other columns as well, specify the date columns in a list:

cols = ['Date1', 'Date2', 'Date3', 'Date4']
# or select every column whose name contains 'Date'
# cols = df.filter(like='Date').columns
df[cols] = df[cols].apply(pd.to_datetime).bfill(axis=1)
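
As a rough sketch of the subset version, assume the frame also has a non-date column. The id column and its values are hypothetical and not part of the question; the point is only that it is left alone.

```python
import pandas as pd

# Hypothetical frame: 'id' is an extra non-date column added for illustration.
df = pd.DataFrame({
    'id': [10, 20],
    'Date1': ['2017-12-14', '2017-12-15'],
    'Date2': ['NaT', 'NaT'],
    'Date3': ['2018-04-01', 'NaT'],
})

# Pick only the columns whose name contains 'Date'.
cols = df.filter(like='Date').columns

# Parse and back fill those columns only; 'id' is not touched.
df[cols] = df[cols].apply(pd.to_datetime).bfill(axis=1)

print(df)
```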

Upvotes: 1
