Haalanam

Reputation: 89

Pandas - Fill with other column if the field is NaT (Null)

This is about pandas DataFrames.

I have a dataframe that looks like this:

ID         date1            date2             date3
 0           NaT              NaT        2019-02-28
 1           NaT       2019-11-10        2019-04-09
 2           NaT              NaT        2019-05-21
 3    2019-12-06       2019-12-05        2019-12-27
 4    2019-11-18       2019-11-18        2019-12-20

I want to add another field in the dataframe in order to have an output like this:

ID        date1          date2            date3          date4
 0          NaT            NaT       2019-02-28     2019-02-28
 1          NaT     2019-11-10       2019-04-09     2019-11-10
 2          NaT            NaT       2019-05-21     2019-05-21
 3   2019-12-06     2019-12-05       2019-12-27     2019-12-06
 4   2019-11-18     2019-11-18       2019-12-20     2019-11-18

The rules, in order:

date4 is populated from date3 if date1 and date2 are not populated.

date4 is populated from date2 if date2 and date3 are populated.

date4 is populated from date1 if date1, date2 and date3 are all populated.

Do you have any idea how to achieve this?

I tried fillna, but without much success.

Upvotes: 1

Views: 1561

Answers (3)

SeanTomlinson30

Reputation: 33

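I'd personally use a loop, so that it's clear what's happening and the logic is easier to maintain. Here's an example. Note that your listed conditions for 'date4' don't cover every combination (for example, date1 populated but date2 missing), so rows that match none of them get no entry.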

# required modules
import pandas as pd

# create a data frame
d = {'ID': [0, 1, 2, 3, 4],
     'date1': [None, None, None, '2019-12-06', '2019-11-18'],
     'date2': [None, '2019-11-10', None, '2019-12-05', '2019-11-18'],
     'date3': ['2019-02-28', '2019-04-09', '2019-05-21', '2019-12-27', '2019-12-20']}

dat = pd.DataFrame(d)

# make date4
date4 = []
for index, row in dat.iterrows():
    # each field is tested separately; pd.isna/pd.notna handle None, NaN and NaT
    if pd.isna(row['date1']) and pd.isna(row['date2']):
        date4.append(row['date3'])
    # the three-date rule is checked before the two-date rule,
    # otherwise the two-date rule would always match first
    elif pd.notna(row['date1']) and pd.notna(row['date2']) and pd.notna(row['date3']):
        date4.append(row['date1'])
    elif pd.notna(row['date2']) and pd.notna(row['date3']):
        date4.append(row['date2'])
    else:
        # no listed condition matched
        date4.append(None)

# append to df
dat['date4'] = date4

Upvotes: 1

jeremy_rutman

Reputation: 5738

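You can replace missing (NaT) values like this:

import numpy as np

# take date1 where it is present, otherwise date2
df['date4']=np.where(df['date1'].isna(),df['date2'],df['date1'])
# where date4 is still missing, fall back to date3
df['date4']=np.where(df['date4'].isna(),df['date3'],df['date4'])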

There are some cases you didn't consider (for example, date1 is populated but date2 is not), but those can all be dealt with similarly.
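As a rough sketch of how such a case might be handled, the nested np.where below takes the first populated value among date1, date2 and date3. The sample rows are made up for illustration (they are not the question's data) and include a row where date1 is set but date2 is missing. It assumes "first populated column wins" is the intended rule.

```python
import numpy as np
import pandas as pd

# Illustrative data only: row 2 has date1 set but date2 missing
df = pd.DataFrame({
    "date1": pd.to_datetime([None, None, "2019-12-06", "2019-11-18"]),
    "date2": pd.to_datetime([None, "2019-11-10", None, "2019-11-18"]),
    "date3": pd.to_datetime(["2019-02-28", "2019-04-09", "2019-12-27", "2019-12-20"]),
})

# Prefer date1, then date2, then date3
df["date4"] = np.where(
    df["date1"].notna(),
    df["date1"],
    np.where(df["date2"].notna(), df["date2"], df["date3"]),
)
```

Here row 2 ends up with its date1 value, since date1 is checked first regardless of whether date2 is present.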

Upvotes: 1

jezrael

Reputation: 862911

Back-fill the missing values along each row, then select the first column by position with DataFrame.iloc:

df['date4'] = df[['date1','date2','date3']].bfill(axis=1).iloc[:, 0]
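To see why this works, it may help to look at the intermediate frame. The sketch below rebuilds the question's data (assuming the columns are already datetime64) and keeps the back-filled frame in a variable named filled, which is just an illustrative name.

```python
import pandas as pd

# The question's data, assuming datetime64 columns
df = pd.DataFrame({
    "ID": [0, 1, 2, 3, 4],
    "date1": pd.to_datetime([None, None, None, "2019-12-06", "2019-11-18"]),
    "date2": pd.to_datetime([None, "2019-11-10", None, "2019-12-05", "2019-11-18"]),
    "date3": pd.to_datetime(["2019-02-28", "2019-04-09", "2019-05-21",
                             "2019-12-27", "2019-12-20"]),
})

# bfill(axis=1) pulls the next valid value leftwards within each row,
# so the leftmost column ends up holding the first non-missing date
filled = df[["date1", "date2", "date3"]].bfill(axis=1)

# the first column by position is therefore the wanted date4
df["date4"] = filled.iloc[:, 0]
```

Only the copy in filled is changed; date1, date2 and date3 in df keep their original NaT values.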

Or chain Series.fillna (Series.combine_first can be chained the same way):

df['date4'] = df['date1'].fillna(df['date2']).fillna(df['date3'])
print (df)
   ID      date1      date2      date3      date4
0   0        NaT        NaT 2019-02-28 2019-02-28
1   1        NaT 2019-11-10 2019-04-09 2019-11-10
2   2        NaT        NaT 2019-05-21 2019-05-21
3   3 2019-12-06 2019-12-05 2019-12-27 2019-12-06
4   4 2019-11-18 2019-11-18 2019-12-20 2019-11-18
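For completeness, here is a minimal sketch of the Series.combine_first variant, which is not shown above. It assumes the same data as the question, with the columns already converted to datetime64.

```python
import pandas as pd

# The question's data, assuming datetime64 columns
df = pd.DataFrame({
    "ID": [0, 1, 2, 3, 4],
    "date1": pd.to_datetime([None, None, None, "2019-12-06", "2019-11-18"]),
    "date2": pd.to_datetime([None, "2019-11-10", None, "2019-12-05", "2019-11-18"]),
    "date3": pd.to_datetime(["2019-02-28", "2019-04-09", "2019-05-21",
                             "2019-12-27", "2019-12-20"]),
})

# combine_first keeps the caller's value where present and
# falls back to the argument's value where it is missing
df["date4"] = df["date1"].combine_first(df["date2"]).combine_first(df["date3"])
```

With a shared index, as here, this should give the same date4 column as the chained fillna version.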

Upvotes: 3
