pang2016

Reputation: 539

Fill missing values from the matching row of another DataFrame

I have three DataFrames:

import pandas
df1=pandas.DataFrame([['2017-1-22',25,None],['2017-1-23','',''],['2017-1-24',24,15]],columns=['date','high_tem','low_tem'])
df2=pandas.DataFrame([['2017-1-22',22,18],['2017-1-23',23,''],['2017-1-24',20,10]],columns=['date','high_tem','low_tem'])
df3=pandas.DataFrame([['2017-1-22',25,16],['2017-1-23',24,18],['2017-1-24',22,11]],columns=['date','high_tem','low_tem'])

df1, df2 and df3 look like this:

        date high_tem low_tem
0  2017-1-22       25    None
1  2017-1-23                 
2  2017-1-24       24      15
        date  high_tem low_tem
0  2017-1-22        22      18
1  2017-1-23        23        
2  2017-1-24        20      10
        date  high_tem  low_tem
0  2017-1-22        25       16
1  2017-1-23        24       18
2  2017-1-24        22       11

I am trying to get this result:

         date  high_tem  low_tem
0  2017-1-22        22       18
1  2017-1-23        24       18
2  2017-1-24        24       15

The first row of df1 (index 0) contains None, so I replace it with the first row of df2, which has the same date.

The second row of df1 (index 1) contains empty strings, and the second row of df2 also contains an empty string, so I replace it with the second row of df3.

The third row of df1 contains neither None nor an empty string, so I leave it unchanged.

Thanks

Upvotes: 1

Views: 498

Answers (1)

Nickil Maveli

Reputation: 29729

You could loop through each DataFrame and replace None and empty strings with NaN. Then, wherever a row contains any NaN, set both temperature columns of that row to NaN so the whole row counts as missing.

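import numpy as np
for df in [df1, df2, df3]:
    # Treat None and empty strings as missing values
    df.replace({None: np.nan, "": np.nan}, inplace=True)
    # If any cell in a row is missing, blank out both temperature columns
    df.loc[df.isnull().any(axis=1), ['high_tem', 'low_tem']] = np.nan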

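This modifies df1, df2 and df3 in place.

Use DataFrame.combine_first() in df1 → df2 → df3 order to fill the missing values. Each call fills the NaNs of the calling DataFrame with the values at the same index and column from its argument.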

df1.combine_first(df2).combine_first(df3)
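
Since the question matches rows by date while combine_first() aligns on the index, here is a minimal self-contained sketch of the same idea that uses date as the index and leaves the original frames untouched. The helper name blank_incomplete_rows is hypothetical, not part of pandas, and the sketch assumes every date appears at most once per frame.

```python
from functools import reduce

import numpy as np
import pandas as pd

cols = ['date', 'high_tem', 'low_tem']
df1 = pd.DataFrame([['2017-1-22', 25, None], ['2017-1-23', '', ''], ['2017-1-24', 24, 15]], columns=cols)
df2 = pd.DataFrame([['2017-1-22', 22, 18], ['2017-1-23', 23, ''], ['2017-1-24', 20, 10]], columns=cols)
df3 = pd.DataFrame([['2017-1-22', 25, 16], ['2017-1-23', 24, 18], ['2017-1-24', 22, 11]], columns=cols)


def blank_incomplete_rows(df):
    """Hypothetical helper: return a copy indexed by date in which every row
    containing None/NaN or an empty string has all of its values set to NaN."""
    # astype(object) keeps mixed int/str cells comparable and lets NaN be
    # assigned without changing the column dtype.
    out = df.set_index('date').astype(object)
    incomplete = (out.isna() | out.eq('')).any(axis=1)
    out.loc[incomplete, :] = np.nan
    return out


cleaned = [blank_incomplete_rows(df) for df in (df1, df2, df3)]

# Fill from left to right: df1 first, then df2, then df3.
result = reduce(lambda left, right: left.combine_first(right), cleaned).reset_index()
print(result)
```

Indexing by date means a row is filled from the row with the same date in the next frame, even if the frames are ordered differently. The value columns may come back as object or float dtype, so cast them if you need integers.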


Upvotes: 1
