David Lihtalia
David Lihtalia

Reputation: 21

Replace NaN values in entire dataframe with average of others

I have several DataFrames (DataFrames have the same index and column structure). The problem is that there are NaN values in these dataframes.

I want to replace these NaN values by mean value of other's DataFrames' corresponding values.

For exapmle let's look at 3 dataframes.

DataFrame1 with 1:M2 NaN :

   M1   M2    M3        
0  1    1     2  
1  8   NaN    9 
2  4    2     7       
3  9    6     3  

DataFrame 2 with NaN value at 0:M3:

   M1   M2    M3        
0  2    3    NaN  
1  1    1     6 
2  1    2     9       
3  4    6     2

DataFrame3:

   M1   M2    M3        
0  1    4     2  
1  2    9     1 
2  1    6     5       
3  1    NaN     4

So we replace NaN in first DataFrame by 5 (9+1)/2. Second NaN should be replaced by 2 because (2+2)/2, third by 6 and so on.

Is there any good and elegant way to do it?

Upvotes: 1

Views: 106

Answers (2)

jpp
jpp

Reputation: 164623

This is one way using numpy.nanmean.

avg = np.nanmean([df1.values, df2.values, df3.values], axis=0)

for df in [df1, df2, df3]:
    df[df.isnull()] = avg
    df = df.astype(int)

Note: since np.nan is float, we convert explicitly back to int.

Upvotes: 1

BENY
BENY

Reputation: 323226

We can concat , then using groupby fillna , after split should get what you need

s=pd.concat([df1,df2,df3],keys=[1,2,3])
s=s.groupby(level=1).apply(lambda x : x.fillna(x.mean()))
df1,df2,df3=[x.reset_index(level=0,drop=True) for _,x in s.groupby(level=0)]
df1
Out[1737]: 
   M1   M2   M3
0   1  1.0  2.0
1   8  5.0  9.0
2   4  2.0  7.0
3   9  6.0  3.0

Upvotes: 1

Related Questions