Fateh Muhammad

Reputation: 21

Replacing NaN values in a column with the mean of that column's non-null values, where other columns have certain values

Replace the missing values in the variables "s_months" and "incidents" with the respective means of the other ships that share the same type AND the same operation period. Here "s_months" and "incidents" are two columns containing NaN values that we want to fill.

DataFrame named ship

I have computed the means for the required conditions, but I am unable to use them to fill the NaN values in the ship data frame. Here are the calculated means, stored as a data frame.

DataFrame named shipgroup, holding the means calculated for each combination of "types" and "o_periods"

Upvotes: 1

Views: 114

Answers (1)

Corralien

Reputation: 120559

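Use groupby with transform('mean') to compute the per-group means aligned to the original rows, then combine_first to fill only the NaN cells: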

Minimal Reproducible Example:

>>> df
   types  o_periods  s_months  incidents
0      1          2      63.0        0.0
1      1          2    1095.0        4.0
2      1          2    3353.0       18.0
3      1          2       NaN        NaN
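
# Columns that define a group, and columns whose NaN values should be filled
keys = ['types', 'o_periods']
vals = ['s_months', 'incidents']

# transform('mean') returns one row per original row holding its group's mean
# (NaN is skipped when averaging); combine_first keeps existing values and
# takes the group mean only where the original cell is NaN
df[vals] = df[vals].combine_first(df.groupby(keys)[vals].transform('mean'))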

Output result:

>>> df
   types  o_periods     s_months  incidents
0      1          2    63.000000   0.000000
1      1          2  1095.000000   4.000000
2      1          2  3353.000000  18.000000
3      1          2  1503.666667   7.333333
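
For reference, here is a self-contained sketch of the same idea that can be run as-is. The sample data is made up for illustration (a second group is added so the per-group behaviour is visible), and it uses fillna with the aligned group means, which should give the same result as combine_first here:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two (types, o_periods) groups, each with gaps
df = pd.DataFrame({
    "types":     [1, 1, 1, 1, 2, 2, 2],
    "o_periods": [2, 2, 2, 2, 1, 1, 1],
    "s_months":  [63.0, 1095.0, 3353.0, np.nan, 100.0, np.nan, 300.0],
    "incidents": [0.0, 4.0, 18.0, np.nan, 2.0, 6.0, np.nan],
})

keys = ["types", "o_periods"]
vals = ["s_months", "incidents"]

# One row per original row, holding the mean of that row's group
# (NaN values are ignored when the mean is computed)
group_means = df.groupby(keys)[vals].transform("mean")

# Fill only the missing cells; existing values are left untouched
filled = df.copy()
filled[vals] = df[vals].fillna(group_means)

print(filled)
```

Note that if every value in a group is NaN for a given column, the group mean is itself NaN, so those cells stay unfilled.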

Upvotes: 1
