ArchieTiger
ArchieTiger

Reputation: 2243

Fill nulls in columns with non-null values from other columns

Given a dataframe with similar columns having null values in between. How to dynamically fill nulls in the columns with non-null values from other columns without explicitly stating the names of other column names e.g. select first column category1 and fill the null rows with values from other columns of same rows?

data = {'year': [2010, 2011, 2012, 2013, 2014, 2015, 2016,2017, 2018, 2019],
        'category1': [None, 21, None, 10, None, 30, 31,45, 23, 56],
        'category2': [10, 21, 20, 10, None, 30, None,45, 23, 56],
        'category3': [10, 21, 20, 10, None, 30, 31,45, 23, 56],}


df = pd.DataFrame(data)
df = df.set_index('year')
df

    category1   category2   category3
year            
2010    NaN 10  10
2011    21  21  21
2012    NaN 20  20
2013    10  10  10
2014    NaN NaN NaN
2015    30  30  NaN
2016    31  NaN 31
2017    45  45  45
2018    23  23  23
2019    56  56  56

After filling category1:

category1   category2   category3
year            
2010    10  10  10
2011    21  21  21
2012    20  20  20
2013    10  10  10
2014    NaN NaN NaN
2015    30  30  NaN
2016    31  NaN 31
2017    45  45  45
2018    23  23  23
2019    56  56  56

Upvotes: 3

Views: 1405

Answers (3)

Merlin
Merlin

Reputation: 25629

Try This:

df['category1']= df['category1'].fillna(df.median(axis=1))

Upvotes: 0

jezrael
jezrael

Reputation: 862511

You can use first_valid_index with condition if all values are NaN:

def f(x):
    if x.first_valid_index() is None:
        return None
    else:
        return x[x.first_valid_index()]

df['a'] = df.apply(f, axis=1)

print (df)
      category1  category2  category3     a
year                                       
2010        NaN       10.0       10.0  10.0
2011       21.0       21.0       21.0  21.0
2012        NaN       20.0       20.0  20.0
2013       10.0       10.0       10.0  10.0
2014        NaN        NaN        NaN   NaN
2015       30.0       30.0       30.0  30.0
2016       31.0        NaN       31.0  31.0
2017       45.0       45.0       45.0  45.0
2018       23.0       23.0       23.0  23.0
2019       56.0       56.0       56.0  56.0

Upvotes: 1

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210832

IIUC you can do it this way:

In [369]: df['category1'] = df['category1'].fillna(df['category2'])

In [370]: df
Out[370]:
      category1  category2  category3
year
2010       10.0       10.0       10.0
2011       21.0       21.0       21.0
2012       20.0       20.0       20.0
2013       10.0       10.0       10.0
2014        NaN        NaN        NaN
2015       30.0       30.0       30.0
2016       31.0        NaN       31.0
2017       45.0       45.0       45.0
2018       23.0       23.0       23.0
2019       56.0       56.0       56.0

Upvotes: 1

Related Questions