Jun_1812
Jun_1812

Reputation: 13

Pandas fill in missing data from columns in other rows

I have a df like below:

df = pd.DataFrame({'id': ['a','b','c','d'],
                   'ac' : ['123','223', np.nan, np.nan],
                   'prev' : [np.nan, np.nan, 'a','b']})

Output:

    id  ac   prev
0   a   123  NaN  
1   b   223  NaN  
2   c   NaN  a  
3   d   NaN  b

For ac are null, get prev's value, and then look up at the id column. Fill in the null with value at ac column.

Expected output:

    id  ac   prev
0   a   123  NaN
1   b   223  NaN
2   c   123  a
3   d   223  b

How do I achieve this? Thanks.

Upvotes: 1

Views: 125

Answers (2)

BrunoSE
BrunoSE

Reputation: 94

if I understood correctly you would like to fill the nan values of the 'ac' column with the value given by the 'id' indicated by the corresponding value at the 'prev' column. If so you could try using 'id' as your index, so that you can acces the rows that you are interested in by using .loc

import pandas as pd
import numpy as np

df = pd.DataFrame({'id': ['a', 'b', 'c', 'd'],
                   'ac': ['123', '223', np.nan, np.nan],
                   'prev': [np.nan, np.nan, 'a', 'b']})


df.set_index('id', inplace=True)
to_fill = df.loc[df['ac'].isnull()]
fill_with = df.loc[to_fill['prev'], 'ac'].copy()
fill_with = fill_with.values
df.loc[to_fill.index, 'ac'] = fill_with

Which ends up being:

print(df)
     ac prev
id          
a   123  NaN
b   223  NaN
c   123    a
d   223    b

Note that this code works if your 'id' column has no duplicated values. Hope I helped!

Upvotes: 1

Shubham Sharma
Shubham Sharma

Reputation: 71689

You can use Series.isna to create a boolean mask the use boolean indexing with this mask to map the values of prev column to ac column based on id:

m = df['ac'].isna()
df.loc[m, 'ac'] = df.loc[m, 'prev'].map(df.set_index('id')['ac'])

Result:

  id   ac prev
0  a  123  NaN
1  b  223  NaN
2  c  123    a
3  d  223    b

Upvotes: 2

Related Questions