James Scott
James Scott

Reputation: 181

Pandas/Numpy - Fill Missing Values Per Another Column

Good afternoon all i have a dataframe

dataframe = df

NPI       Fa Name
1100      
1100      Johnson
1100  
5555 
2222      Meghan
2222    
4444
4444

My goal is to fill in the Missing FA name per the corresponding NPI if it is present

    NPI       Fa Name
    1100      Johnson
    1100      Johnson
    1100      Johnson
    5555 
    2222      Meghan
    2222      Meghan
    4444
    4444

i know this might be a simple task, however coding wise i am having trouble figuring it out. Please help!

Upvotes: 1

Views: 43

Answers (1)

jezrael
jezrael

Reputation: 862511

Use GroupBy.apply with forward and back filling missing values if order is important:

#if necessary
df['Fa Name'] = df['Fa Name'].replace('', np.nan)

df['Fa Name'] = df.groupby('NPI')['Fa Name'].apply(lambda x: x.ffill().bfill())
print (df)
    NPI  Fa Name
0  1100  Johnson
1  1100  Johnson
2  1100  Johnson
3  5555      NaN
4  2222   Meghan
5  2222   Meghan
6  4444      NaN
7  4444      NaN

If not, fist sorting per both columns and then use GroupBy.ffill:

df = df.sort_values(['NPI','Fa Name'])

df['Fa Name'] = df.groupby('NPI')['Fa Name'].ffill()
print (df)
    NPI  Fa Name
1  1100  Johnson
0  1100  Johnson
2  1100  Johnson
4  2222   Meghan
5  2222   Meghan
6  4444      NaN
7  4444      NaN
3  5555      NaN

Upvotes: 2

Related Questions