Reputation: 181
Good afternoon all i have a dataframe
dataframe = df
NPI Fa Name
1100
1100 Johnson
1100
5555
2222 Meghan
2222
4444
4444
My goal is to fill in the Missing FA name per the corresponding NPI if it is present
NPI Fa Name
1100 Johnson
1100 Johnson
1100 Johnson
5555
2222 Meghan
2222 Meghan
4444
4444
i know this might be a simple task, however coding wise i am having trouble figuring it out. Please help!
Upvotes: 1
Views: 43
Reputation: 862511
Use GroupBy.apply
with forward and back filling missing values if order is important:
#if necessary
df['Fa Name'] = df['Fa Name'].replace('', np.nan)
df['Fa Name'] = df.groupby('NPI')['Fa Name'].apply(lambda x: x.ffill().bfill())
print (df)
NPI Fa Name
0 1100 Johnson
1 1100 Johnson
2 1100 Johnson
3 5555 NaN
4 2222 Meghan
5 2222 Meghan
6 4444 NaN
7 4444 NaN
If not, fist sorting per both columns and then use GroupBy.ffill
:
df = df.sort_values(['NPI','Fa Name'])
df['Fa Name'] = df.groupby('NPI')['Fa Name'].ffill()
print (df)
NPI Fa Name
1 1100 Johnson
0 1100 Johnson
2 1100 Johnson
4 2222 Meghan
5 2222 Meghan
6 4444 NaN
7 4444 NaN
3 5555 NaN
Upvotes: 2