Reputation: 551
Given the following dataframe...
Key  ID  Type    Group1  Group2  Group3  Group4  Sex     Race
1    A1  Type 1  x       x       x       x       Male    White
2    A1  Type 2  x       x       x       x
3    A2  Type 1                                  Male    Black
4    A2  Type 2
5    A3  Type 1  x       x       x       x       Female  White
6    A3  Type 2  x       x       x       x
7    A3  Type 3  x       x       x       x
8    A3  Type 4  x       x       x       x
How can I populate Sex and Race for all rows based on the ID, so that the result looks like this?
Key  ID  Type    Group1  Group2  Group3  Group4  Sex     Race
1    A1  Type 1  x       x       x       x       Male    White
2    A1  Type 2  x       x       x       x       Male    White
3    A2  Type 1                                  Male    Black
4    A2  Type 2                                  Male    Black
5    A3  Type 1  x       x       x       x       Female  White
6    A3  Type 2  x       x       x       x       Female  White
7    A3  Type 3  x       x       x       x       Female  White
8    A3  Type 4  x       x       x       x       Female  White
I know I can use something like df.loc[df['ID'] == 'A1', 'Sex'].iloc[0] to get the Sex for a particular ID, but I'm not sure how to fill every blank Sex with the value recorded for the same ID.
Upvotes: 1
Views: 212
Reputation: 38425
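You can group the data by ID, then forward-fill and back-fill within each group. Blank strings have to be converted to NaN first, because ffill/bfill only fill missing values: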
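import numpy as np
df1 = df1.replace('', np.nan)  # treat blank strings as missing values
# transform keeps the original index, so the result aligns on assignment; Race is filled the same way
df1[['Sex', 'Race']] = df1.groupby('ID')[['Sex', 'Race']].transform(lambda x: x.ffill().bfill())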
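For anyone who wants to try this end to end, here is a minimal, self-contained sketch. The frame is a hypothetical reconstruction of the one in the question (the Group columns are left out for brevity, and the blanks are assumed to be empty strings rather than NaN), and it uses transform so the filled values keep the original row index.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's frame (Group1-Group4 omitted).
# Assumption: the blank cells are empty strings, not NaN.
df = pd.DataFrame({
    "Key": [1, 2, 3, 4, 5, 6, 7, 8],
    "ID": ["A1", "A1", "A2", "A2", "A3", "A3", "A3", "A3"],
    "Type": ["Type 1", "Type 2", "Type 1", "Type 2",
             "Type 1", "Type 2", "Type 3", "Type 4"],
    "Sex": ["Male", "", "Male", "", "Female", "", "", ""],
    "Race": ["White", "", "Black", "", "White", "", "", ""],
})

cols = ["Sex", "Race"]

# Turn empty strings into NaN so the fill methods recognise them as missing.
df[cols] = df[cols].replace("", np.nan)

# Within each ID, fill forward and then backward.
# transform returns a result indexed like df, so it can be assigned back directly.
df[cols] = df.groupby("ID")[cols].transform(lambda s: s.ffill().bfill())

print(df)
```

This relies on each ID carrying a single Sex and Race value. If the rows of one ID disagree, the fill just copies whichever non-missing value is nearest, so it may be worth checking for conflicts first. When only one non-missing value per ID is expected, df.groupby("ID")[cols].transform("first") should give the same result without the lambda.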
Upvotes: 2