Reputation: 473
I have the following DataFrame:
id1 id2 uid . . .
1 100 na
1 101 1000
1 101 1000
2 102 1001
2 103 na
3 104 1002
3 104 1002
3 104 1002
3 104 na
3 105 na
3 106 na
I want to group by id1 and fill in the missing uid values within each group.
I tried doing a min/max fill using the following snippet:
`df['uid'] = df.groupby(['id1']).transform(lambda x: x.fillna(x.max()))`
Filling the NaNs with either the min or the max should work, because each id1 should have a single distinct uid. However, it's returning different values, so the problem may be in the way I'm filling the values.
desired output:
id1 id2 uid . . .
1 100 1000
1 101 1000
1 101 1000
2 102 1001
2 103 1001
3 104 1002
3 104 1002
3 104 1002
3 104 1002
3 105 1002
3 106 1002
Upvotes: 0
Views: 69
Reputation: 7627
Try using `x.ffill().bfill()`:
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'id1': [1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3], 'id2': [100, 101, 101, 102, 103, 104, 104, 104, 104, 105, 106],
     'uid': [np.nan, 1000, 1000, 1001, np.nan, 1002, 1002, 1002, np.nan, np.nan, np.nan]}, dtype='object')
# Select the uid column, then forward-fill and back-fill within each id1 group
df['uid'] = df.groupby(['id1']).uid.transform(lambda x: x.ffill().bfill()).astype(int)
print(df)
Prints:
id1 id2 uid
0 1 100 1000
1 1 101 1000
2 1 101 1000
3 2 102 1001
4 2 103 1001
5 3 104 1002
6 3 104 1002
7 3 104 1002
8 3 104 1002
9 3 105 1002
10 3 106 1002
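Note that the snippet in the question calls `transform` on the whole grouped frame, so it returns every non-key column (id2, uid, ...) rather than just uid. Selecting `uid` before transforming avoids that. As a rough alternative sketch, assuming each id1 really has at most one distinct uid and that uid is stored as a numeric column with NaN for the missing values (the sample data below is rebuilt from the question, not the real frame), the group max can be broadcast and used to fill the gaps:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; uid is float here so NaN marks the gaps.
df = pd.DataFrame({
    "id1": [1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3],
    "id2": [100, 101, 101, 102, 103, 104, 104, 104, 104, 105, 106],
    "uid": [np.nan, 1000, 1000, 1001, np.nan, 1002, 1002, 1002,
            np.nan, np.nan, np.nan],
})

# Select the uid column first so transform only processes uid.
# "max" skips NaN and is broadcast back to every row of the group.
group_max = df.groupby("id1")["uid"].transform("max")

# Fill only the missing entries, then cast back to int.
df["uid"] = df["uid"].fillna(group_max).astype(int)

result = df["uid"].tolist()
print(df)
```

If some id1 group has no uid at all, the max for that group stays NaN and the final `astype(int)` will fail, so that case would need separate handling.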
Upvotes: 1