S44

Reputation: 473

Fill nulls based on a group by value

I have the following DataFrame:

id1 id2  uid . . . 
 1  100   na
 1  101 1000
 1  101 1000
 2  102 1001
 2  103   na
 3  104 1002
 3  104 1002
 3  104 1002
 3  104   na
 3  105   na
 3  106   na

I want to group by id1 and fill the missing uid values within each group.

I tried doing a min/max fill using the following snippet:

 `df['uid'] = df.groupby(['id1']).transform(lambda x: x.fillna(x.max()))`

Filling with either the min or the max should work, because each id1 should have exactly one distinct uid. However, it's returning different values, so the problem may be in the way I'm filling the values.

Desired output:

id1 id2  uid . . . 
 1  100 1000
 1  101 1000
 1  101 1000
 2  102 1001
 2  103 1001
 3  104 1002
 3  104 1002
 3  104 1002
 3  104 1002
 3  105 1002
 3  106 1002

Upvotes: 0

Views: 69

Answers (1)

Алексей Р

Reputation: 7627

Try selecting the uid column and using `x.ffill().bfill()` within each group:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'id1': [1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3], 'id2': [100, 101, 101, 102, 103, 104, 104, 104, 104, 105, 106],
     'uid': [np.nan, 1000, 1000, 1001, np.nan, 1002, 1002, 1002, np.nan, np.nan, np.nan]}, dtype='object')
# Select uid, then fill forward and backward within each id1 group
df['uid'] = df.groupby(['id1']).uid.transform(lambda x: x.ffill().bfill()).astype(int)
print(df)

Prints:

   id1  id2   uid
0    1  100  1000
1    1  101  1000
2    1  101  1000
3    2  102  1001
4    2  103  1001
5    3  104  1002
6    3  104  1002
7    3  104  1002
8    3  104  1002
9    3  105  1002
10   3  106  1002
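
The min/max idea from the question can probably be made to work too, as long as the uid column is selected before calling transform. Without that selection, the transform runs over every non-key column and returns a DataFrame rather than a single Series, which is a likely reason the original snippet misbehaves. Here is a minimal sketch, assuming uid is stored as the default float column with NaN for missing values (not the object dtype used above) and that every id1 group has at least one known uid:

```python
import numpy as np
import pandas as pd

# Same sample data as above, but left as the default float dtype (an assumption)
df = pd.DataFrame({
    "id1": [1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3],
    "id2": [100, 101, 101, 102, 103, 104, 104, 104, 104, 105, 106],
    "uid": [np.nan, 1000, 1000, 1001, np.nan, 1002, 1002, 1002, np.nan, np.nan, np.nan],
})

# Select uid first so transform returns a Series aligned with df's index
group_max = df.groupby("id1")["uid"].transform("max")

# Fill only the missing entries with their group's max, then cast back to int
df["uid"] = df["uid"].fillna(group_max).astype(int)
print(df)
```

If some id1 group has no known uid at all, its rows stay NaN and the final astype(int) will raise, so that cast may need to be dropped or replaced with a nullable integer dtype in that case.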

Upvotes: 1
