Reputation: 1303
I think this is best shown with an example. What I'm trying to do is find the non-null number from a group and propagate it to the rest of the group.
In [52]: df = pd.DataFrame.from_dict({1:{'i_id': 2, 'i_num':1}, 2: {'i_id': 2, 'i_num': np.nan}, 3: {'i_id': 2, 'i_num': np.nan}, 4: {'i_id': 3, 'i_num': np.nan}, 5: {'i_id': 3, 'i_num': 5}}, orient='index')
In [53]: df
Out[53]:
   i_num  i_id
1      1     2
2    NaN     2
3    NaN     2
4    NaN     3
5      5     3
The DataFrame looks like this. I want every row with i_id == 2 to get i_num == 1, and every row with i_id == 3 to get i_num == 5, so that each NaN takes the non-null value from its group.
So the end result would be this:
   i_num  i_id
1      1     2
2      1     2
3      1     2
4      5     3
5      5     3
Upvotes: 2
Views: 2920
Reputation: 176770
The groupby method first finds the first non-null value in each group. You can use it to fill in the other values in each group like this:
df['i_num'] = df.groupby('i_id')['i_num'].transform('first')
This produces the column as required:
   i_num  i_id
1      1     2
2      1     2
3      1     2
4      5     3
5      5     3
Bear in mind that this replaces every value in the group with the first non-null value, not just the NaN values (which seems to be what you want here).
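To make that caveat concrete, here is a small sketch on a made-up frame (not the one from the question) in which group 2 holds two different non-null values, 1 and 7. The data and the name filled are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group 2 contains two different non-null values (1 and 7)
df = pd.DataFrame({
    "i_id": [2, 2, 2, 3, 3],
    "i_num": [1, np.nan, 7, np.nan, 5],
})

# Broadcast each group's first non-null value to every row of that group
filled = df.groupby("i_id")["i_num"].transform("first")

print(filled.tolist())  # [1.0, 1.0, 1.0, 5.0, 5.0]
```

The 7 in the third row is overwritten with 1, which is fine only if each group is expected to carry a single distinct value.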
Alternatively, to preserve any other non-null values in the group, you can use fillna as follows:
# make a column of first values for each group
x = df['i_id'].map(df.groupby('i_id')['i_num'].first())
# fill only NaN values using new column x
df['i_num'] = df['i_num'].fillna(x)
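As a possible variant, the same fill can be sketched in one step by passing the result of transform('first') directly to fillna, since transform already returns a Series aligned to the original index. The frame below is made up for illustration (group 2 has a second non-null value, 7), and the names via_map and via_transform are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group 2 contains an extra non-null value (7)
df = pd.DataFrame({
    "i_id": [2, 2, 2, 3, 3],
    "i_num": [1, np.nan, 7, np.nan, 5],
})

# Two-step version: map each i_id to its group's first non-null value
x = df["i_id"].map(df.groupby("i_id")["i_num"].first())
via_map = df["i_num"].fillna(x)

# One-step version: transform returns a Series aligned with df's index
via_transform = df["i_num"].fillna(
    df.groupby("i_id")["i_num"].transform("first")
)

print(via_transform.tolist())  # [1.0, 1.0, 7.0, 5.0, 5.0]
```

Both versions fill only the NaN rows, so the existing 7 survives. Which one reads better is mostly a matter of taste.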
Upvotes: 5