Reputation: 943
I have a very large dataframe that looks like the following:
    fraction  id
0   0.729797   0
1   0.141084   1
2   0.226900   2
3   0.960937   3
4   0.452835   4
5        NaN   1
6   0.352142   2
7   0.104814   3
8   0.345633   4
9   0.498004   1
10  0.131665   2
11       NaN   3
12  0.886092   4
13  0.839767   1
14  0.257997   2
15  0.526350   3
Currently I am just filling NaN data with 0s using the following line of code:
df.fillna(0,inplace=True)
Is there a way to fill each NaN with the prior "fraction" value that has the same "id"?
For example, the row at index #5 has a NaN value for "fraction", and has an "id" value of 1. The prior "fraction" value for id #1 was 0.141084.
Is there a way to replace the NaN with this value, and to apply this operation to the entire dataframe?
Thank You
Upvotes: 1
Views: 107
Reputation: 33843
Perform a groupby on 'id' and then forward fill with ffill:
df['fraction'] = df.groupby('id')['fraction'].ffill()
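For anyone who wants to try this, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the line above. The values are copied from the question; nothing else is assumed beyond pandas and NumPy being installed.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (NaN at index 5 and index 11).
df = pd.DataFrame({
    "fraction": [0.729797, 0.141084, 0.226900, 0.960937, 0.452835, np.nan,
                 0.352142, 0.104814, 0.345633, 0.498004, 0.131665, np.nan,
                 0.886092, 0.839767, 0.257997, 0.526350],
    "id": [0, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3],
})

# Forward fill within each id group. The result keeps the original index,
# so it can be assigned straight back to the column.
df["fraction"] = df.groupby("id")["fraction"].ffill()

# Show the two rows that were NaN before the fill.
print(df.loc[[5, 11]])
```

The fill follows row order within each group, so the data should already be sorted the way you want "prior" to be interpreted.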
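Note that you can apply the same process to all columns in your DataFrame at once by omitting the ['fraction'] selection. In the case of your example data the filled values are the same, although depending on your pandas version the result may not include the 'id' grouping column: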
df = df.groupby('id').ffill()
The resulting output:
    fraction  id
0   0.729797   0
1   0.141084   1
2   0.226900   2
3   0.960937   3
4   0.452835   4
5   0.141084   1
6   0.352142   2
7   0.104814   3
8   0.345633   4
9   0.498004   1
10  0.131665   2
11  0.104814   3
12  0.886092   4
13  0.839767   1
14  0.257997   2
15  0.526350   3
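One caveat worth sketching: a forward fill has nothing to carry forward when the first row of a group is NaN, so that value stays NaN. The tiny frame below is hypothetical (it is not the question's data), and falling back to 0 simply mirrors the fillna(0) the question was already using; pick whatever default suits your data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: id 1 has only NaN values, so there is no prior value to use.
df = pd.DataFrame({
    "fraction": [np.nan, 0.5, np.nan, np.nan],
    "id": [1, 2, 1, 2],
})

# Per-group forward fill: index 3 picks up 0.5 from index 1 (same id),
# while both id 1 rows remain NaN.
filled = df.groupby("id")["fraction"].ffill()

# Fall back to 0 only for values that could not be forward filled.
df["fraction"] = filled.fillna(0)
```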
Upvotes: 2