Reputation: 615
I have a dataframe containing one missing value.
  exam_id     exam
0       1   french
1       2  italian
2       3  chinese
3       4  english
4       3  chinese
5       5  russian
6       1   french
7     NaN  russian
8       1   french
9       2  italian
I want to fill in the missing exam_id for the russian exam based on existing information. Since the exam_id for russian is 5, I would like the same value assigned to the missing one.
Upvotes: 1
Views: 207
Reputation: 2544
Beware that this approach does more than fill missing values: it also overwrites miscodings (e.g., "french" being coded as 3). Building a dictionary that maps each language to its value and then applying it via map creates a new exam_id column. Note, however, that any language that doesn't appear in the dictionary (e.g., "French") will produce a missing value.
language_test_map = {'french': 1,
                     'italian': 2,
                     'chinese': 3,
                     'english': 4,
                     'russian': 5}
df['exam_id'] = df['exam'].map(language_test_map)
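If you would rather not type the dictionary by hand, a rough sketch along the same lines is to derive the lookup from the rows where exam_id is already known. This is a variant, not the code above: it assumes each exam has one consistent id (the first observed one is used), it only fills the gaps instead of overwriting miscodings, and the names known and exam_to_id are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data reproducing the frame from the question.
df = pd.DataFrame({
    "exam_id": [1, 2, 3, 4, 3, 5, 1, np.nan, 1, 2],
    "exam": ["french", "italian", "chinese", "english", "chinese",
             "russian", "french", "russian", "french", "italian"],
})

# Build the exam -> id lookup from rows where exam_id is known
# (assumption: the first id seen for each exam is the right one).
known = df.dropna(subset=["exam_id"]).drop_duplicates("exam")
exam_to_id = known.set_index("exam")["exam_id"]

# Fill only the missing ids; existing values are left untouched.
df["exam_id"] = df["exam_id"].fillna(df["exam"].map(exam_to_id))
```

An exam that never appears with a known id would still come out as NaN here, same as a language missing from the hand-written dictionary.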
Upvotes: 1
Reputation: 214927
You can group your data frame by exam, then do a ffill + bfill in case there are missing values before and after the existing value:
df["exam_id"] = df.groupby("exam")["exam_id"].transform(lambda s: s.ffill().bfill())
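For reference, here is a self-contained sketch of the grouped fill on the data from the question. It runs both ffill and bfill inside transform so that each fill stays within a single exam group; the sample frame is rebuilt by hand and the cast back to int is just an assumption that no gaps remain afterwards.

```python
import numpy as np
import pandas as pd

# Sample data reproducing the frame from the question.
df = pd.DataFrame({
    "exam_id": [1, 2, 3, 4, 3, 5, 1, np.nan, 1, 2],
    "exam": ["french", "italian", "chinese", "english", "chinese",
             "russian", "french", "russian", "french", "italian"],
})

# Forward-fill then back-fill exam_id within each exam group.
filled = df.groupby("exam")["exam_id"].transform(lambda s: s.ffill().bfill())

# Safe to cast only because every group had at least one known id.
df["exam_id"] = filled.astype(int)
```

If some exam has no known id at all, its rows stay NaN and the astype(int) line would raise, so drop that cast in that case.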
Upvotes: 3