Reputation: 51
I am cleaning data and am struggling to find a way to iterate over a pandas dataframe so that I can update the rows that have the string "Not Specified" in the qualification column.
I want to replace that string with "Bachelor's Degree, Post Graduate Diploma, Professional Degree" or "Bachelor's Degree, Post Graduate Diploma, Professional Degree, Master's Degree", depending on whether the Degree and Masters columns contain 1.
Example: if both the Degree column and the Masters column contain 1, replace it with "Bachelor's Degree, Post Graduate Diploma, Professional Degree, Master's Degree"; otherwise, if the Degree column contains 1, replace it with "Bachelor's Degree, Post Graduate Diploma, Professional Degree".
How can I achieve this? Attached below is the outcome I hope to achieve.
Upvotes: 0
Views: 709
Reputation: 58
You don't need to iterate over the rows. You can define a boolean criterion for each specific condition and combine them into a more complex condition, then use .loc to update the dataframe.
I've provided a sample below. It assumes the data is held in a dataframe called df. When the column 'qual' contains 'not specified' and the column 'masters' contains 1, it updates the column 'qual' with 'MyDegree'. You can replace this with whatever value you need. Note that the comparison is case-sensitive, so the string has to match your data exactly. Create as many conditions as required and combine them with & (and) and | (or) to form complex conditions; Python's plain and/or keywords do not work element-wise on pandas Series.
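# Boolean masks, one per condition
criteria1 = df['qual'] == 'not specified'
criteria2 = df['masters'] == 1
# Combine element-wise with & (use | for "or")
criteria_all = criteria1 & criteria2
# Update only the rows where the combined mask is True
df.loc[criteria_all, 'qual'] = 'MyDegree'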
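Applied to the two cases in the question, the same approach might look roughly like the sketch below. The column names Qualification, Degree and Masters and the sample rows are assumptions made for illustration, since the real dataframe is not shown; adjust them to match your data.

```python
import pandas as pd

# Hypothetical sample data; the column names are guesses based on the question.
df = pd.DataFrame({
    "Qualification": ["Not Specified", "Not Specified", "Diploma", "Not Specified"],
    "Degree": [1, 1, 1, 0],
    "Masters": [1, 0, 1, 0],
})

BACHELOR = "Bachelor's Degree, Post Graduate Diploma, Professional Degree"
MASTER = BACHELOR + ", Master's Degree"

# One boolean mask per condition.
not_specified = df["Qualification"] == "Not Specified"
has_degree = df["Degree"] == 1
has_masters = df["Masters"] == 1

# Degree only: fill in the bachelor-level string.
df.loc[not_specified & has_degree & ~has_masters, "Qualification"] = BACHELOR
# Degree and Masters: fill in the string that also lists the master's degree.
df.loc[not_specified & has_degree & has_masters, "Qualification"] = MASTER

print(df)
```

Rows that already have a qualification, or that have neither flag set, are left unchanged.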
Upvotes: 1