Reputation: 39
I am trying to convert data from a CSV file to a numeric type so that I can find the greatest and least value in each category. This is a short sample of the data I am working with:
Course | Grades_Recieved |
---|---|
098321 | A,B,D |
324323 | C,B,D,F |
213323 | A,B,D,F |
I am trying to convert the Grades_Recieved column to a numeric type so that I can create new columns that list the highest grade received and the lowest grade received in each course.
This is my code so far:
import pandas as pd
df = pd.read_csv('grades.csv')
df.astype({'Grades_Recieved': 'int64'}).dtypes
I have tried the code above and I have also tried using to_numeric, but I keep getting the error "invalid literal for int() with base 10: 'A,B,D'" and I am not sure how to fix it. I have also tried removing the commas, but the error remains the same.
Upvotes: 0
Views: 74
Reputation: 1441
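You can't convert a string of comma-separated letter grades such as 'A,B,D' into int/float, because the letters are not numbers. You can still get the desired result by splitting each string into a list and comparing the letters as strings: since 'A' < 'B' < ... < 'F' alphabetically, min() returns the highest grade and max() returns the lowest: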
df['Highest_Grade'] = df['Grades_Recieved'].str.split(',').apply(lambda x: min(x))
df['Lowest_Grade'] = df['Grades_Recieved'].str.split(',').apply(lambda x: max(x))
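If you do need actual numbers, as the question asks, one option is to map each letter to a value first. The following is only a sketch under assumptions: the `GRADE_POINTS` scale (A=4 down to F=0) is hypothetical, the inline `csv_text` stands in for grades.csv, and the `Highest_Points` / `Lowest_Points` column names are made up for illustration.

```python
import io
import pandas as pd

# Inline stand-in for grades.csv (assumed layout). dtype=str keeps the
# leading zero in course codes such as 098321.
csv_text = '''Course,Grades_Recieved
098321,"A,B,D"
324323,"C,B,D,F"
213323,"A,B,D,F"
'''
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Hypothetical grade-point scale; adjust to whatever scale applies.
GRADE_POINTS = {'A': 4, 'B': 3, 'C': 2, 'D': 1, 'F': 0}

# Split "A,B,D" into ['A', 'B', 'D'], then map each letter to its points.
grades = df['Grades_Recieved'].str.split(',')
points = grades.apply(lambda gs: [GRADE_POINTS[g.strip()] for g in gs])

# With a numeric scale, max is the best grade and min is the worst.
df['Highest_Points'] = points.apply(max)
df['Lowest_Points'] = points.apply(min)

print(df)
```

An explicit mapping also avoids relying on alphabetical order, which would stop working if the data contained grades like 'A+' or 'B-'.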
Upvotes: 2