mdawg

Reputation: 49

Categorical Grade Data (A+, B-, etc) into numerical values

I have a DataFrame where one column is grade data. It spans from A+, A, A-, etc. all the way down to F. The values are stored as categories. I want to convert them efficiently into numbers, such that the best grade gets the highest number. Since there are 13 grades, A+ should get the value 13 and F should get the value 1.

For instance (but with categories instead of strings):

import pandas as pd

grades = ['A+', 'C-', 'F', 'B', 'D-']
students = ['billy', 'bob', 'joe', 'tom', 'jamal']

df = pd.DataFrame(columns=['grades'], data=grades, index=students)

I would like to turn the grades column of this DataFrame into numeric values ranging from 1 to 13, corresponding to the categories F through A+ respectively. I'm not really sure how to go about this.

EDIT: Also, this is a MultiIndex DataFrame. The first index level is the date, the second is the name, and then comes the value.

Upvotes: 1

Views: 856

Answers (2)

cs95

Reputation: 402573

Most of your problems go away once you declare these values as Categorical items.

import pandas as pd

s = pd.Series(['C+', 'A+', 'D+', 'D', 'D', 'A+', 'C', 'D+', 'C+', 'A+', 'A-', 'F',
       'B', 'D+', 'D-', 'A+', 'A+', 'D-', 'A', 'B-'])

cats = 'A+ A A- B+ B B- C+ C C- D+ D D- F'.split()[::-1]
s = pd.Categorical(s, categories=cats, ordered=True)

s.codes + 1
array([ 7, 13,  4,  3,  3, 13,  6,  4,  7, 13, 11,  1,  9,  4,  2, 13, 13,
        2, 12,  8], dtype=int8)
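
The same idea can be applied directly to a DataFrame column through the `.cat` accessor. This is only a sketch: the dates, the index level names, and the `grade_value` column name are made up to mirror the MultiIndex layout described in the question.

```python
import pandas as pd

# Hypothetical data: a (date, name) MultiIndex like the one the question describes.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(['2018-01-01', '2018-01-02']), ['billy', 'bob']],
    names=['date', 'name'],
)
df = pd.DataFrame({'grades': ['A+', 'C-', 'F', 'B']}, index=index)

# Ordered categories from worst (F) to best (A+), as in the snippet above.
cats = 'A+ A A- B+ B B- C+ C C- D+ D D- F'.split()[::-1]
dtype = pd.CategoricalDtype(categories=cats, ordered=True)

# Cast the column, then shift the 0-based codes so that F == 1 and A+ == 13.
df['grades'] = df['grades'].astype(dtype)
df['grade_value'] = df['grades'].cat.codes + 1
```

The MultiIndex is left untouched, since `.cat.codes` returns a Series aligned on the same index. Note that any value not listed in `cats` becomes NaN with code -1, so it would show up as 0 in `grade_value`.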

Upvotes: 2

abarnert

Reputation: 365767

What you probably want to do is build a dict, mapping each letter grade to a value.

You can do this explicitly:

gradevalues = {'A+': 13, 'A': 12, …, 'F': 1}

But it's probably better to do it programmatically, because less repetition means fewer places to make a typo:

grades = 'A+ A A- B+ B B- C+ C C- D+ D D- F'.split()
grades.reverse()
gradevalues = {grade: i for i, grade in enumerate(grades, 1)}
assert gradevalues['F'] == 1
assert gradevalues['A+'] == 13
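
To apply the mapping to the DataFrame, one option is `Series.map`. A minimal sketch, assuming the column holds plain strings as in the question's example; the `grade_value` column name is made up for illustration:

```python
import pandas as pd

# Build the grade -> value dict programmatically, as above.
grades = 'A+ A A- B+ B B- C+ C C- D+ D D- F'.split()
grades.reverse()
gradevalues = {grade: i for i, grade in enumerate(grades, 1)}

# Example frame from the question.
df = pd.DataFrame(
    {'grades': ['A+', 'C-', 'F', 'B', 'D-']},
    index=['billy', 'bob', 'joe', 'tom', 'jamal'],
)

# Look up each grade in the dict; grades missing from the dict become NaN.
df['grade_value'] = df['grades'].map(gradevalues)
```

If the column is already categorical, the mapped result may itself come back as a categorical rather than an integer column, so a cast such as `.astype(int)` may be needed afterwards.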

Upvotes: 3
