Michael Y

Reputation: 25

Pandas Ordered Categorical not working as intended

import pandas as pd

df = pd.DataFrame(['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D'],
                  index=['excellent', 'excellent', 'excellent', 'good', 'good', 'good', 'ok', 'ok', 'ok', 'poor', 'poor'])
df.rename(columns={0: 'Grades'}, inplace=True)

cat_dtype = pd.CategoricalDtype(categories=['D', 'D+', 'C-', 'C', 'C+', 'B-', 'B', 'B+', 'A-', 'A', 'A+'], ordered=True)

print(df['Grades'].astype(cat_dtype))

print(df['Grades'] > 'C')

When I check the cat_dtype object, the order is clearly correct, with 'A+' being the largest and 'D' the smallest. But when I compare the values in the 'Grades' column with 'C', the result doesn't follow that order. What gives?

Upvotes: 0

Views: 283

Answers (1)

footfalcon

Reputation: 621

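astype returns a new Series rather than converting the column in place, so df['Grades'] still holds plain strings and > 'C' compares them alphabetically. You just need to assign the result of astype back to df['Grades']: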

df['Grades'] = df['Grades'].astype(cat_dtype)

Filtering on your criterion then gives:

df[df['Grades'] > 'C']


            Grades
excellent   A+
excellent   A
excellent   A-
good        B+
good        B
good        B-
ok          C+
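
To see the difference side by side, here is a minimal sketch that runs the same comparison on the plain string column and on the converted one. It uses a bare Series rather than the DataFrame from the question, and the variable names (grades, as_strings, above_c) are only illustrative.

```python
import pandas as pd

grades = pd.Series(['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D'],
                   name='Grades')

cat_dtype = pd.CategoricalDtype(
    categories=['D', 'D+', 'C-', 'C', 'C+', 'B-', 'B', 'B+', 'A-', 'A', 'A+'],
    ordered=True,
)

# Without conversion the values are plain strings, so '>' compares them
# alphabetically and ignores the grade order entirely.
as_strings = grades[grades > 'C'].tolist()
print(as_strings)  # ['C+', 'C-', 'D+', 'D']

# astype returns a new Series; 'grades' itself is left unchanged.
# Keeping the result is what makes the ordered comparison take effect.
converted = grades.astype(cat_dtype)
above_c = converted[converted > 'C'].tolist()
print(above_c)  # ['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+']
```

The first result is what the question ran into: every string that sorts after 'C' alphabetically, including 'C-' and both D grades.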

Upvotes: 3
