Reputation: 25
import pandas as pd

df = pd.DataFrame(['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D'],
                  index=['excellent', 'excellent', 'excellent', 'good', 'good', 'good', 'ok', 'ok', 'ok', 'poor', 'poor'])
df.rename(columns={0: 'Grades'}, inplace=True)
cat_dtype = pd.CategoricalDtype(categories=['D', 'D+', 'C-', 'C', 'C+', 'B-', 'B', 'B+', 'A-', 'A', 'A+'], ordered=True)
print(df['Grades'].astype(cat_dtype))
print(df['Grades'] > 'C')
When I check the cat_dtype object, the order is clearly correct, with 'A+' being the largest and 'D' the smallest. But when I compare the values in the 'Grades' column with 'C', the result doesn't match that order. What gives?
Upvotes: 0
Views: 283
Reputation: 621
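Looks like you just need to assign the result of astype back to df['Grades']. astype returns a new Series rather than modifying the column in place, so without the assignment the column still holds plain strings and > compares them alphabetically.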
df['Grades'] = df['Grades'].astype(cat_dtype)
Filtering on your criterion then gives:
df[df['Grades'] > 'C']
          Grades
excellent     A+
excellent      A
excellent     A-
good          B+
good           B
good          B-
ok            C+
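To see the difference side by side, here is a minimal sketch that runs the same comparison on a plain string Series and on one converted to the ordered categorical dtype. The names s, as_strings and as_categories are only illustrative and not part of the original code.

```python
import pandas as pd

grades = ['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D']
s = pd.Series(grades)

# Plain strings: '>' is an alphabetical (lexicographic) comparison,
# so anything sorting after 'C' as text is selected.
as_strings = s[s > 'C'].tolist()

cat_dtype = pd.CategoricalDtype(
    categories=['D', 'D+', 'C-', 'C', 'C+', 'B-', 'B', 'B+', 'A-', 'A', 'A+'],
    ordered=True,
)

# astype returns a new Series, so the result has to be kept.
s_cat = s.astype(cat_dtype)

# Ordered categorical: '>' follows the category order defined above.
as_categories = s_cat[s_cat > 'C'].tolist()

print(as_strings)
print(as_categories)
```

The first list contains only grades that sort after 'C' as text, while the second follows the grade order from cat_dtype and matches the filtered output above.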
Upvotes: 3