Reputation: 497
I have the following Pandas DataFrame, which I use to compare the performance of different classifiers over multiple iterations. After each iteration, I save the ranking of each classifier to a DataFrame that holds the cumulative sum of rankings over all iterations (the index of the DataFrame gives the ranking from 0 to 3, i.e., 4 classifiers in total, with 0 being the best).
The DataFrame looks as follows:
import pandas as pd

rankings = {'Classifier1': ['1', '2', '1', '0'],
            'Classifier2': ['2', '1', '1', '0'],
            'Classifier3': ['0', '1', '1', '2'],
            'Classifier4': ['1', '0', '1', '2']}
df = pd.DataFrame(data=rankings)
which formats as:
  Classifier1 Classifier2 Classifier3 Classifier4
0           1           2           0           1
1           2           1           1           0
2           1           1           1           1
3           0           0           2           2
I would like to create the following boxplot (as in this paper) of the different classifiers, using Seaborn or an alternative method:
Upvotes: 1
Views: 531
Reputation: 10359
First, we need to convert your data into numeric values rather than strings. Then we melt the DataFrame to get it into long format, and finally we draw a boxplot with a swarmplot on top:
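import matplotlib.pyplot as plt
import seaborn as sns
df = df.apply(pd.to_numeric).melt(var_name='Classifier', value_name='AUC Rank')  # strings -> numbers, wide -> long
ax = sns.boxplot(data=df, x='Classifier', y='AUC Rank')
ax = sns.swarmplot(data=df, x='Classifier', y='AUC Rank', color='black')  # individual points on top of the boxes
plt.show()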
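Note that the snippet above plots the cell values exactly as they are stored. If, as the question describes, each cell is a count of how often a classifier finished at the rank given by the index, the per-iteration ranks may need to be rebuilt before plotting. The following is only a minimal sketch under that assumption; the names `counts`, `long_counts`, `ranks` and the helper column `n` are made up for illustration.

```python
import pandas as pd

rankings = {'Classifier1': ['1', '2', '1', '0'],
            'Classifier2': ['2', '1', '1', '0'],
            'Classifier3': ['0', '1', '1', '2'],
            'Classifier4': ['1', '0', '1', '2']}
counts = pd.DataFrame(rankings).apply(pd.to_numeric)

# Assumption: counts.loc[r, c] is the number of iterations in which
# classifier c finished at rank r (the index is the rank).
long_counts = (counts.rename_axis('AUC Rank')
                     .reset_index()
                     .melt(id_vars='AUC Rank', var_name='Classifier', value_name='n'))

# Repeat each (classifier, rank) row n times so that every row stands for
# one iteration; rows with a count of 0 drop out.
repeated = long_counts.index.repeat(long_counts['n'])
ranks = long_counts.loc[repeated, ['Classifier', 'AUC Rank']].reset_index(drop=True)

print(ranks.groupby('Classifier')['AUC Rank'].apply(list))
```

If this reading of the data is right, `ranks` can be passed to `sns.boxplot` and `sns.swarmplot` in place of `df`, with the same `x='Classifier'` and `y='AUC Rank'` arguments.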
Upvotes: 1