Reputation: 415
I have lists containing the mean absolute errors and classification accuracies of different models.
print(MAE_linear)
print(MAE_1vR)
print(MAE_multi)
print(MAE_ordinal)
print(acc_linear)
print(acc_1vR)
print(acc_multi)
print(acc_ordinal)
I would like to create comparison boxplots from these lists of floats:
[0.4937882 0.50900745 0.49471506 0.5159206 0.6519391 ]
[1.0031989 0.45563284 0.4681502 0.67197496 0.68400556]
[1.1409596 1.106676 0.4766342 0.86363006 0.6922114 ]
[0.28936023 0.4942281 0.25841448 0.4128651 0.76453406]
[0.58038944 0.5312239 0.50368565 0.581363 0.35688457]
[0.2623783 0.5684979 0.5581363 0.34116828 0.3705146 ]
[0.19394994 0.24297635 0.55445063 0.24534075 0.3664812 ]
[0.71265644 0.5223922 0.74165505 0.59805286 0.35688457]
%matplotlib inline
import seaborn as sns
import matplotlib.pyplot as plt
fig, axs = plt.subplots(ncols=2, figsize=[10, 6])
fig.suptitle('Ordinal regression predicts NBR\n\n',
color='dimgrey',
size=22)
axs[0] = nuss_style_fun(ax=axs[0], title='\n\nMagnitude prediction')
sns.boxplot(y=['Linear Regression', 'Logistic Regression\n(one versus rest)', 'Logistic regression\n(multinomial)', 'Ordered logistic regression'],
x=[MAE_linear, MAE_1vR, MAE_multi, MAE_ordinal], ax=axs[0])
axs[0].set(xlabel='Mean absolute error (lower is better)',
ylabel=' ')
axs[1] = nuss_style_fun(ax=axs[1], title='\n\nCategory prediction')
sns.boxplot(y=['Linear Regression', 'Logistic Regression\n(one versus rest)', 'Logistic regression\n(multinomial)', 'Ordered logistic regression'],
x=[acc_linear, acc_1vR, acc_multi, acc_ordinal], ax=axs[1])
axs[1].set(xlabel='Classification accuracy (higher is better)',
ylabel=' ',
xlim=[0, 1])
axs[1].get_yaxis().set_ticks([])
#author line
fig.text(0.99, 0.01, '@rikunert', color='grey', style='italic',
horizontalalignment='right')
fig.tight_layout()
But it gives the following error:
TypeError: Neither the `x` nor `y` variable appears to be numeric.
I would appreciate it if someone could help me understand and solve the problem.
Upvotes: 0
Views: 431
Reputation: 80299
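The error most likely arises because x and y are expected to be flat vectors with one entry per observation. Here y holds 4 labels and x holds 4 lists of 5 values each, so seaborn cannot identify a numeric variable. You can create a temporary dataframe to obtain a wide-form input, with one column per model. For such wide-form data, orient='h'
creates horizontal boxplots.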
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
MAE_linear = [0.4937882, 0.50900745, 0.49471506, 0.5159206, 0.6519391]
MAE_1vR = [1.0031989, 0.45563284, 0.4681502, 0.67197496, 0.68400556]
MAE_multi = [1.1409596, 1.106676, 0.4766342, 0.86363006, 0.6922114]
MAE_ordinal = [0.28936023, 0.4942281, 0.25841448, 0.4128651, 0.76453406]
acc_linear = [0.58038944, 0.5312239, 0.50368565, 0.581363, 0.35688457]
acc_1vR = [0.2623783, 0.5684979, 0.5581363, 0.34116828, 0.3705146]
acc_multi = [0.19394994, 0.24297635, 0.55445063, 0.24534075, 0.3664812]
acc_ordinal = [0.71265644, 0.5223922, 0.74165505, 0.59805286, 0.35688457]
fig, axs = plt.subplots(ncols=2, figsize=[10, 6], sharey=True)
fig.suptitle('Ordinal regression predicts NBR', color='dimgrey')
sns.boxplot(data=pd.DataFrame({'Linear Regression': MAE_linear,
'Logistic Regression\n(one versus rest)': MAE_1vR,
'Logistic regression\n(multinomial)': MAE_multi,
'Ordered logistic regression': MAE_ordinal}),
orient='h',
ax=axs[0])
axs[0].set_title('Magnitude prediction')
axs[0].set_xlabel('Mean absolute error (lower is better)')
axs[0].set_ylabel('')
axs[0].tick_params(axis='y', length=0) # hide the tick marks
sns.boxplot(data=pd.DataFrame({'Linear Regression': acc_linear,
'Logistic Regression\n(one versus rest)': acc_1vR,
'Logistic regression\n(multinomial)': acc_multi,
'Ordered logistic regression': acc_ordinal}),
orient='h',
ax=axs[1])
axs[1].set_title('Category prediction')
axs[1].set_xlabel('Classification accuracy (higher is better)')
axs[1].set_ylabel('')
axs[1].set_xlim(0, 1)
axs[1].tick_params(axis='y', length=0)
fig.tight_layout()
plt.show()
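As an alternative, here is a minimal sketch of the long-form route, which keeps the x= and y= arguments from the question. It assumes the scores are first reshaped so that every row holds one model label and one score. The names to_long_form and mae_scores and the column names 'model' and 'MAE' are introduced here for illustration only; they are not from the original post.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Scores copied from the question; the dict name is illustrative.
mae_scores = {
    'Linear Regression': [0.4937882, 0.50900745, 0.49471506, 0.5159206, 0.6519391],
    'Logistic Regression\n(one versus rest)': [1.0031989, 0.45563284, 0.4681502, 0.67197496, 0.68400556],
    'Logistic regression\n(multinomial)': [1.1409596, 1.106676, 0.4766342, 0.86363006, 0.6922114],
    'Ordered logistic regression': [0.28936023, 0.4942281, 0.25841448, 0.4128651, 0.76453406],
}


def to_long_form(scores, value_name):
    """Hypothetical helper: turn {model name: scores} into one row per (model, score)."""
    return pd.DataFrame(scores).melt(var_name='model', value_name=value_name)


long_mae = to_long_form(mae_scores, 'MAE')  # 20 rows, columns 'model' and 'MAE'

fig, ax = plt.subplots(figsize=[6, 4])
# x is numeric and y is categorical, so seaborn draws horizontal boxes.
sns.boxplot(data=long_mae, x='MAE', y='model', ax=ax)
ax.set_xlabel('Mean absolute error (lower is better)')
ax.set_ylabel('')
fig.tight_layout()
```

Call plt.show() afterwards when running this as a script. The same helper should work for the accuracy lists by passing a different value_name. Long form is mainly useful if you later want to add a hue variable or combine both metrics in one dataframe; for a plain side-by-side comparison, the wide-form version above is shorter.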
Upvotes: 1