Reputation: 733
I have 2 datasets, one representing Rootzone (mm) and the other representing Tree cover (%). I am able to plot these datasets side by side (as shown below). The code used was:
# plt.subplots returns (figure, axes); request a 1x2 grid directly.
fig, ax = plt.subplots(1, 2, figsize=(16, 7))
classified_data.boxplot(grid=False, rot=90, fontsize=10, ax = ax[0])
classified_treecover.boxplot(grid=False, rot=90, fontsize=10, ax = ax[1])
ax[0].set_ylabel('Rootzone Storage Capacity (mm)', fontsize = '12')
ax[1].set_ylabel('Tree Cover (%)', fontsize = '12')
ax[0].set_title('Rootzone Storage Capacity (mm)')
ax[1].set_title('Tree Cover (%)')
But I want them in the same plot, with Rootzone on the left-hand y-axis and Tree cover on the right-hand y-axis (using something like twinx()), because their ranges are different. For each class on the x-axis, the two boxes should be grouped together (something like the figure shown below, with a twin y-axis for the tree cover).
Can someone guide me on how this can be achieved with my code?
Upvotes: 3
Views: 4373
Reputation: 17794
One way to plot two datasets with different ranges in the same figure is to convert all values to their corresponding z scores (standardize your data), so that both share one scale. You can then use the hue parameter of the boxplot() function in seaborn to plot the two datasets side by side for each class. Consider the following example with the 'mpg' dataset.
   displacement  horsepower origin
0         307.0       130.0    usa
1         350.0       165.0    usa
2         318.0       150.0    usa
3         304.0       150.0    usa
4         302.0       140.0    usa
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = sns.load_dataset('mpg')
df1 = df[['displacement', 'origin']].copy()
df2 = df[['horsepower', 'origin']].copy()
# Convert values to z scores (vectorized; NaNs are skipped by mean() and std()).
df1['z_score'] = (df1['displacement'] - df1['displacement'].mean()) / df1['displacement'].std()
df2['z_score'] = (df2['horsepower'] - df2['horsepower'].mean()) / df2['horsepower'].std()
df1.drop(['displacement'], axis= 1, inplace=True)
df2.drop(['horsepower'], axis=1, inplace=True)
# Add extra column to use it as the 'hue' parameter.
df1['value'] = 'displacement'
df2['value'] = 'horsepower'
df_cat = pd.concat([df1, df2])
ax = sns.boxplot(x='origin', y='z_score', hue='value', data=df_cat)
plt.yticks([])
ax.set_ylabel('')
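# z-score limits of the main axis; each twin axis gets the same range in original units,
# so that its ticks line up with the boxes (value = z * std + mean).
z_lo, z_hi = ax.get_ylim()
# Add the left y axis.
ax1 = ax.twinx()
ax1.set_yticks(np.linspace(df['displacement'].min(), df['displacement'].max(), 5))
ax1.set_ylim(z_lo * df['displacement'].std() + df['displacement'].mean(),
             z_hi * df['displacement'].std() + df['displacement'].mean())
ax1.spines['right'].set_position(('axes', -0.2))
ax1.set_ylabel('displacement')
# Add the right y axis.
ax2 = ax.twinx()
ax2.set_yticks(np.linspace(df['horsepower'].min(), df['horsepower'].max(), 5))
ax2.set_ylim(z_lo * df['horsepower'].std() + df['horsepower'].mean(),
             z_hi * df['horsepower'].std() + df['horsepower'].mean())
ax2.spines['right'].set_position(('axes', 1))
ax2.set_ylabel('horsepower')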
plt.show()
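If you would rather not standardize, a different approach (not the z-score method above) is to draw each dataset on its own y-axis with twinx() and shift the box positions so that the two boxes of a class sit next to each other. The following is only a minimal sketch under some assumptions: both DataFrames have one column per class with identical column names, as classified_data and classified_treecover appear to have in the question. The helper name twin_boxplot, the colors, and the random sample data are hypothetical and only there for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def twin_boxplot(left_df, right_df, left_label, right_label, width=0.35):
    """Paired boxes per class: left_df on the left y-axis, right_df on a twin right y-axis.

    Assumes both frames have the same columns (one column per class).
    """
    classes = list(left_df.columns)
    centers = np.arange(len(classes))

    fig, ax_left = plt.subplots(figsize=(16, 7))
    ax_right = ax_left.twinx()  # shares the x-axis, independent y-axis

    # Axes.boxplot does not skip NaNs, so drop them per column.
    left_data = [left_df[c].dropna().to_numpy() for c in classes]
    right_data = [right_df[c].dropna().to_numpy() for c in classes]

    # Shift the two groups left/right of each class center.
    # manage_ticks=False keeps boxplot from resetting the shared x ticks.
    bp_left = ax_left.boxplot(left_data, positions=centers - width / 2,
                              widths=width * 0.9, patch_artist=True,
                              manage_ticks=False)
    bp_right = ax_right.boxplot(right_data, positions=centers + width / 2,
                                widths=width * 0.9, patch_artist=True,
                                manage_ticks=False)

    for box in bp_left['boxes']:
        box.set_facecolor('tab:blue')
    for box in bp_right['boxes']:
        box.set_facecolor('tab:green')

    # One tick per class, centered between its two boxes.
    ax_left.set_xticks(centers)
    ax_left.set_xticklabels(classes, rotation=90)
    ax_left.set_xlim(-0.5, len(classes) - 0.5)
    ax_left.set_ylabel(left_label)
    ax_right.set_ylabel(right_label)
    ax_left.legend([bp_left['boxes'][0], bp_right['boxes'][0]],
                   [left_label, right_label], loc='upper left')
    return fig, ax_left, ax_right


# Hypothetical sample data standing in for the two datasets in the question.
rng = np.random.default_rng(0)
classes = ['class A', 'class B', 'class C']
rootzone = pd.DataFrame(rng.uniform(0, 600, size=(50, 3)), columns=classes)
treecover = pd.DataFrame(rng.uniform(0, 100, size=(50, 3)), columns=classes)

fig, ax_left, ax_right = twin_boxplot(
    rootzone, treecover, 'Rootzone Storage Capacity (mm)', 'Tree Cover (%)')
```

With the real data, the call would presumably be twin_boxplot(classified_data, classified_treecover, ...), followed by plt.show() or fig.savefig(...). Each y-axis keeps its original units, so no back-conversion of tick values is needed.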
Upvotes: 3