Reputation: 563
I'd like to plot a histogram that compares two arrays of data. Basically, I want to make exactly this:
Suppose I want to make this plot using two arrays with four entries each: one with the numbers for the blue areas, and the other with the numbers for the areas of the second color. I have tried this:
x1 = np.array([0.1,0.2,0.3])
x2 = np.array([0.1,0.2,0.5])
sns.histplot(data=[x1,x2], x=['1','2','3'], multiple="dodge", hue=['a','b'], shrink=.8)
But it gives me the error “ValueError: arrays must all be same length”.
I know that I'm supposed to pass a DataFrame and not arrays, but sadly I'm not really an expert on how to use them. How can I solve this problem? Simply put, I'm looking for a copy-and-paste solution in which I can then change the numbers and the names of the columns.
Upvotes: 0
Views: 1307
Reputation: 80279
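It looks like you want a bar plot, not a histogram: a histogram counts raw observations per bin, whereas your arrays already hold the bar heights. Creating a seaborn plot from multiple columns usually involves converting them to "long form", which makes the process less straightforward.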
Here is an example:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
x1 = np.array([0.1, 0.2, 0.3])
x2 = np.array([0.1, 0.2, 0.5])
x = ['1', '2', '3'] # or, simpler, x = np.arange(len(x1)) + 1
df = pd.DataFrame({'a': x1, 'b': x2, 'x': x})
df_long = df.melt('x')
ax = sns.barplot(data=df_long, x='x', y='value', dodge=True, hue='variable')
plt.show()
The long form looks like:
x variable value
0 1 a 0.1
1 2 a 0.2
2 3 a 0.3
3 1 b 0.1
4 2 b 0.2
5 3 b 0.5
See pandas' melt for additional options, such as naming the created columns.
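As a rough sketch of that last point, melt accepts var_name and value_name to replace the default 'variable' and 'value' column names. The names 'series' and 'score' below are only illustrative placeholders, not anything from the question:

```python
import numpy as np
import pandas as pd

x1 = np.array([0.1, 0.2, 0.3])
x2 = np.array([0.1, 0.2, 0.5])

# Wide form: one column per series, plus the shared x positions 1, 2, 3
df = pd.DataFrame({'x': np.arange(len(x1)) + 1, 'a': x1, 'b': x2})

# Long form with custom column names; 'series' and 'score' are
# illustrative names chosen for this sketch
df_long = df.melt(id_vars='x', var_name='series', value_name='score')
print(df_long)
```

With these names, the plotting call would presumably become sns.barplot(data=df_long, x='x', y='score', hue='series'), and the legend title and y-axis label would follow the new column names.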
Upvotes: 2