Alfred

Reputation: 563

Histogram with Seaborn

I'd like to plot a histogram that compares two arrays of data. Basically, I want to make exactly this:

[Image: the plot I want to make]

Suppose I want to make this plot, but using two arrays with four entries, one with the numbers that should go to the blue areas, and the other with the ones for the areas of the other color. I have tried this:

x1 = np.array([0.1,0.2,0.3])
x2 = np.array([0.1,0.2,0.5])
sns.histplot(data=[x1,x2], x=['1','2','3'], multiple="dodge", hue=['a','b'], shrink=.8)

But it gives me the error “ValueError: arrays must all be same length”

I know that I'm supposed to pass a DataFrame and not arrays, but sadly I'm not really an expert on how to use them. How can I solve this problem? Simply put, I'm looking for a copy-and-paste solution here, in which I can then change the numbers and the names of the columns.

Upvotes: 0

Views: 1307

Answers (1)

JohanC

Reputation: 80279

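It looks like you want a bar plot, not a histogram: a histogram counts how many values fall into each bin, whereas your arrays already hold the bar heights. Creating a seaborn plot from multiple columns usually involves converting them to "long form" first, which makes the process a little less straightforward.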

Here is an example:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

x1 = np.array([0.1, 0.2, 0.3])
x2 = np.array([0.1, 0.2, 0.5])
x = ['1', '2', '3'] # or, simpler, x = np.arange(len(x1)) + 1
df = pd.DataFrame({'a': x1, 'b': x2, 'x': x})  # wide form: one column per array
df_long = df.melt('x')  # long form: one row per (x, variable) pair
ax = sns.barplot(data=df_long, x='x', y='value', dodge=True, hue='variable')
plt.show()

[Image: sns.barplot from arrays]

The long form looks like:

   x variable  value
0  1        a    0.1
1  2        a    0.2
2  3        a    0.3
3  1        b    0.1
4  2        b    0.2
5  3        b    0.5

See pandas' melt for additional options, such as naming the created columns.
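As a rough sketch of that last point, the created columns can be named through melt's var_name and value_name parameters. The names 'series' and 'height' below are only illustrative choices, not something from the question, so swap in whatever fits your data:

```python
import numpy as np
import pandas as pd

x1 = np.array([0.1, 0.2, 0.3])
x2 = np.array([0.1, 0.2, 0.5])
df = pd.DataFrame({'a': x1, 'b': x2, 'x': ['1', '2', '3']})

# 'series' and 'height' are hypothetical names chosen for this example;
# they replace the default 'variable' and 'value' column names.
df_long = df.melt(id_vars='x', var_name='series', value_name='height')
print(df_long)
```

With these names, the barplot call would then use y='height' and hue='series' instead of 'value' and 'variable'.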

Upvotes: 2
