Reputation: 621
For reasons I won't get into, I need to make violin plots without using a pandas dataframe. For example, I have the following ndarray and categories:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
data = np.random.randn(5, 3)
category = np.array(["yes", "no", "no", "no", "yes", "yes","yes", "no", "yes", "yes", "yes", "no", "no", "no", "no"])
ax = sns.violinplot(data = data)
plt.show()
This results in an ungrouped violin plot.
However, I'd like to use the categorical data to make a grouped violin plot:
ax = sns.violinplot(data = data, x = category)
plt.show()
This gives an error: AttributeError: 'numpy.ndarray' object has no attribute 'get'.
Is there any way around this without pandas?
Upvotes: 2
Views: 1385
Reputation: 16683
Do not pass the data parameter if you are using multiple numpy arrays for x, y and hue.
From y, you can create an array of column indices with np.nonzero. This works as long as no value is exactly zero, since np.nonzero skips zeros.
Make sure the arrays are one-dimensional with .flatten(). For example, I flatten your array of random floats from a shape of (5, 3) to (15,). Otherwise, you will get an error, since the arrays have different shapes and Seaborn has no way to match them up as it can with a pandas dataframe. Likewise, if you pass three (5, 3) arrays to x, y and hue, Seaborn won't know what to do. So you must either a) flatten all arrays so that they share the same shape of (15,), or b) use a pandas dataframe.
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
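y = np.random.randn(5, 3)
# Column index (0, 1 or 2) of each element; assumes no value is exactly zero
x = np.nonzero(y)[-1]
# Flatten from shape (5, 3) to (15,) so y matches x and hue
y = y.flatten()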
hue = np.array(["yes", "no", "no", "no", "yes", "yes","yes", "no", "yes", "yes", "yes", "no", "no", "no", "no"])
sns.violinplot(x=x, y=y, hue=hue)
print(x,'\n\n',y,'\n\n',hue)
plt.show()
[0 1 2 0 1 2 0 1 2 0 1 2 0 1 2]
[-0.28618123 -1.18132595 0.70535902 0.90685532 -1.27258432 0.90417094
3.03506025 0.99796779 0.20247628 0.43226169 0.25005372 -0.9923336
-0.43102785 -0.17117549 -0.16147393]
['yes' 'no' 'no' 'no' 'yes' 'yes' 'yes' 'no' 'yes' 'yes' 'yes' 'no' 'no'
'no' 'no']
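One caveat with np.nonzero is that it skips any element that is exactly zero, which would leave x shorter than y. As a rough sketch of an alternative (not part of the code above), the column indices can be built directly with np.tile instead. The helper names long_form and plot_grouped_violins are made up for illustration, and the sketch assumes the labels are given in the same row-major order as the flattened values.

```python
import numpy as np


def long_form(values, hue_labels):
    """Turn a 2-D array into equal-length 1-D x, y and hue arrays.

    Hypothetical helper: assumes hue_labels holds one label per element
    of values, in row-major (flattened) order.
    """
    values = np.asarray(values)
    n_rows, n_cols = values.shape
    # Column index of every element in row-major order: 0, 1, 2, 0, 1, 2, ...
    # Unlike np.nonzero, this does not depend on the values themselves.
    x = np.tile(np.arange(n_cols), n_rows)
    y = values.ravel()  # shape (n_rows * n_cols,)
    hue = np.asarray(hue_labels).ravel()
    if hue.shape != y.shape:
        raise ValueError("hue_labels must contain one label per value")
    return x, y, hue


def plot_grouped_violins(values, hue_labels):
    """Draw a grouped violin plot without building a pandas dataframe."""
    # Imported here so long_form stays usable without seaborn installed.
    import seaborn as sns

    x, y, hue = long_form(values, hue_labels)
    # No data= argument: x, y and hue are passed as plain 1-D arrays.
    return sns.violinplot(x=x, y=y, hue=hue)
```

Calling plot_grouped_violins(np.random.randn(5, 3), category) followed by plt.show() should give the same kind of grouped plot as the code above, with the difference that an exact zero in the data no longer breaks the alignment between x and y.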
Upvotes: 3