Reputation: 83
I'm trying to plot the original data, before handling the imbalance, in a way that shows the class distribution and the class imbalance (the class is failure = 0/1). I might need to do some transformation on the data in both cases to be able to visualize it.
| failure |
|---------|
| 1 |
| 0 |
| 0 |
| 1 |
| 0 |
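import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

def distribution_scatter(x, symmetric=True, cmap=None, size=None):
    # Estimate the density of x; it bounds how far points spread vertically
    pdf = gaussian_kde(x)
    # Random jitter weights in [0, 1)
    w = np.random.rand(len(x))
    if symmetric:
        # Rescale the weights to [-1, 1) so points spread on both sides of zero
        w = w*2-1
    pseudo_y = pdf(x) * w
    if cmap:
        plt.scatter(x, pseudo_y, c=x, cmap=cmap, s=size)
    else:
        plt.scatter(x, pseudo_y, s=size)
    return pseudo_y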
I want to plot the distribution of 0's and 1's, for which I believe I need to transform the data in some way.
Upvotes: 0
Views: 511
Reputation: 9941
If you want a KDE plot, you can check kdeplot from seaborn:
import numpy as np
import seaborn as sns
x = np.random.binomial(1, 0.2, 100)
sns.kdeplot(x)
Output:
Update: Or a swarmplot if you want a scatter:
x = np.random.binomial(1, 0.2, 25)
sns.swarmplot(x=x)
Output:
Update 2: In fact, your function seems to also produce a reasonable visualization:
distribution_scatter(np.random.binomial(1, 0.2, 100))
Output:
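Since the class column only takes the values 0 and 1, a plain bar chart of the class counts may already be enough to show the imbalance, with no transformation needed. The following is a minimal sketch using only numpy and matplotlib, not something taken from the question: the helper names class_counts and plot_class_distribution are made up for illustration, and the example array simply reuses the five rows from the question's table.

```python
import numpy as np
import matplotlib.pyplot as plt


def class_counts(labels):
    """Return the number of 0s and 1s in a binary label array (hypothetical helper)."""
    labels = np.asarray(labels, dtype=int)
    # minlength=2 keeps an entry for both classes even if one is absent
    return np.bincount(labels, minlength=2)


def plot_class_distribution(labels, ax=None):
    """Draw one bar per class, annotated with its count and share (hypothetical helper)."""
    counts = class_counts(labels)
    if ax is None:
        ax = plt.gca()
    bars = ax.bar(["0", "1"], counts)
    total = counts.sum()
    for bar, count in zip(bars, counts):
        # Place the label centered just above each bar
        ax.annotate(f"{count} ({count / total:.0%})",
                    (bar.get_x() + bar.get_width() / 2, count),
                    ha="center", va="bottom")
    ax.set_xlabel("failure")
    ax.set_ylabel("count")
    return counts


# Example data: the five rows shown in the question's table
failure = np.array([1, 0, 0, 1, 0])
counts = plot_class_distribution(failure)
```

Outside a notebook, call plt.show() or plt.savefig(...) afterwards to see the figure. The sketch assumes the labels are strictly 0 or 1 and that the array is not empty.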
Upvotes: 1