cheesebread

Reputation: 91

How to plot vertical scatter using only matplotlib

I need to draw a vertical scatter plot in matplotlib, but I couldn't find anything in matplotlib.org/examples or on StackOverflow.

I tried something of my own, but it is missing jitter. Jitter slightly changes the x component of points that have the same (or very similar) y components so that they don't overlap. Is there anything I can use, or will I have to change the x components manually?

import numpy as np
from matplotlib import pyplot as plt

x = np.array([1,2,3])
l = ['A','B','C']
a = np.array([2,2,3])
b = np.array([3,3,4])
c = np.array([7,7,5])
d = (a + b + c) / 3  # element-wise mean of the three series

plt.subplot(111)
plt.margins(0.2)
plt.xticks(x,l)
plt.plot(x, a, 'ro', label='a')
plt.plot(x, b, 'ro', label='b')
plt.plot(x, c, 'ro', label='c')
plt.plot(x, d, 'k_', markersize=15, label='avg')
plt.tight_layout()
plt.savefig('vertical_scatter')
plt.close()

which gave me the following:

[image: output of the code above]

I found this in Seaborn:

[image: Seaborn plot with jittered points]

This is what I want, but using only matplotlib.

Upvotes: 3

Views: 7646

Answers (2)

Michael H.

Reputation: 3483

As I mentioned in my comment, you could shift the x-values according to the distance between neighboring y-points. Smaller distances should be mapped to a larger x-shift. This can be done with a logarithm or any other function with that behavior.

import numpy as np
import matplotlib.pyplot as plt

n = 100
y = np.random.random(n)
x = np.ones(n)
x0 = x[0]

y = np.sort(y)
dist = np.diff(y)  # has one entry less than y
dist = np.hstack([dist, np.median(dist)])  # pad with the median so dist matches the shape of y
x = np.log(dist)
x = (x - np.min(x)) / (np.max(x) - np.min(x))  # mapped to range from 0 to 1
x = x0 + 0.5*(x - 0.5)  # mapped to range from x0-1/4 to x0+1/4

plt.scatter(x,y)
plt.scatter(x+1,y)
plt.scatter(x+2,y)

plt.show()

[image: plot produced by the code above]
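One thing to watch for with the logarithm: if two y-values are exactly equal, their distance is 0 and `np.log` returns `-inf`, which breaks the normalization. The following is only a sketch of one way to guard against that, not part of the answer above. The helper name `spacing_based_x` and the `eps` floor are assumptions made for illustration.

```python
import numpy as np

def spacing_based_x(y, center=1.0, width=0.5, eps=1e-12):
    """Hypothetical helper: x positions derived from neighbor spacing in y.

    Returns (x, y_sorted); x lies in [center - width/2, center + width/2].
    """
    y_sorted = np.sort(np.asarray(y, dtype=float))
    if y_sorted.size < 2:
        # Nothing to compare against: keep the point(s) on the center line.
        return np.full(y_sorted.shape, center), y_sorted

    dist = np.diff(y_sorted)                   # one entry fewer than y
    dist = np.append(dist, np.median(dist))    # pad with the median to match shapes

    # Assumed guard: floor the distances so identical y values
    # do not turn into log(0) = -inf.
    log_dist = np.log(np.clip(dist, eps, None))

    span = log_dist.max() - log_dist.min()
    if span == 0:
        # All spacings equal: no basis for a shift.
        return np.full(y_sorted.shape, center), y_sorted

    unit = (log_dist - log_dist.min()) / span  # mapped to [0, 1]
    return center + width * (unit - 0.5), y_sorted
```

The returned arrays can be passed straight to `plt.scatter(x, y_sorted)`. Note that with a very small `eps`, exact duplicates land at the left edge and all other points are squeezed toward the right, so a larger floor may look better on data with many ties.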

Upvotes: 1

ImportanceOfBeingErnest

Reputation: 339300

An example with jitter using only matplotlib would be the following. The idea is basically to add some random noise to the x values.

import numpy as np
import matplotlib.pyplot as plt

data = np.random.rayleigh(scale=1, size=(30,4))
labels = list("ABCD")
colors = ["crimson", "purple", "limegreen", "gold"]

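width = 0.4  # total horizontal spread of each group
fig, ax = plt.subplots()
for i, l in enumerate(labels):
    # jitter: uniform random offsets in [-width/2, width/2) around position i
    x = np.ones(data.shape[0])*i + (np.random.rand(data.shape[0])*width-width/2.)
    ax.scatter(x, data[:,i], color=colors[i], s=25)
    # horizontal line marking the group mean
    mean = data[:,i].mean()
    ax.plot([i-width/2., i+width/2.],[mean,mean], color="k")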

ax.set_xticks(range(len(labels)))
ax.set_xticklabels(labels)

plt.show()

[image: plot produced by the code above]
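For reuse, the same random-noise idea could be wrapped in a small function. This is a minimal sketch, not a matplotlib feature: the names `jittered_positions` and `vertical_scatter` are made up here, and the fixed `seed` is an assumption added so the jitter looks the same on every run.

```python
import numpy as np
import matplotlib.pyplot as plt

def jittered_positions(center, n, width=0.4, rng=None):
    """Hypothetical helper: n x-values spread uniformly around `center`."""
    rng = np.random.default_rng() if rng is None else rng
    return center + rng.uniform(-width / 2, width / 2, size=n)

def vertical_scatter(ax, groups, labels, width=0.4, seed=0):
    """Hypothetical helper: one jittered column of points per group.

    `groups` is a sequence of 1-D value sequences, one per label.
    """
    rng = np.random.default_rng(seed)  # seeded so the layout is reproducible
    for i, values in enumerate(groups):
        values = np.asarray(values, dtype=float)
        x = jittered_positions(i, values.size, width, rng)
        ax.scatter(x, values, color="red", s=25)
        # Horizontal line marking the group mean.
        mean = values.mean()
        ax.plot([i - width / 2, i + width / 2], [mean, mean], color="k")
    ax.set_xticks(range(len(labels)))
    ax.set_xticklabels(labels)
    return ax
```

With the data from the question, the values at each tick are A: 2, 3, 7; B: 2, 3, 7; C: 3, 4, 5, so calling `vertical_scatter(ax, [[2, 3, 7], [2, 3, 7], [3, 4, 5]], ["A", "B", "C"])` on an axes from `plt.subplots()` and then `plt.show()` should give one jittered column per label.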

Upvotes: 7
