Reputation: 109
I have a scatter plot, but many of the values often fall on exactly the same spot. I have used colour and alpha to try to remedy the situation. However, as you can see, it is still hard to distinguish what exactly is plotted in some areas.
Is there a more fool-proof way to solve this?
Thanks
Upvotes: 1
Views: 7923
Reputation: 305
If you would rather have a deterministic offset, I wrote this function to solve a similar problem (which is what landed me here looking for an answer). Note that it only works for exactly overlapping points. However, you can most likely round off your points and slightly modify the function to accommodate "close enough" points.
Hopefully this helps.
import numpy as np


def dodge_points(points, component_index, offset):
    """Dodge every point by a multiplicative offset (multiplier is based on frequency of appearance)

    Args:
        points (array-like (2D)): Array containing the points
        component_index (int): Index / column on which the offset will be applied
        offset (float): Offset amount. Effective offset for each point is `index of appearance` * offset

    Returns:
        numpy.ndarray (2D): Dodged copy of the points
    """
    # Work on a float copy so integer input and plain lists are accepted
    # and the caller's data is left untouched
    points = np.array(points, dtype=float)

    # Extract unique points so we can map an offset for each
    uniques, inv, counts = np.unique(
        points, return_inverse=True, return_counts=True, axis=0
    )

    for i, num_identical in enumerate(counts):
        # Prepare dodge values: 0, offset, 2 * offset, ...
        dodge_values = offset * np.arange(num_identical)
        # Find where the dodge values must be applied, in order
        points_loc = np.where(inv == i)[0]
        # Apply the dodge values
        points[points_loc, component_index] += dodge_values

    return points
Here is an example of before and after (figures omitted).
This method only works for EXACTLY overlapping points (or if you are willing to round points off in a way that np.unique finds matching points).
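For the "close enough" case mentioned above, here is a rough sketch of the rounding idea. The name `dodge_close_points` and the `decimals` parameter are my own assumptions for illustration, not part of the function above: points are grouped by their rounded coordinates, but the offsets are applied to the original, unrounded values.

```python
import numpy as np


def dodge_close_points(points, component_index, offset, decimals=1):
    """Dodge points that coincide after rounding to `decimals` places.

    Hypothetical variant: rounding is used only to decide which points
    belong together; the returned coordinates keep their original values
    plus the dodge offset.
    """
    pts = np.asarray(points, dtype=float)
    dodged = pts.copy()

    # Group rows by their rounded coordinates
    uniques, inv = np.unique(np.round(pts, decimals), axis=0, return_inverse=True)

    for group in range(len(uniques)):
        # Row indices of the points in this group, in order of appearance
        loc = np.where(inv == group)[0]
        # First point stays put, the next moves by offset, then 2 * offset, ...
        dodged[loc, component_index] += offset * np.arange(len(loc))

    return dodged


pts = np.array([[1.00, 2.0], [1.01, 2.0], [0.98, 2.0], [3.0, 4.0]])
result = dodge_close_points(pts, component_index=0, offset=0.1, decimals=1)
```

Keep in mind that rounding draws hard bin edges, so two points that are very close but sit on either side of an edge (say 1.04 and 1.06 with `decimals=1`) will not be grouped together.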
Upvotes: 1
Reputation: 1242
You can jitter the values (add a bit of random noise) so they won't be exactly on the same spot.
import numpy as np
import matplotlib.pyplot as plt

# Integer-valued sample data, so many points land on exactly the same spot
x = np.random.randint(low=1, high=5, size=50)
y = np.random.randint(low=0, high=2, size=50)

# Add uniform noise in [-0.05, 0.05) to each coordinate
jittered_y = y + 0.1 * np.random.rand(len(y)) - 0.05
jittered_x = x + 0.1 * np.random.rand(len(x)) - 0.05

plt.figure(figsize=(10, 5))

plt.subplot(221)
plt.scatter(x, y, s=10, alpha=0.5)
plt.title('No Jitter')

plt.subplot(222)
plt.scatter(x, jittered_y, s=10, alpha=0.5)
plt.title('Y Jittered')

plt.subplot(223)
plt.scatter(jittered_x, y, s=10, alpha=0.5)
plt.title('X Jittered')

plt.subplot(224)
plt.scatter(jittered_x, jittered_y, s=10, alpha=0.5)
plt.title('Y and X Jittered')

plt.tight_layout()
plt.show()
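If you want the jitter to be reusable and reproducible between runs, the same idea can be wrapped in a small helper. This is only a sketch: the function name `jitter` and its `width` and `seed` parameters are made up for illustration, and it uses NumPy's `default_rng` generator instead of `np.random.rand`.

```python
import numpy as np


def jitter(values, width=0.1, seed=None):
    """Return a float copy of `values` with uniform noise in [-width/2, width/2)."""
    rng = np.random.default_rng(seed)  # a fixed seed gives the same jitter every run
    values = np.asarray(values, dtype=float)
    return values + rng.uniform(-width / 2, width / 2, size=values.shape)


x = np.array([1, 1, 1, 2, 2, 3])
jx = jitter(x, width=0.1, seed=0)
```

You can then pass `jx` to `plt.scatter` in place of `x`. Pick `width` small relative to the spacing of your data so the jitter separates overlapping points without suggesting precision that is not there.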
Upvotes: 8