Reputation: 582
I have a very specific problem. I have two numpy arrays, and the corresponding elements of the two arrays together represent a 2D point.
a = [1,2,1,6,1]
b = [5,0,3,1,5]
I want to draw a scatter plot where the size of each marker is based on how many times that point occurs.
That is:
1,5 : 2
2,0 : 1
1,3 : 1
6,1 : 1
So the size array must be size = [2,1,1,1] and the other two arrays can be
a = [1,2,1,6]
and b = [5,0,3,1]
So I must be able to call plt.scatter as follows:
plt.scatter(a,b,s=size)
Upvotes: 4
Views: 5541
Reputation: 339170
Since the question is tagged with numpy, we might use numpy. numpy.unique can return the counts of the unique values of an array; with axis=0 it counts unique rows, which here are the (x, y) points.
import numpy as np
a = [1,2,1,6,1]
b = [5,0,3,1,5]
u, c = np.unique(np.c_[a,b], return_counts=True, axis=0)
This gives the following, with the rows of u in sorted order:
# u=
[[1 3]
[1 5]
[2 0]
[6 1]]
# c=
[1 2 1 1]
This can be plotted as follows, where an additional function is used to normalize the counts to marker sizes suitable for plotting:
import matplotlib.pyplot as plt
s = lambda x : (((x-x.min())/float(x.max()-x.min())+1)*8)**2
plt.scatter(u[:,0],u[:,1],s=s(c))
plt.show()
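One caveat with the normalization above: if every point occurs the same number of times, x.max()-x.min() is zero and the division fails. A minimal sketch of a guarded version follows. The function name counts_to_sizes and the parameters smallest and largest are made up for illustration, and it scales the marker area linearly between the two limits, which is a slightly different mapping from the lambda above.

```python
import numpy as np

def counts_to_sizes(counts, smallest=64.0, largest=256.0):
    """Map occurrence counts linearly onto marker areas (in points**2)."""
    counts = np.asarray(counts, dtype=float)
    span = counts.max() - counts.min()
    if span == 0:
        # All points occur equally often: use one size and avoid 0/0.
        return np.full(counts.shape, smallest)
    return smallest + (counts - counts.min()) / span * (largest - smallest)

a = [1, 2, 1, 6, 1]
b = [5, 0, 3, 1, 5]
# Unique (x, y) rows and how often each row occurs.
u, c = np.unique(np.c_[a, b], return_counts=True, axis=0)
sizes = counts_to_sizes(c)
```

The result can then be passed to the plot with plt.scatter(u[:,0], u[:,1], s=sizes).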
Upvotes: 3
Reputation: 8917
This will do what you want:
from collections import Counter
a = [1, 2, 1, 6, 1]
b = [5, 0, 3, 1, 5]
counts = Counter([(x, y) for x, y in zip(a, b)])
size = [counts[(x, y)] for x, y in zip(a, b)]
counts keeps track of how many times each point appears in your arrays. Then size looks up that number in counts for each point.
Note that you actually want size = [2, 1, 1, 1, 2], because s needs to be the same length as your input arrays. This won't matter for the plot, though; you'll just draw the same point twice.
If you really do want to remove the duplicates, you could do the same thing with an extra step, where you create a set of points.
from collections import Counter
a = [1, 2, 1, 6, 1]
b = [5, 0, 3, 1, 5]
counts = Counter([(x, y) for x, y in zip(a, b)])
points = set([(x, y) for x, y in zip(a, b)])
a = list()
b = list()
for x, y in points:
    a.append(x)
    b.append(y)
size = [counts[(x, y)] for x, y in zip(a, b)]
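Because a set has no guaranteed order, the deduplicated points above may come out in a different order than in the input. If first-appearance order matters, a possible variant is to read the points straight from the Counter, which remembers insertion order on Python 3.7 and later. This is only a sketch; the names xs and ys are chosen here for illustration.

```python
from collections import Counter

a = [1, 2, 1, 6, 1]
b = [5, 0, 3, 1, 5]

# Count each (x, y) pair; keys stay in order of first appearance.
counts = Counter(zip(a, b))

# Split the unique points back into x and y sequences.
xs, ys = zip(*counts)

# One count per unique point, in the same order as xs and ys.
size = list(counts.values())
```

Here xs, ys and size line up one-to-one, so they can be passed to plt.scatter(xs, ys, s=size). The raw counts are small as marker areas, so you will probably want to scale them first.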
Upvotes: 1