1''

Reputation: 27095

Remove occluded scatter plot points

With vector backends (pdf, eps), it's wasteful in terms of file size and rendering time to have points that are completely occluded by other points. How can these be removed?

Upvotes: 2

Views: 354

Answers (1)

armatita

Reputation: 13465

That is almost an unfair question, since the answer depends on how the marker size relates to data coordinates, which is difficult to calculate.
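For a rough sense of the scale involved, here is a minimal sketch of that conversion. It assumes the default circular marker (whose `s` is an area in points squared, so the marker is about `sqrt(s)` points across), a linear x axis, and a known axes width in inches. The function name and the 3-inch axes width are made up for illustration, and marker edge width is ignored.

```python
import math

def marker_radius_data_units(s, axes_width_in, x_min, x_max):
    """Approximate radius of a circular scatter marker in data units.

    Assumptions (not from the original answer): circular marker,
    linear axis, marker edge width ignored.
    """
    diameter_pt = math.sqrt(s)  # s is an area in points**2
    # 72 points per inch, spread over the visible data range.
    points_per_data_unit = axes_width_in * 72.0 / (x_max - x_min)
    return 0.5 * diameter_pt / points_per_data_unit

# Hypothetical numbers: s=90, axes 3 inches wide, x limits (-6, 6).
r = marker_radius_data_units(s=90, axes_width_in=3.0, x_min=-6, x_max=6)
print(round(r, 3))
```

Under these assumptions the radius comes out at roughly a quarter of a data unit, which gives a starting point for choosing a distance tolerance. The real value changes with figure size and axis limits, so treat it as an estimate only.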

In any case, perhaps a partial solution will do for you. My idea is to calculate the distance between all points and, whenever a pair is closer than a given tolerance, keep only one of the two points. This won't be perfect, but it might prove useful. A quick test using this idea (I'm hoping I got the distance logic right):

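import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial.distance import cdist

x = np.random.normal(0, 1, 15000)
y = np.random.normal(0, 1, 15000)
tol = 0.01  # points closer than this (in data units) count as overlapping

# Full pairwise distance matrix; for 15000 points this is a
# 15000 x 15000 float64 array (about 1.8 GB).
xy = np.column_stack((x, y))
d = cdist(xy, xy)

# b[i] is True while point i is kept.
b = np.ones(x.shape, dtype=bool)
for i in range(d.shape[0] - 1):
    # If point i is still kept and its nearest later point is within
    # tol, drop that nearest later point.
    if b[i] and d[i, i+1:].min() < tol:
        b[i + 1 + d[i, i+1:].argmin()] = False

x2 = x[b]
y2 = y[b]

# Original points on the left, thinned points on the right.
f, (ax1, ax2) = plt.subplots(1, 2)

ax1.scatter(x, y, s=90)
ax1.set_xlim(-6, 6)
ax1.set_ylim(-6, 6)
ax2.scatter(x2, y2, s=90)
ax2.set_xlim(-6, 6)
ax2.set_ylim(-6, 6)

print('Before: ', x.shape, '\nNow: ', x2.shape)
plt.show()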

This gives me the following result:

Before:  (15000,) 
Now:  (13004,)

[Figure: attempting to remove invisible markers from a matplotlib plot; original scatter on the left, thinned scatter on the right]

That is a saving of about 2,000 points out of 15,000. If you look closely you'll notice that the result is not perfect, but I'm sure a little calibration of the tol value could improve the plot significantly.
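If the full pairwise distance matrix is too large to hold in memory, the same tolerance idea can be sketched with a grid of cells of side `tol`, so each point is only compared with points in its own and neighbouring cells. This is a swapped-in technique rather than the code above, and `thin_points` is a made-up name. It walks the points from last to first on the assumption that later points are drawn on top, so a point is dropped only when a kept, later-drawn point lies within `tol` of it.

```python
import math
import random

def thin_points(xs, ys, tol):
    """Return sorted indices of points to keep.

    A point is dropped when an already-kept point with a higher index
    (assumed to be drawn later, hence on top) lies within tol of it.
    """
    cells = {}  # (cell_x, cell_y) -> indices of kept points in that cell
    keep = []
    for i in range(len(xs) - 1, -1, -1):
        cx, cy = math.floor(xs[i] / tol), math.floor(ys[i] / tol)
        # Any point within tol must sit in one of the 3 x 3 nearby cells.
        covered = any(
            math.hypot(xs[i] - xs[j], ys[i] - ys[j]) < tol
            for nx in (cx - 1, cx, cx + 1)
            for ny in (cy - 1, cy, cy + 1)
            for j in cells.get((nx, ny), ())
        )
        if not covered:
            cells.setdefault((cx, cy), []).append(i)
            keep.append(i)
    keep.reverse()
    return keep

random.seed(0)
xs = [random.gauss(0, 1) for _ in range(15000)]
ys = [random.gauss(0, 1) for _ in range(15000)]
keep = thin_points(xs, ys, tol=0.01)
print('Before:', len(xs), 'Now:', len(keep))
```

The kept indices can then be used to select the coordinates passed to `scatter`. Like the matrix version, this is only an approximation of true occlusion: `tol` still has to be calibrated against the marker size, and a marker hidden by several partly overlapping neighbours is not detected.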

Upvotes: 3
