Reputation: 31
I am new to Python and to programming. I am trying to plot an orientation map using Python. I have a large number of points (about 1,200,000) on a plane, and each of them belongs to a cluster. Each cluster should be drawn in a different color. Currently I assign a color to each cluster and draw a filled circle at each point. I tried to do this in parts, creating a plot for each segment of the data and using blend to combine them. This is the code for that part (sn is the total number of points, label is the array of cluster numbers, and xcoor and ycoor are the coordinates of the points):
pylab.xlim([0,250])
pylab.ylim([0,100])
plt.savefig("HK pickle.png")
for l in range (1, 20):
    for j in range(int((float(sn)/80)*(l-1)), int((float(sn)/80)*(l))):
        overlay = Image.open("HK pickle.png")
        c = label[j] % 8
        if c == 0:
            circle1 = plt.Circle((float(xcoor[j]), float(ycoor[j])), 0.05, color = (0.5, 0, 0))
        elif c == 1:
            circle1 = plt.Circle((float(xcoor[j]), float(ycoor[j])), 0.05, color = (1, 0, 0))
        elif c == 2:
            circle1 = plt.Circle((float(xcoor[j]), float(ycoor[j])), 0.05, color = (0, 0.5, 0))
        elif c == 3:
            circle1 = plt.Circle((float(xcoor[j]), float(ycoor[j])), 0.05, color = (0, 1, 0))
        elif c == 4:
            circle1 = plt.Circle((float(xcoor[j]), float(ycoor[j])), 0.05, color = (0, 0, 0.5))
        elif c == 5:
            circle1 = plt.Circle((float(xcoor[j]), float(ycoor[j])), 0.05, color = (0, 0 ,1))
        elif c == 6:
            circle1 = plt.Circle((float(xcoor[j]), float(ycoor[j])), 0.05, color = (0.5, 0.5 ,0))
        elif c == 7:
            circle1 = plt.Circle((float(xcoor[j]), float(ycoor[j])), 0.05, color = (0.5, 0 ,0.5))
        fig = plt.gcf()
        fig.gca().add_artist(circle1)
        del circle1
    plt.savefig("HK pick.png")
    del fig
    back = Image.open("HK pick.png")
    comp = Image.blend(back, overlay, 0.5)
    comp.save("HK pickle.png", "PNG")
    del comp
pylab.xlim([0,250])
pylab.ylim([0,100])
plt.savefig("HK plots.png")
However, this leads to the following error:
fig.gca().add_artist(circle1)
File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 1404, in add_artist
self.artists.append(a)
MemoryError
The error arises at l = 11. I was watching the Task Manager while the script ran, and it still showed almost 3 GB of free memory when the MemoryError appeared. Please help me with this.
I am new to this and don't know whether the information I've given is enough. Please let me know if you need any more.
Upvotes: 1
Views: 1642
Reputation: 87416
You might do better with scatter and the keyword rasterized=True, which will flatten all of the vector graphics down to a raster image (which will take less memory).
Something like:
colors_lst = [ ... your tuples ...]
colors = [colors_lst[x % 8] for x in label]
fig, ax = plt.subplots()
ax.scatter(xcoor, ycoor, c=colors, rasterized=True)
I think this will replace most of your script.
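A slightly fuller sketch of the same idea, with the palette copied from the question's if/elif chain. The function names cluster_colors and plot_clusters, the marker_area parameter, and the choice of the Agg backend are assumptions made for illustration, not part of the original script:

```python
# Palette copied from the question's if/elif chain (cluster label modulo 8).
PALETTE = [
    (0.5, 0, 0), (1, 0, 0), (0, 0.5, 0), (0, 1, 0),
    (0, 0, 0.5), (0, 0, 1), (0.5, 0.5, 0), (0.5, 0, 0.5),
]


def cluster_colors(label):
    """Return one RGB tuple per point, chosen by label modulo the palette size."""
    return [PALETTE[c % len(PALETTE)] for c in label]


def plot_clusters(xcoor, ycoor, label, path, marker_area=1.0):
    """Draw every point with a single scatter call and save the figure to path."""
    # Imported here so the colour mapping above can be used without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # assumption: off-screen rendering is enough
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    # One collection holds all points, instead of one Circle artist per point.
    ax.scatter(xcoor, ycoor, s=marker_area, c=cluster_colors(label),
               linewidths=0, rasterized=True)
    ax.set_xlim(0, 250)
    ax.set_ylim(0, 100)
    fig.savefig(path)
    plt.close(fig)  # release the figure once it is on disk
```

With the question's arrays this would be called as plot_clusters(xcoor, ycoor, label, "HK plots.png"). Note that scatter's s is a marker area in points squared, not a radius in data units like the 0.05 passed to plt.Circle, so the value probably needs tuning by eye.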
Upvotes: 1
Reputation: 20353
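If you're on a 32-bit OS or running 32-bit Python, you will not be able to work efficiently with large data sets: a 32-bit process can only address a few gigabytes (2 GB by default on Windows), so a MemoryError can occur even while the machine still has free RAM. Installing 64-bit Python, NumPy, matplotlib, etc. may fix this.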
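To find out which build you are running, something like the following should work on both Python 2 and 3. It only uses the standard library; the function name python_bitness is made up for this example:

```python
import struct
import sys


def python_bitness():
    """Return 32 or 64 depending on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8


print("Python %s, %d-bit build" % (sys.version.split()[0], python_bitness()))
```

If this reports a 32-bit build, the free memory shown in the Task Manager is not all available to the script.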
However, I would suggest first trying to draw your picture at a lower resolution and seeing if that works for you (the results may be good enough). For example, I would first replace the j iterator
for j in range(int((float(sn)/80)*(l-1)), int((float(sn)/80)*(l))):
with something like
for j in np.linspace(int((float(sn)/80)*(l-1)), int((float(sn)/80)*(l)), num=20):
    j = int(j)
which will give you 20 evenly spaced j values within your limits, rather than every integer value. Note that you will need to cast j to an int, because np.linspace returns NumPy floats!
Other style remarks are less useful at this point, but in general you needn't del often: Python has a very good garbage collector that does this for you. You could also set your limits outside of the loop header, which may make debugging more straightforward:
start_j = int((float(sn)/80)*(l-1))
end_j = int((float(sn)/80)*(l))
for j in np.linspace(start_j, end_j, num=20):
    etc.
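Putting the two suggestions together, a minimal sketch might look like this. It swaps np.linspace for plain integer arithmetic so that no cast is needed, and it leaves out the end index so that neighbouring segments do not share a point; both are choices made for this example, and the helper names segment_bounds and sample_indices are hypothetical:

```python
def segment_bounds(sn, l, parts=80):
    """Start and end index of segment l, using the question's sn/80 arithmetic."""
    start_j = int((float(sn) / parts) * (l - 1))
    end_j = int((float(sn) / parts) * l)
    return start_j, end_j


def sample_indices(start_j, end_j, num=20):
    """Up to num evenly spaced integer indices in [start_j, end_j)."""
    span = end_j - start_j
    num = min(num, span)  # never ask for more indices than exist
    return [start_j + (span * i) // num for i in range(num)]
```

The inner loop would then read for j in sample_indices(*segment_bounds(sn, l)): and num can be raised step by step until the picture is dense enough or memory runs out again.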
Upvotes: 0