Reputation: 487
I am performing a plot as below:
for i in range(len(classederror)):
    plt.scatter(xlag, classederror[i, :])
plt.show()
with the sizes of the variables being:
xlag = np.array([2, 25, 50, 75, 100, 125, 150, 175, 200, 225, 250])
xlag.shape = (11,)
classederror = 176501 rows x 11 columns
However, I run into a memory problem, which is due to the large size of classederror.
Is there a more Pythonic/efficient way of doing this without running into memory problems?
WHAT I AM TRYING TO DO
As seen in the image below, the x-axis is xlag and the y-axis is classederror. I want to plot each row in classederror for a range of x-axis values and study the distribution of the data. In the end I should obtain something similar to the image below.
Upvotes: 0
Views: 415
Reputation: 339350
It is of course much more efficient to plot a single scatter plot than 176501 scatter plots.
import numpy as np
import matplotlib.pyplot as plt
xlag = np.array([2, 25, 50, 75, 100, 125, 150, 175, 200, 225, 250])
classederror = (np.random.randn(176501, 11)*25)*(0.2+np.sort(np.random.rand(11)))
plt.scatter(np.tile(xlag,len(classederror)), classederror.flatten())
plt.show()
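To see why the tiled x values line up with the flattened y values, here is a minimal sketch on a made-up 2 x 3 array (the names small, x_flat, y_flat and pairs are only for illustration and are not part of the code above). flatten() walks the array row by row, so repeating xlag once per row pairs every value with its own column's lag.

```python
import numpy as np

xlag = np.array([2, 25, 50])
# Small stand-in for classederror: 2 rows x 3 columns (values are made up).
small = np.array([[0.1, 0.2, 0.3],
                  [1.1, 1.2, 1.3]])

# Repeat the whole xlag vector once per row: [2, 25, 50, 2, 25, 50]
x_flat = np.tile(xlag, len(small))
# Row-major flattening: all of row 0 first, then all of row 1
y_flat = small.flatten()

# Each (x, y) pair is one point of the single scatter call
pairs = list(zip(x_flat.tolist(), y_flat.tolist()))
print(pairs)
```

The same two flat arrays can then be handed to one plt.scatter call, which creates a single collection instead of one per row.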
Given the limited information one can draw from such a plot, it may make sense to directly plot 11 lines.
import numpy as np
import matplotlib.pyplot as plt
xlag = np.array([2, 25, 50, 75, 100, 125, 150, 175, 200, 225, 250])
classederror = (np.random.randn(176501, 11)*25)*(0.2+np.sort(np.random.rand(11)))
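# Row 0 holds the per-column minimum, row 1 the per-column maximum: shape (2, 11)
vals = np.c_[classederror.min(axis=0), classederror.max(axis=0)].T
# Matching x coordinates, so each column becomes one vertical line
x = np.c_[xlag, xlag].T
plt.plot(x, vals, color="C0", lw=2)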
plt.show()
To obtain information about the density of points, one may use other means, e.g. a violin plot.
# Reuses xlag and classederror from the snippets above
plt.violinplot(classederror, xlag, points=50, widths=20,
               showmeans=True, showextrema=True, showmedians=True)
plt.show()
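Another option, not from the answer above but in the same spirit, is to reduce each column to a few percentiles and draw those as shaded bands. This is only a sketch under assumptions: the helper names column_percentiles and plot_bands are hypothetical, the chosen percentiles are arbitrary, and the demo data is random rather than the real classederror.

```python
import numpy as np

def column_percentiles(data, qs=(5, 25, 50, 75, 95)):
    """Return an array of shape (len(qs), n_columns) of per-column percentiles."""
    return np.percentile(data, qs, axis=0)

def plot_bands(x, bands):
    """Draw 5-95% and 25-75% bands plus the median (hypothetical helper)."""
    # Imported here so the summary step above works without matplotlib
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    ax.fill_between(x, bands[0], bands[4], alpha=0.2, label="5-95%")
    ax.fill_between(x, bands[1], bands[3], alpha=0.4, label="25-75%")
    ax.plot(x, bands[2], color="C0", label="median")
    ax.legend()
    return fig

xlag = np.array([2, 25, 50, 75, 100, 125, 150, 175, 200, 225, 250])
# Random stand-in for classederror (assumption: real data has the same layout)
rng = np.random.default_rng(0)
demo = rng.standard_normal((1000, 11)) * 25

bands = column_percentiles(demo)  # only 5 x 11 numbers need to be plotted
```

Calling plot_bands(xlag, bands) followed by plt.show() would then draw the summary. The number of plotted values no longer depends on the number of rows.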
Upvotes: 2