Reputation: 225
So I am trying to carry out the following calculations on a series of large arrays but I keep getting the error:
MemoryError
In total there are 9 grain_size arrays of 2745 by 2654 (note: I could use a single float here instead of an array, since every cell holds the same number and it never changes), 9 g_pro arrays of 2745 by 2654, and the 9 arrays I create below.
So my question is: is there a way to work around this issue?
# Create empty arrays to store the information
Fs1 = np.zeros_like(g_pro_1, dtype = float)
Fs2 = np.zeros_like(g_pro_1, dtype = float)
Fs3 = np.zeros_like(g_pro_1, dtype = float)
Fs4 = np.zeros_like(g_pro_1, dtype = float)
Fs5 = np.zeros_like(g_pro_1, dtype = float)
Fs6 = np.zeros_like(g_pro_1, dtype = float)
Fs7 = np.zeros_like(g_pro_1, dtype = float)
Fs8 = np.zeros_like(g_pro_1, dtype = float)
Fs9 = np.zeros_like(g_pro_1, dtype = float)
# Check where the condition is true
np.putmask(Fs1, np.logical_and(grain_size_1_array > 0.0000625, grain_size_1_array <= 0.002), g_pro_1)
np.putmask(Fs2, np.logical_and(grain_size_2_array > 0.0000625, grain_size_2_array <= 0.002), g_pro_2)
np.putmask(Fs3, np.logical_and(grain_size_3_array > 0.0000625, grain_size_3_array <= 0.002), g_pro_3)
np.putmask(Fs4, np.logical_and(grain_size_4_array > 0.0000625, grain_size_4_array <= 0.002), g_pro_4)
np.putmask(Fs5, np.logical_and(grain_size_5_array > 0.0000625, grain_size_5_array <= 0.002), g_pro_5)
np.putmask(Fs6, np.logical_and(grain_size_6_array > 0.0000625, grain_size_6_array <= 0.002), g_pro_6)
np.putmask(Fs7, np.logical_and(grain_size_7_array > 0.0000625, grain_size_7_array <= 0.002), g_pro_7)
np.putmask(Fs8, np.logical_and(grain_size_8_array > 0.0000625, grain_size_8_array <= 0.002), g_pro_8)
np.putmask(Fs9, np.logical_and(grain_size_9_array > 0.0000625, grain_size_9_array <= 0.002), g_pro_9)
Fs = Fs1 + Fs2 + Fs3 + Fs4 + Fs5 + Fs6 + Fs7 + Fs8 + Fs9
Fs[self.discharge == -9999] = -9999
The code that worked for me now is:
Fs = np.zeros_like(g_pro_1, dtype = float)
grain_array_list = [self.grain_size_1, self.grain_size_2, self.grain_size_3, self.grain_size_4, self.grain_size_5, self.grain_size_6, self.grain_size_7, self.grain_size_8, self.grain_size_9]
proportions_list = [g_pro_1, g_pro_2, g_pro_3, g_pro_4, g_pro_5, g_pro_6, g_pro_7, g_pro_8, g_pro_9]
for proportion, grain in zip(proportions_list, grain_array_list):
    # assumes each grain size is a single float, not an array
    if 0.0000625 < grain <= 0.002:
        print(grain)
        Fs = Fs + proportion
Fs[self.discharge == -9999] = -9999
Upvotes: 1
Views: 540
Reputation: 16109
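Every time you see lines of code that differ by only a single character, you should be using a loop. In your case, you are holding data in memory that you are not using. Your workflow is basically:
1. Load one grain_size_array (and its g_pro array).
2. Mask the proportions using grain_size_array.
3. Add the result to Fs.
4. Discard grain_size_array before loading the next one.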
In terms of code, you need something like:
import numpy as np

g_pro_1 = load()  # however you get that
Fs = np.zeros_like(g_pro_1, dtype=float)
Fs_tmp = np.zeros_like(g_pro_1, dtype=float)
for i in range(9):  # one pass per grain-size class
    g_pro = load()  # whatever
    grain_size_array = load()  # whatever
    Fs_tmp.fill(0)  # clear values left over from the previous pass
    np.putmask(Fs_tmp, np.logical_and(grain_size_array > 0.0000625, grain_size_array <= 0.002), g_pro)
    Fs += Fs_tmp
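As a runnable illustration of the load, mask, accumulate, discard idea, here is a minimal sketch. The `load_pair` function is a hypothetical stand-in that fabricates small test arrays (it is not part of the question's code), and the sketch adds the masked values directly into `Fs` with boolean indexing instead of going through a temporary array and `np.putmask`.

```python
import numpy as np

LOWER, UPPER = 0.0000625, 0.002  # bounds taken from the question
NODATA = -9999                   # no-data marker used in the question


def load_pair(i, shape):
    """Hypothetical loader returning (grain_size_array, g_pro) for class i.

    Replace this with however the real arrays are read. Here it builds
    small synthetic arrays: even-numbered classes fall inside the range.
    """
    grain_size = np.full(shape, 0.001 if i % 2 == 0 else 0.01)
    g_pro = np.full(shape, 0.1 * (i + 1))
    return grain_size, g_pro


def accumulate_fs(n_classes, discharge):
    Fs = np.zeros(discharge.shape, dtype=float)
    for i in range(n_classes):
        # Only one pair of input arrays is alive at a time; the names are
        # rebound on the next pass, so the previous pair can be freed.
        grain_size, g_pro = load_pair(i, discharge.shape)
        mask = (grain_size > LOWER) & (grain_size <= UPPER)
        Fs[mask] += g_pro[mask]
    Fs[discharge == NODATA] = NODATA
    return Fs


discharge = np.array([[1.0, NODATA], [2.0, 3.0]])
Fs = accumulate_fs(9, discharge)
print(Fs)
```

With this layout the peak footprint is roughly one output array plus one input pair and one boolean mask, rather than all 27 full-size arrays at once.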
Upvotes: 1
Reputation: 13678
Your example requires 9*2745*2654*sizeof(float) bytes, i.e. 500 MiB, to store the grain_size arrays and again as much to store the g_pro arrays. To run the logical_and functions, the parameter arrays with the results of the comparisons must be stored, adding another 100 MiB. Maybe you really simply run out of memory eventually?
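The arithmetic behind these figures can be checked with a few lines of plain Python. This is a back-of-the-envelope sketch that assumes 8-byte floats (NumPy's default `float64`) and 1-byte booleans for the comparison results; the constant names are only illustrative.

```python
ROWS, COLS = 2745, 2654   # array shape from the question
N_ARRAYS = 9              # arrays per group (grain_size, g_pro, Fs)
FLOAT_BYTES = 8           # assumption: float64
BOOL_BYTES = 1            # assumption: one byte per boolean element
MIB = 1024 ** 2

cells = ROWS * COLS

# One group of nine float arrays (e.g. all grain_size arrays).
float_group_mib = N_ARRAYS * cells * FLOAT_BYTES / MIB

# One boolean array produced by a single comparison.
bool_array_mib = cells * BOOL_BYTES / MIB

print("one group of 9 float arrays: %.1f MiB" % float_group_mib)
print("one boolean comparison result: %.1f MiB" % bool_array_mib)
print("three float groups together: %.1f MiB" % (3 * float_group_mib))
```

The last line counts the nine Fs arrays as a third group of the same size, which is why holding everything at once approaches 1.5 GiB before any temporaries.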
You could try computing the Fs<n> arrays one after another rather than having each of them in memory at the same time.
Upvotes: 1