Reputation: 9190
In the following example, I initialize a 500x2000x2000 three-dimensional NumPy array named a
. At each iteration, a random two-dimensional array r
is inserted into array a
. This example represents a larger piece of code where the r
array would be created from various calculations during each iteration of the for-loop. Consequently, each slice in the z dimension of the a
array is calculated at each iteration.
# ex1_basic.py
import numpy as np
import time
def main():
tic = time.perf_counter()
z = 500 # depth
x = 2000 # rows
y = 2000 # columns
a = np.zeros((z, x, y))
for i in range(z):
r = np.random.rand(x, y)
a[i] = r
toc = time.perf_counter()
print('elapsed =', round(toc - tic, 2), 'sec')
if __name__ == '__main__':
main()
This example's memory is profiled using the memory-profiler package. The steps for running the memory-profiler for this example are:
# Run the memory profiler
$ mprof run ex1_basic.py
# Plot the memory profiler results
$ mprof plot
The memory usage is plotted below. The memory usage increases over time as the values are added to the array.
I profiled another example where the data type for the array is defined as np.float32
. See below for the example code and memory usage plot. This decreased the overall memory use but the memory still increases with each iteration.
import numpy as np
import time
def main():
rng = np.random.default_rng()
tic = time.perf_counter()
z = 500 # depth
x = 2000 # rows
y = 2000 # columns
a = np.zeros((z, x, y), dtype=np.float32)
for i in range(z):
r = rng.standard_normal((x, y), dtype=np.float32)
a[i] = r
toc = time.perf_counter()
print('elapsed =', round(toc - tic, 2), 'sec')
if __name__ == '__main__':
main()
Since I initialized array a
using np.zeros
, I would expect the memory usage to remain constant based on the block of memory initialized for that array. But it appears that memory usage increases as values are inserted into array a
.
So I have two questions related to these examples:
a
on disk and only have slice a[i]
and the r
array in memory at each iteration? Basically, how would I run these examples if the a
array did not fit in memory (RAM)?I ran an example using numpy.memmap
but there is no improvement in memory usage. It seems like memmap is still keeping the entire array in memory.
import numpy as np
import time
def main():
rng = np.random.default_rng()
tic = time.perf_counter()
z = 500
x = 2000
y = 2000
a = np.memmap('file.dat', dtype=np.float32, mode='w+', shape=(z, x, y))
for i in range(z):
r = rng.standard_normal((x, y), dtype=np.float32)
a[i] = r
toc = time.perf_counter()
print('elapsed =', round(toc - tic, 2), 'sec')
if __name__ == '__main__':
main()
Upvotes: 0
Views: 723
Reputation: 9190
Using the h5py
package, I can create an hdf5 file that contains a dataset that represents the a
array. The dset
variable is similar to the a
variable discussed in the question. This allows the array to reside on disk, not in memory. The generated hdf5 file is 8 GB on disk which is the size of the array containing np.float32
values. The elapsed time for this approach is similar to the examples discussed in the question; therefore, writing to the hdf5 file seems to have a negligible performance impact.
import numpy as np
import h5py
import time
def main():
rng = np.random.default_rng()
tic = time.perf_counter()
z = 500 # depth
x = 2000 # rows
y = 2000 # columns
f = h5py.File('file.hdf5', 'w')
dset = f.create_dataset('data', shape=(z, x, y), dtype=np.float32)
for i in range(z):
r = rng.standard_normal((x, y), dtype=np.float32)
dset[i, :, :] = r
toc = time.perf_counter()
print('elapsed time =', round(toc - tic, 2), 'sec')
s = np.float32().nbytes * (z * x * y) / 1e9 # where 1 GB = 1000 MB
print('calculated storage =', s, 'GB')
if __name__ == '__main__':
main()
Output from running this example on a MacBook Pro with 2.6 GHz 6-Core Intel Core i7 and 32 GB of RAM:
elapsed time = 22.97 sec
calculated storage = 8.0 GB
Running the memory profiler for this example gives the plot shown below. The peak memory usage is about 100 MiB which is drastically lower than the examples demonstrated in the question.
Upvotes: 1