Juscallmesteve

Reputation: 157

Why is Numpy much faster at creating a Zero array compared to replacing the values of an existing array with zeros?

I have an array which is used to track various values. The array is 2500x1700 in size, so it is not very large. At the end of a session I need to reset all of the values within that array back to zero. I tried both creating a new array of zeros and replacing all values in the existing array with zeros, and creating a brand-new array is much faster.

Code Example:

for _ in sessions:
    # Reset our array
    tracking_array[:,:] = 0

1.44 s ± 19.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Versus

for _ in sessions:
    # Reset our array
    tracking_array = np.zeros(shape=(2500, 1700))

7.26 ms ± 133 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Why is creating an entirely new array so much faster when compared to just replacing the values in the array?

Upvotes: 5

Views: 1528

Answers (1)

Jérôme Richard

Reputation: 50308

The reason is that the array is not actually filled in memory on mainstream operating systems (Windows, Linux and macOS). Numpy allocates a zero-filled array by requesting a zero-filled area in virtual memory from the operating system (OS). This area is not directly mapped into physical RAM: the mapping and zero-initialization are generally done lazily by the OS, when you first read/write the pages in virtual memory. The cost is paid later, when you set the array to 1 for example. Here is a proof:

In [19]: %timeit res = np.zeros(shape=(2500, 1700))
10.8 µs ± 118 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [20]: %timeit res = np.ones(shape=(2500, 1700))
7.54 ms ± 151 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The latter would imply a RAM throughput of at least 4.2 GiB/s, which is not high but fair. The former would imply a RAM throughput of at least roughly 2930 GiB/s, which is absurdly high, since my machine (as well as any standard desktop/server machine) is barely able to reach 36 GiB/s (using a carefully-optimized benchmark).
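You can reproduce the effect from the question directly: resetting an already-mapped array in place has to write every page to RAM, while a fresh `np.zeros` call only reserves lazily-mapped virtual pages. A minimal sketch using `timeit` (exact numbers will vary with your OS and hardware):

```python
import numpy as np
import timeit

shape = (2500, 1700)
# A "dirty" array whose pages are already mapped in physical RAM.
tracking_array = np.ones(shape)

# In-place reset: all ~32 MiB must actually be written to memory.
t_fill = timeit.timeit(lambda: tracking_array.fill(0), number=100)

# Fresh allocation: the OS merely hands back zero-filled virtual
# pages; no physical write happens yet.
t_alloc = timeit.timeit(lambda: np.zeros(shape), number=100)

print(f"in-place fill(0): {t_fill / 100 * 1e3:.2f} ms per call")
print(f"np.zeros:         {t_alloc / 100 * 1e6:.1f} µs per call")
```

On a typical machine the fresh allocation is orders of magnitude faster per call, until the moment the new array's pages are first touched.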

Upvotes: 8
