ck1987pd

Reputation: 269

Creating numpy shallow copies with arithmetic operations

I noticed that array operations with an identity element return a copy (possibly a shallow copy) of the array.

Consider the code snippet below.

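import numpy as np

a = np.arange(16).reshape([4, 4])
print(a)
b = a + 0        # adding the additive identity yields a new array
print(b)
a[2, 2] = 200    # modify a only
print(a)
print(b)         # b still holds the original value at [2, 2]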

We see that b is a shallow copy of a. I don't know whether it is also a deep copy, because I think a matrix is a subtype of array rather than an array of arrays.

If I only need a shallow copy, is there a difference between using np.copy() and arithmetic operations?

I know this is a frequently asked topic, but I couldn't find an answer for my particular question.

Thanks in advance!

Upvotes: 0

Views: 134

Answers (2)

hpaulj

Reputation: 231540

With numpy arrays there isn't a difference between shallow copy and deep copy - unless you are working with object dtype arrays (which practically speaking are lists).
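To illustrate the object dtype exception, here is a minimal sketch. The variable names (inner, obj, shallow, deep, num) are made up for illustration; it assumes only numpy and the standard copy module.

```python
import copy

import numpy as np

# An object-dtype array stores references to Python objects.
inner = [1, 2]
obj = np.empty(2, dtype=object)
obj[0] = inner
obj[1] = "x"

shallow = obj.copy()        # new array, but the same referenced objects
deep = copy.deepcopy(obj)   # new array and new copies of the objects

inner.append(3)             # mutate the list that obj[0] refers to
print(shallow[0])           # the shallow copy sees the mutation
print(deep[0])              # the deep copy does not

# With a numeric dtype the values live in the buffer itself,
# so .copy() is already fully independent.
num = np.arange(4)
num_copy = num.copy()
num[0] = 99
print(num_copy)
```

For numeric dtypes there is nothing "deeper" to copy, which is why the shallow/deep distinction only shows up with object arrays.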

There is an important distinction between view and copy. In

 a=np.arange(16).reshape([4,4])

a is actually a view of the 1d array produced by arange (check a.base)

The b=a action is basic Python. b is just another way to reference the same object.

b=a[:] is a view: a new array object with a shared data buffer (same arange base).

b=a.copy(), and b=a+0 are both new arrays without any sharing. As long as you get the dtypes right, these are functionally the same.
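The four cases above (assignment, view, copy, arithmetic) can be checked directly. This is a small sketch; the names alias, view, copied and added are illustrative, and np.shares_memory is used as one way to test for a shared buffer.

```python
import numpy as np

a = np.arange(16).reshape([4, 4])

alias = a          # plain Python assignment: the same object
view = a[:]        # new array object, shared data buffer
copied = a.copy()  # new array with its own buffer
added = a + 0      # also a new array with its own buffer

print(a.base is not None)                    # a itself is a view of the arange result
print(alias is a)                            # True
print(view is a, np.shares_memory(view, a))  # False True
print(np.shares_memory(copied, a))           # False
print(np.shares_memory(added, a))            # False

a[2, 2] = 200
# The alias and the view follow the change; the two copies keep the old value.
print(alias[2, 2], view[2, 2], copied[2, 2], added[2, 2])
```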

a+0 translates to np.add(a,0). np.add.identity is 0, so as @chepner wrote, it does have the information to "optimize", but only to the equivalent of copy. "Optimizing" to b=a would break too much basic Python.
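A quick way to see this for yourself is sketched below. It only inspects the identity attribute of the ufunc and confirms that the result is a distinct object; it does not claim anything about what numpy does internally.

```python
import numpy as np

a = np.arange(4)

# Ufuncs expose their identity element as an attribute.
print(np.add.identity)       # 0
print(np.multiply.identity)  # 1

# Adding the identity still allocates and returns a new array.
b = np.add(a, 0)
print(b is a)                # False
print(np.array_equal(a, b))  # True
```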

Besides being more explicit (to the human reader), copy will be faster:

In [19]: timeit b=a.copy()
658 ns ± 10.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [20]: timeit b=a+0
2.76 µs ± 79 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Upvotes: 1

Daweo

Reputation: 36630

Is there a difference between using np.copy() and arithmetic operations?

Yes, consider the following example:

import numpy as np
arr = np.array([[True,False],[False,True]])
arr_c = np.copy(arr)
arr_0 = arr + 0
print(arr_c)
print(arr_0)

Output:

[[ True False]
 [False  True]]
[[1 0]
 [0 1]]

Observe that both operations are legal (neither raises an exception or error) yet give different results.
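The difference comes from the result dtype, which can be inspected directly. The sketch below is illustrative only: the variable small is made up, and the exact integer width produced by arr + 0 depends on the platform and NumPy version, so no specific width is claimed.

```python
import numpy as np

arr = np.array([[True, False], [False, True]])

arr_c = np.copy(arr)
arr_0 = arr + 0

print(arr_c.dtype)        # bool: np.copy preserves the dtype
print(arr_0.dtype)        # an integer dtype: bool + int is promoted
print((arr + 0.0).dtype)  # float64: adding a float promotes further

# When the array is already an integer type, adding 0 keeps the dtype.
small = np.arange(3, dtype=np.int8)
print((small + 0).dtype)  # int8
```

So a + 0 behaves like a copy only when the dtype of the array survives the addition unchanged; a.copy() makes no such assumption.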

Upvotes: 1
