mon

Reputation: 22254

numpy - explanation of "np.copy is a shallow copy and will not copy object elements within arrays"

Background

Note that np.copy is a shallow copy and will not copy object elements within arrays. This is mainly important for arrays containing Python objects. The new array will contain the same object which may lead to surprises if that object can be modified (is mutable):

Based on 3.2 Memory layout, I suppose array a would point to a memory region that contains the 64-bit float numbers.

Said differently, an array is mostly a contiguous block of memory whose parts can be accessed using an indexing scheme.


Because the documentation says np.copy is a shallow copy and will not copy object elements within arrays, I supposed the memory region would not be copied but only referenced.

However, the copy is not a view of the original array, and the two arrays must occupy different memory regions, because x[0] == z[0] is False after x[0] = 10.

import numpy as np
x = np.array([1, 2, 3])
z = np.copy(x)
print(f"x.base {x.base} z.base {z.base}")
---
x.base None z.base None

x[0] = 10
print(x[0] == z[0])
---
False

print(f"x.base {x.base} z.base {z.base}")
---
x.base None z.base None
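
For contrast, here is a minimal sketch of how a view behaves next to a copy. The name v is introduced only for illustration; np.shares_memory is used to check whether two arrays overlap in memory.

```python
import numpy as np

x = np.array([1, 2, 3])
v = x.view()      # a view: new array object, same data buffer as x
z = np.copy(x)    # a copy: new array object, new data buffer

print(v.base is x)               # True: v borrows its data from x
print(z.base is None)            # True: z owns its data
print(np.shares_memory(x, v))    # True
print(np.shares_memory(x, z))    # False

x[0] = 10
print(v[0], z[0])                # 10 1: the view follows x, the copy does not
```

So for this integer array, np.copy duplicates the data buffer itself rather than referencing it.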

Questions

What is np.copy(a) actually copying?

If it is a shallow copy, why is a different memory area allocated for z[0]? Is it actually copying the elements of array x, which should be called a deep copy? Or is it copy-on-write?

Or does shallow copy mean that when an element of a NumPy array is a mutable container such as a list, np.copy will not deep-copy the contents of that container?

Please help me understand what exactly z = np.copy(x) does and what happens when assignment operations are applied to z.

Upvotes: 3

Views: 1815

Answers (1)

Michael

Reputation: 2367

In Python, copy.copy() performs a shallow copy: it creates a new container object but fills it with references to the same elements as the original. For immutable values (e.g., int, float) this is indistinguishable from a real copy, but nested mutable objects (e.g., dictionaries) are not duplicated; both containers refer to the same object.

On the other hand, copy.deepcopy() copies everything recursively (including composite data structures such as dictionaries) and results in a new object that is equal to the first but shares no mutable state with it.
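
The difference can be sketched with the standard copy module. The dictionary and its keys below are made up purely for illustration.

```python
import copy

# A container holding a nested mutable object (the list under "values").
original = {"name": "a", "values": [1, 2, 3]}
shallow = copy.copy(original)
deep = copy.deepcopy(original)

print(shallow is original)                        # False: a new outer dict
print(shallow["values"] is original["values"])    # True: inner list is shared
print(deep["values"] is original["values"])       # False: inner list was copied

original["values"].append(4)
print(shallow["values"])   # [1, 2, 3, 4]: sees the change
print(deep["values"])      # [1, 2, 3]: unaffected
```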

In the case of NumPy, if you use it for numerical computation as intended (i.e., with arrays of one of the numerical dtypes), copy is equivalent to deepcopy: the elements are stored as raw values in the array's own buffer rather than as references to Python objects, so np.copy allocates a new buffer and duplicates every value. That is the new data space you see allocated for the copied array. The shallow-copy caveat only matters for arrays of dtype=object, whose elements are references to Python objects.
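
A small sketch of both cases, assuming a reasonably recent NumPy. The variable names are only for illustration, and the object array mirrors the kind of example the np.copy documentation uses.

```python
import copy
import numpy as np

# Numeric dtype: the values live in the array's buffer, so np.copy duplicates them.
nums = np.array([1.0, 2.0, 3.0])
nums_copy = np.copy(nums)
nums[0] = 10.0
print(nums_copy[0])   # 1.0: the copy is independent

# Object dtype: the buffer holds references, and np.copy duplicates only those.
objs = np.array([1, "m", [2, 3, 4]], dtype=object)
objs_copy = np.copy(objs)
objs_deep = copy.deepcopy(objs)

objs[2].append(5)     # mutate the list that objs and objs_copy both reference
print(objs_copy[2])   # [2, 3, 4, 5]: same list object
print(objs_deep[2])   # [2, 3, 4]: deepcopy made its own list

objs[0] = 99          # rebinding a slot only changes objs, not the copy
print(objs_copy[0])   # 1
```

If independent object elements are needed, copy.deepcopy on the array is the usual way to get them.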

Cheers.

Upvotes: 3
