Xaser

Reputation: 2146

Does NumPy always create a copy when writing self-referenced data?

Consider this code:

import numpy as np

k = 1000000000
a = np.arange(k)
a[:-1] = a[1:]

While one can devise code that does this without a copy array, I suspect that NumPy creates a copy of the affected part of the array for this operation. Is that correct?

Would this also be the case when the source and target ranges do not overlap, like so:

k = 1000000000
a = np.arange(k)
a[:k//2] = a[k//2:]

This could easily be done in place.

Upvotes: 0

Views: 54

Answers (1)

hpaulj

Reputation: 231530

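The slice assignment is buffered, so the result is what you would get if the source were copied before any element was written: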

In [112]: a = np.arange(12)
     ...: a[2:12] = a[0:10]
In [113]: a
Out[113]: array([0, 1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Compare this to an unbuffered result:

In [114]: a = np.arange(12)
     ...: for i in range(10):
     ...:     a[2+i] = a[i]
     ...: 
In [115]: a
Out[115]: array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1])

Here a[2] is overwritten before it is read as a source value, so the pattern 0, 1 keeps repeating.
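As a rough sketch of why the loop order matters: the helper below (`shift_right_loop` is a made-up name for illustration, not a NumPy function) copies element by element either front to back or back to front. Going backwards never reads a slot that was already overwritten, so it reproduces the buffered slice result without a temporary array.

```python
import numpy as np

def shift_right_loop(a, shift, reverse):
    """Copy a[i] into a[i + shift] one element at a time, in place.

    Hypothetical helper for illustration only.
    """
    n = len(a) - shift
    indices = range(n - 1, -1, -1) if reverse else range(n)
    for i in indices:
        a[shift + i] = a[i]
    return a

# Reference result: the source comes from a separate, untouched array.
expected = np.arange(12)
expected[2:12] = np.arange(12)[0:10]

# Slice assignment within one array, as in the transcript above.
sliced = np.arange(12)
sliced[2:12] = sliced[0:10]

forward = shift_right_loop(np.arange(12), 2, reverse=False)
backward = shift_right_loop(np.arange(12), 2, reverse=True)

print(np.array_equal(sliced, expected))    # slice assignment matches the reference
print(np.array_equal(backward, expected))  # so does the backward loop
print(forward)                             # the forward loop smears 0, 1
```

This only shows that an in-place ordering exists for this shift; it says nothing about which strategy NumPy picks internally.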

Note that a[0:10] is a view of a, not a copy, so the source and the target of the assignment share memory.
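A quick way to check this yourself, using a small array in place of the large one from the question: `.base` shows that the slice is a view, and `np.shares_memory` reports whether two slices actually overlap.

```python
import numpy as np

a = np.arange(12)

src = a[0:10]
dst = a[2:12]

# A basic slice is a view: its base is the original array.
is_view = src.base is a

# Source and target of a[2:12] = a[0:10] overlap in memory.
overlapping = np.shares_memory(dst, src)

# The two halves from the question's second example do not overlap.
k = len(a)
halves_overlap = np.shares_memory(a[:k // 2], a[k // 2:])

print(is_view, overlapping, halves_overlap)
```

With these inputs the three values should be True, True and False. `np.shares_memory` does an exact check, which can be slow for complicated strided views; `np.may_share_memory` is the cheaper bounds-only variant.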

Forcing a copy does slow things down, though not much for this small case:

In [116]: %%timeit
     ...: a = np.arange(12)
     ...: a[2:12] = a[0:10]
     ...: 
     ...: 
2.56 µs ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [117]: %%timeit
     ...: a = np.arange(12)
     ...: a[2:12] = a[0:10].copy()
     ...: 
     ...: 
3.41 µs ± 9.74 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
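To see how the gap behaves for a larger array, something along these lines could be run outside IPython. The size `n` and the repeat counts are arbitrary choices for this sketch, and the numbers will depend on the machine and NumPy version, so no timings are quoted here.

```python
import timeit
import numpy as np

n = 1_000_000  # assumed size, much smaller than the k in the question

def assign_view():
    a = np.arange(n)
    a[2:] = a[:-2]          # overlapping source and target
    return a

def assign_copy():
    a = np.arange(n)
    a[2:] = a[:-2].copy()   # explicit temporary for the source
    return a

runs = 20
t_view = min(timeit.repeat(assign_view, number=runs, repeat=3)) / runs
t_copy = min(timeit.repeat(assign_copy, number=runs, repeat=3)) / runs

print(f"view source: {t_view * 1e3:.3f} ms per call")
print(f"copy source: {t_copy * 1e3:.3f} ms per call")
```

Both functions include the cost of `np.arange(n)`, just like the `%%timeit` cells above, so only the difference between the two figures is meaningful.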

Upvotes: 1
