Indexing a numpy array with arrays

Question

I am trying to convert a vanilla Python standard deviation calculation into numpy form. It computes the standard deviation over windows of n consecutive elements, where n is set by the variable number. However, the numpy code is faulty and fails with "only integer scalar arrays can be converted to a scalar index". Is there any way I could bypass this?

Variables

import numpy as np
number = 5
list_= np.array([457.334015,424.440002,394.795990,408.903992,398.821014,402.152008,435.790985,423.204987,411.574005,
404.424988,399.519989,377.181000,375.467010,386.944000,383.614990,375.071991,359.511993,328.865997,
320.510010,330.079010,336.187012,352.940002,365.026001,361.562012,362.299011,378.549011,390.414001,
400.869995,394.773010,382.556000])

Vanilla python

std= np.array([list_[i:i+number].std() for i in range(0, len(list_)-number)])

Numpy form

counter = np.arange(0, len(list_)-number, 1)
std = list_[counter:counter+number].std()

hpaulj · Accepted Answer

(In the session below, arr is the array the question calls list_.)

In [46]: std= np.array([arr[i:i+number].std() for i in range(0, len(arr)-number)
    ...: ])
In [47]: std
Out[47]: 
array([22.67653383, 10.3940773 , 14.60076482, 13.82801944, 13.68038469,
       12.54834004, 13.13574418, 15.24698722, 14.65383773, 11.62092989,
        8.57331689,  4.76392583,  9.49404494, 21.20874383, 24.91417226,
       20.84991841, 13.22152789, 10.83343482, 16.01294245, 13.80007894,
       10.51866421,  8.29287433, 11.24933733, 15.43661128, 13.65945978])

We can move the std out of the loop. Make a 2d array of windows, one window per row, and apply std along axis 1:

In [48]: np.array([arr[i:i+number] for i in range(0, len(arr)-number)]).std(axis
    ...: =1)
Out[48]: 
array([22.67653383, 10.3940773 , 14.60076482, 13.82801944, 13.68038469,
       12.54834004, 13.13574418, 15.24698722, 14.65383773, 11.62092989,
        8.57331689,  4.76392583,  9.49404494, 21.20874383, 24.91417226,
       20.84991841, 13.22152789, 10.83343482, 16.01294245, 13.80007894,
       10.51866421,  8.29287433, 11.24933733, 15.43661128, 13.65945978])

We could also generate the windows with indexing. A convenient way is to use linspace, which here produces one window per column (hence axis=0 in the std call):

In [63]: idx = np.arange(0,len(arr)-number)
In [64]: idx = np.linspace(idx,idx+number,number, endpoint=False,dtype=int)
In [65]: idx
Out[65]: 
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
        16, 17, 18, 19, 20, 21, 22, 23, 24],
         ...
       [ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
        20, 21, 22, 23, 24, 25, 26, 27, 28]])
In [66]: arr[idx].std(axis=0)
Out[66]: 
array([22.67653383, 10.3940773 , 14.60076482, 13.82801944, 13.68038469,
       12.54834004, 13.13574418, 15.24698722, 14.65383773, 11.62092989,
        8.57331689,  4.76392583,  9.49404494, 21.20874383, 24.91417226,
       20.84991841, 13.22152789, 10.83343482, 16.01294245, 13.80007894,
       10.51866421,  8.29287433, 11.24933733, 15.43661128, 13.65945978])
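
To make the layout of that index array easier to see, here is a minimal sketch on a small made-up array (the values and the names starts and windows are illustrative, not from the question). It relies on linspace accepting array start and stop values, which I believe needs NumPy 1.16 or later.

```python
import numpy as np

# Small illustrative data, not the question's array.
arr = np.array([10.0, 20.0, 40.0, 70.0, 110.0, 160.0, 220.0])
number = 3

# One start position per window: [0, 1, 2, 3]
starts = np.arange(len(arr) - number)

# Row k holds starts + k, so each COLUMN is one window of indices.
idx = np.linspace(starts, starts + number, number, endpoint=False, dtype=int)
print(idx.shape)  # (3, 4): (number, number of windows)

# Fancy indexing keeps that layout, so reduce over axis 0.
windows = arr[idx]
rolling_std = windows.std(axis=0)

# Reference result from the original loop.
loop_std = np.array([arr[i:i + number].std() for i in range(len(arr) - number)])
```

Transposing idx would give one window per row instead, in which case std(axis=1) is the matching call.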

A rolling-window view built with as_strided will probably be faster, but may be harder to understand.

In [67]: timeit std= np.array([arr[i:i+number].std() for i in range(0, len(arr)-
    ...: number)])
1.05 ms ± 7.01 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [68]: timeit np.array([arr[i:i+number] for i in range(0, len(arr)-number)]).s
    ...: td(axis=1)
74.7 µs ± 108 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [69]: %%timeit
    ...: idx = np.arange(0,len(arr)-number)
    ...: idx = np.linspace(idx,idx+number,number, endpoint=False,dtype=int)
    ...: arr[idx].std(axis=0)
117 µs ± 240 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [73]: timeit np.std(rolling_window(arr, 5), 1)
74.5 µs ± 625 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
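
The rolling_window helper timed above is not defined anywhere in this answer. One possible definition, assuming it is the usual as_strided recipe for a 1d array, might look like the sketch below; the function body is my assumption, not the code that was actually timed.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def rolling_window(a, window):
    """Hypothetical helper: 2d read-only view of all length-`window` slices of 1d `a`."""
    a = np.ascontiguousarray(a)
    n_windows = a.shape[0] - window + 1
    step = a.strides[0]  # bytes between consecutive elements
    # Each row starts one element later than the previous one; no data is copied.
    return as_strided(a, shape=(n_windows, window),
                      strides=(step, step), writeable=False)


arr = np.arange(10, dtype=float)
windows = rolling_window(arr, 5)
print(windows.shape)  # (6, 5)
stds = np.std(windows, 1)
```

Note that this sketch returns len(a) - window + 1 windows, one more than the question's loop, which stops at len(list_) - number; slicing off the last row with [:-1] would match the loop. On NumPy 1.20 or later, numpy.lib.stride_tricks.sliding_window_view(a, window) should give the same view without hand-written strides.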

Broadcasting gives a more direct way to generate the rolling index:

In [81]: %%timeit
    ...: idx = np.arange(len(arr)-number)[:,None]+np.arange(number)
    ...: arr[idx].std(axis=1)
57.9 µs ± 87.5 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
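
For readers unfamiliar with the [:,None] trick, the sketch below spells out the broadcasting step on a small made-up array (the data and the names starts and offsets are illustrative only). A column of start positions plus a row of offsets broadcasts to a 2d table of indices, one window per row.

```python
import numpy as np

# Small illustrative data, not the question's array.
arr = np.array([10.0, 20.0, 40.0, 70.0, 110.0, 160.0])
number = 3

starts = np.arange(len(arr) - number)[:, None]  # shape (3, 1): [[0], [1], [2]]
offsets = np.arange(number)                     # shape (3,):   [0, 1, 2]

# (3, 1) + (3,) broadcasts to (3, 3); row i is [i, i+1, i+2].
idx = starts + offsets

windows = arr[idx]                  # row i equals arr[i:i+number]
rolling_std = windows.std(axis=1)   # one value per window

# Reference result from the original loop.
loop_std = np.array([arr[i:i + number].std() for i in range(len(arr) - number)])
```

Unlike a strided view, arr[idx] copies the data, so memory use grows with the number of windows times the window length.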

Your error comes from using arrays as slice bounds; a slice needs scalar integers:

In [82]: arr[np.array([1,2,3]):np.array([4,5,6])]
Traceback (most recent call last):
  File "", line 1, in 
    arr[np.array([1,2,3]):np.array([4,5,6])]
TypeError: only integer scalar arrays can be converted to a scalar index
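
A slice start:stop takes exactly one integer for each bound, so NumPy cannot turn a multi-element array into a bound. The minimal sketch below reproduces the failure on made-up data and then selects the same three ranges with an index array instead; the names starts, stops and width are illustrative.

```python
import numpy as np

arr = np.arange(10.0)
starts = np.array([1, 2, 3])
stops = np.array([4, 5, 6])

# Arrays as slice bounds fail: each bound must be a single integer.
try:
    arr[starts:stops]
    raised = False
except TypeError:
    raised = True

# Equal-width ranges can be expressed as a 2d integer index instead.
width = 3
idx = starts[:, None] + np.arange(width)  # row i is [starts[i], ..., starts[i] + width - 1]
windows = arr[idx]                        # rows: arr[1:4], arr[2:5], arr[3:6]
```

This only works when every range has the same width; ragged ranges would still need a loop or a list of separate slices.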
