Reputation: 687
Is there any way to compute the standard deviation the same way as the y_mean and xy_mean functions below? I don't want to use a for loop or a function that uses a lot of RAM. I am trying to use np.convolve() to calculate the standard deviation, std.
Variables:
number = 5
PC_list= np.array([457.334015,424.440002,394.795990,408.903992,398.821014,402.152008,435.790985,423.204987,411.574005,
404.424988,399.519989,377.181000,375.467010,386.944000,383.614990,375.071991,359.511993,328.865997,
320.510010,330.079010,336.187012,352.940002,365.026001,361.562012,362.299011,378.549011,390.414001,
400.869995,394.773010,382.556000])
Vanilla Python expressions (for the window starting at index i):
y_mean = sum(PC_list[i:i+number])/number
xy_mean = sum([x * (i + 1) for i, x in enumerate(PC_list[i:i+number])])/number
std = (sum([(k - y_mean)**2 for k in PC_list[i:i+number]])/(number-1))**0.5
Numpy versions:
y_mean = (np.convolve(PC_list, np.ones(shape=(number)), mode='valid')/number)[:-1]
xy_mean = (np.convolve(PC_list, np.arange(number, 0, -1), mode='valid'))[:-1]
std = ?
Upvotes: 0
Views: 48
Reputation: 15872
You can use np.lib.stride_tricks.as_strided and np.std with ddof=1:
>>> np.std(
...     np.lib.stride_tricks.as_strided(
...         PC_list,
...         shape=(PC_list.shape[0] - number + 1, number),
...         strides=PC_list.strides*2
...     ),
...     axis=-1,
...     ddof=1
... )
array([25.35313557, 11.6209317 , 16.32415133, 15.46019574, 15.29513506,
14.02947067, 14.68620846, 17.04664993, 16.38348865, 12.9925946 ,
9.58525968, 5.32623099, 10.61466493, 23.71209646, 27.85489139,
23.31091745, 14.78211757, 12.11214834, 17.90301391, 15.42895731,
11.7602241 , 9.27171536, 12.57714149, 17.25865608, 15.2717403 ,
9.02825105])
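Note that as_strided does no bounds checking, so a wrong shape or strides value can read memory outside the array. A minimal sketch of the same windowing with np.lib.stride_tricks.sliding_window_view, assuming NumPy 1.20 or newer is available (the shortened PC_list here is just the first eight values from the question, used for illustration):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

number = 5
# First eight values of PC_list from the question (shortened for illustration).
PC_list = np.array([457.334015, 424.440002, 394.795990, 408.903992,
                    398.821014, 402.152008, 435.790985, 423.204987])

# Read-only view of shape (len(PC_list) - number + 1, number); no data is copied.
windows = sliding_window_view(PC_list, number)

# Sample standard deviation (divisor number - 1) of each window.
rolling_std = windows.std(axis=-1, ddof=1)
print(rolling_std)
```

The view itself costs no extra memory, although np.std still allocates temporaries of the windowed shape while it computes.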
Otherwise you can use pandas.Series.rolling.std, pandas.Series.dropna, and then pandas.Series.to_numpy:
>>> pd.Series(PC_list).rolling(number).std().dropna().to_numpy()
array([25.35313557, 11.6209317 , 16.32415133, 15.46019574, 15.29513506,
14.02947067, 14.68620846, 17.04664993, 16.38348865, 12.9925946 ,
9.58525968, 5.32623099, 10.61466493, 23.71209646, 27.85489139,
23.31091745, 14.78211757, 12.11214834, 17.90301391, 15.42895731,
11.7602241 , 9.27171536, 12.57714149, 17.25865608, 15.2717403 ,
9.02825105])
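Since the question asks specifically about np.convolve, here is a rough sketch of a convolution-only version. It is not part of the original answer; it relies on the identity sum((x - mean)^2) = sum(x^2) - (sum(x))^2 / n, and the names win_sum and win_sq_sum are made up for illustration. Subtracting the global mean first is an added precaution against floating-point cancellation (the variance does not change under a constant shift).

```python
import numpy as np

number = 5
# First eight values of PC_list from the question (shortened for illustration).
PC_list = np.array([457.334015, 424.440002, 394.795990, 408.903992,
                    398.821014, 402.152008, 435.790985, 423.204987])

ones = np.ones(number)

# Shift by the global mean to reduce cancellation error in the subtraction below.
shifted = PC_list - PC_list.mean()

# Rolling sum of x and of x**2 over each window of length `number`.
win_sum = np.convolve(shifted, ones, mode='valid')
win_sq_sum = np.convolve(shifted ** 2, ones, mode='valid')

# Sample variance: (sum(x^2) - (sum(x))^2 / n) / (n - 1).
var = (win_sq_sum - win_sum ** 2 / number) / (number - 1)

# Clip tiny negative values caused by rounding before taking the root.
std = np.sqrt(np.maximum(var, 0.0))
print(std)
```

This only allocates a few arrays of roughly the input length. The y_mean and xy_mean lines in the question drop the last window with [:-1]; apply the same slice to std if the arrays need to line up.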
EXPLANATION:
np.lib.stride_tricks.as_strided is used to reshape the array in a special way that resembles a rolling window:
>>> np.lib.stride_tricks.as_strided(
...     PC_list,
...     shape=(PC_list.shape[0] - number + 1, number),
...     strides=PC_list.strides*2
... )
array([[457.334015, 424.440002, 394.79599 , 408.903992, 398.821014], #index: 0,1,2,3,4
[424.440002, 394.79599 , 408.903992, 398.821014, 402.152008], #index: 1,2,3,4,5
[394.79599 , 408.903992, 398.821014, 402.152008, 435.790985], #index: 2,3,4,5,6
[408.903992, 398.821014, 402.152008, 435.790985, 423.204987], # ... and so on
[398.821014, 402.152008, 435.790985, 423.204987, 411.574005],
[402.152008, 435.790985, 423.204987, 411.574005, 404.424988],
[435.790985, 423.204987, 411.574005, 404.424988, 399.519989],
[423.204987, 411.574005, 404.424988, 399.519989, 377.181 ],
[411.574005, 404.424988, 399.519989, 377.181 , 375.46701 ],
[404.424988, 399.519989, 377.181 , 375.46701 , 386.944 ],
[399.519989, 377.181 , 375.46701 , 386.944 , 383.61499 ],
[377.181 , 375.46701 , 386.944 , 383.61499 , 375.071991],
[375.46701 , 386.944 , 383.61499 , 375.071991, 359.511993],
[386.944 , 383.61499 , 375.071991, 359.511993, 328.865997],
[383.61499 , 375.071991, 359.511993, 328.865997, 320.51001 ],
[375.071991, 359.511993, 328.865997, 320.51001 , 330.07901 ],
[359.511993, 328.865997, 320.51001 , 330.07901 , 336.187012],
[328.865997, 320.51001 , 330.07901 , 336.187012, 352.940002],
[320.51001 , 330.07901 , 336.187012, 352.940002, 365.026001],
[330.07901 , 336.187012, 352.940002, 365.026001, 361.562012],
[336.187012, 352.940002, 365.026001, 361.562012, 362.299011],
[352.940002, 365.026001, 361.562012, 362.299011, 378.549011],
[365.026001, 361.562012, 362.299011, 378.549011, 390.414001],
[361.562012, 362.299011, 378.549011, 390.414001, 400.869995],
[362.299011, 378.549011, 390.414001, 400.869995, 394.77301 ],
[378.549011, 390.414001, 400.869995, 394.77301 , 382.556 ]])
Now we take the std of the above array across the last axis to obtain the rolling std. By default, numpy uses ddof=0 (Delta Degrees of Freedom = 0), which means that for number samples the divisor is number - 0. Since you want number - 1, you need ddof=1.
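To make the effect of ddof concrete, here is a small check on the first window only. The variable names pop_std, sample_std and manual are chosen for illustration; manual mirrors the vanilla Python formula from the question.

```python
import numpy as np

# First window of PC_list (indices 0 to 4) from the question.
window = np.array([457.334015, 424.440002, 394.795990, 408.903992, 398.821014])
n = window.size

pop_std = np.std(window)             # default ddof=0, divisor is n
sample_std = np.std(window, ddof=1)  # divisor is n - 1

# Same formula as the vanilla Python std in the question.
y_mean = sum(window) / n
manual = (sum((k - y_mean) ** 2 for k in window) / (n - 1)) ** 0.5

print(pop_std, sample_std, manual)
```

The two NumPy results differ only by the factor sqrt(n / (n - 1)), and the ddof=1 value agrees with the hand-written formula.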
Upvotes: 1