Reputation: 11
I am using statsmodels.tsa.stattools.acf to calculate the autocorrelation of a time series. One of its parameters is called missing and can take the values ‘none’, ‘raise’, ‘conservative’, and ‘drop’, which change how the function handles NaN values. The problem is that I can't find any documentation on exactly how each of these values changes the computation.
I am working with an evenly spaced time series that has a scattering of missing values and a large gap of missing measurements in the middle. My solution thus far has been to subtract the median from the time series to center it around zero and then replace all the missing values with 0. Does one of these parameter values do something similar, and should I be handling things differently?
Upvotes: 1
Views: 1347
Reputation: 1
It is detailed in the acf function docstring:
missing: str, default “none” A string in [“none”, “raise”, “conservative”, “drop”] specifying how the NaNs are to be treated. “none” performs no checks. “raise” raises an exception if NaN values are found. “drop” removes the missing observations and then estimates the autocovariances treating the non-missing as contiguous. “conservative” computes the autocovariance using nan-ops so that nans are removed when computing the mean and cross-products that are used to estimate the autocovariance. When using “conservative”, n is set to the number of non-missing observations.
Upvotes: 0
Reputation: 31
'none' does nothing to the NaNs; 'raise' raises an error if the data contains NaNs:
MissingDataError("NaNs were encountered in the data")
'drop' removes the NaNs from the data before the calculation proceeds and, as far as I understand the source code (https://www.statsmodels.org/stable/_modules/statsmodels/tsa/stattools.html), the option 'conservative' replaces all NaNs with 0:
    missing = missing.lower()
    if missing not in ['none', 'raise', 'conservative', 'drop']:
        raise ValueError("missing option %s not understood" % missing)
    if missing == 'none':
        deal_with_masked = False
    else:
        deal_with_masked = has_missing(x)
    if deal_with_masked:
        if missing == 'raise':
            raise MissingDataError("NaNs were encountered in the data")
        notmask_bool = ~np.isnan(x)  # bool
        if missing == 'conservative':
            # Must copy for thread safety
            x = x.copy()
            x[~notmask_bool] = 0
        else:  # 'drop'
            x = x[notmask_bool]  # copies non-missing
        notmask_int = notmask_bool.astype(int)  # int
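To relate this to the approach in the question, here is a rough NumPy-only sketch of the zero-fill idea. It is not the statsmodels implementation: acf_zero_filled is a hypothetical helper, and centering on the mean of the observed values is my assumption about what happens after the excerpt above, based on the docstring saying NaNs are removed when computing the mean.

```python
import numpy as np


def acf_zero_filled(x, nlags):
    """Hypothetical helper: center on the mean of the observed values,
    set missing entries to 0, and return the normalized autocorrelation."""
    x = np.asarray(x, dtype=float)
    observed = ~np.isnan(x)
    # Center using only the non-missing values, then zero-fill the gaps.
    centered = np.where(observed, x - x[observed].mean(), 0.0)
    n = len(centered)
    # Lagged cross-products; any pair involving a missing value contributes 0.
    acov = np.array(
        [np.dot(centered[: n - k], centered[k:]) for k in range(nlags + 1)]
    )
    # Normalize by the lag-0 term so that r[0] == 1.
    return acov / acov[0]


# Illustrative data: scattered NaNs plus a gap in the middle.
x = np.array([1.0, 2.0, np.nan, 4.0, 3.0, np.nan, np.nan, 5.0, 4.0, 6.0])
r = acf_zero_filled(x, nlags=3)
print(r)
```

If that assumption holds, subtracting a center and inserting zeros is close to what 'conservative' does. The main difference from the question's approach would be the use of the mean instead of the median, which matters because acf demeans the zero-filled series again and the zeros then no longer stay at zero.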
Upvotes: 1