code_error
code_error

Reputation: 83

How to calculate p-values from xarray.polyfit covariance matrix?

I am trying to compute p-values for the slope obtained using xarray.polyfit() by extracting the covariance matrix and using a t-test. However, I am getting 1s for all p-values, which seems incorrect. The same datasets give valid p-values when using scipy.stats.linregress() for specific points but slow to apply to the whole dataset.

import numpy as np
import xarray as xr
from scipy import stats

# Generate synthetic data
np.random.seed(42)
lat, lon = np.arange(-90, 91, 10), np.arange(-180, 181, 20)  # Coarse grid
time = np.arange(2000, 2021)  # Yearly data from 2000 to 2020

# Create a random dataset with a slight trend
trend = 0.05  # Define a small increasing trend
noise_level = 0.1

data_values = trend * (time - 2000)[
    None, None, :
] + noise_level * np.random.randn(  # Linear trend component
    len(lat), len(lon), len(time)
)  # Random noise

# Create an xarray DataArray
data = xr.DataArray(
    data_values,
    coords={"lat": lat, "lon": lon, "time": time},
    dims=["lat", "lon", "time"],
    name="t",
)


ds = xr.Dataset({"t": data})
ds


def compute_trends(data: xr.Dataset):
    trends = xr.Dataset()

    # Sort by time to ensure correct order
    data = data.sortby("time")
    n = data.dims["time"]  # Number of time points

    ds = data.polyfit(dim="time", deg=1, cov=True)

    slope = ds["t_polyfit_coefficients"].sel(degree=1)
    slope_variance = ds["t_polyfit_covariance"].sel(
        cov_i=1, cov_j=1
    )  # Variance of slope , i have tried with 0,0 but although the results are close to expected , they are not accurate so , i must be doing something wrong here.

    stderr = np.sqrt(slope_variance)
    t_stat = slope / stderr

    p_values = 2 * (1 - stats.t.cdf(np.abs(t_stat), df=n - 2))

    trends["slope"] = slope
    trends["p_value"] = xr.DataArray(p_values, coords=slope.coords, dims=slope.dims)

    return trends


# Compute trends
trends = compute_trends(ds)


print(trends.p_value)

I suspect an issue with how I’m extracting the covariance matrix (polyfit_covariance) or computing stderr. Am I correctly extracting the slope variance from polyfit_covariance?

Upvotes: -1

Views: 52

Answers (0)

Related Questions