R marginaleffects package: Estimate 95% CI for predicted minus counterfactual absolute/relative changes with an interrupted time series

Question

I am taking an online edX course on interrupted time series analysis that makes use of R and part of the course shows us how to derive predicted values from the gls model as well as get the absolute and relative change of the predicted vs the counterfactual.

Unfortunately, there is no example of how to get 95% confidence intervals around these predicted changes. On the course discussion board, the instructor linked to this article (Zhang et al, 2009.) where the authors provide SAS code, linked at the end of the 'Methods' section, to get these CIs, but the instructor does not have code that implements this in R.

Someone suggested I could use the R marginaleffects package to do this. I am trying to apply the materials from this course to a work project, but this one piece of getting CIs around the pred, cfac, and absolute/relative changes (pred - cfac) is stumping me. Just since I can't share my actual data from work, here's the example data the course gives us (csv), "flow" is the outcome, as in water flow of a river being studied:

year,flow,time,level,trend
1871,3533.916667,1,0,0
1872,3664.25,2,0,0
1873,3047.5,3,0,0
1874,3808.583333,4,0,0
1875,3677.5,5,0,0
1876,3663.416667,6,0,0
1877,2667.583333,7,0,0
1878,3900.083333,8,0,0
1879,4326.666667,9,0,0
1880,3590.583333,10,0,0
1881,3147.25,11,0,0
1882,2954.083333,12,0,0
1883,3498.416667,13,0,0
1884,3142.166667,14,0,0
1885,3213.916667,15,0,0
1886,3035,16,0,0
1887,3711.75,17,0,0
1888,2521.666667,18,0,0
1889,3023.5,19,0,0
1890,3507.5,20,0,0
1891,3493.666667,21,0,0
1892,4268.75,22,0,0
1893,3635.583333,23,0,0
1894,3946.416667,24,0,0
1895,3978.166667,25,0,0
1896,3856,26,0,0
1897,3248.166667,27,0,0
1898,3479.833333,28,1,1
1899,2453,29,1,2
1900,2649.166667,30,1,3
1901,2764.916667,31,1,4
1902,2196.666667,32,1,5
1903,2973.583333,33,1,6
1904,2613.416667,34,1,7
1905,2219.333333,35,1,8
1906,2901,36,1,9
1907,2186.75,37,1,10
1908,3215.333333,38,1,11
1909,3318.25,39,1,12
1910,3063.666667,40,1,13
1911,2605.583333,41,1,14
1912,2239.75,42,1,15
1913,1441.25,43,1,16
1914,2654.166667,44,1,17
1915,2191.75,45,1,18
1916,3570.333333,46,1,19
1917,3498.166667,47,1,20
1918,2564.583333,48,1,21
1919,2434.166667,49,1,22
1920,2594.333333,50,1,23
1921,2415.166667,51,1,24
1922,2683.583333,52,1,25
1923,2733.416667,53,1,26
1924,2729.416667,54,1,27
1925,2138.083333,55,1,28
1926,2665.916667,56,1,29
1927,2306.25,57,1,30
1928,2495.416667,58,1,31
1929,3241.083333,59,1,32
1930,2334.416667,60,1,33

This is what the model looks like, the instructor uses the gls function from the nlme package:

# Fit the GLS regression model
model_p10 <- gls(flow ~ time + level + trend,
    data=data,
    correlation=corARMA(p=10,form=~time),
    method="ML")

summary(model_p10)

Output:

Coefficients:

                 Value         p-value
(Intercept)      3348.033      0.0000
time                6.745      0.2871
level            -738.245      0.0000
trend             -14.210      0.0397

This is what the model is doing graphically:

That predicted changes code is then ran to calculate the predicted value 25 years AFTER the "weather change" i.e. the interruption, indicated by the level variable in the data which flips from 0 to 1 at time=28, thus there are 27 years before this "weather change"/interruption, 27 years BEFORE the weather change + 25 years AFTER the weather change = 52 year.

To clarify, the idea of the counterfactual here is not simply that level=0, but that the post-interruption change in level and trend/slope did not occur (i.e. the level and trend coefficients that encode the change in level and trend after the interruption should not be included in the counterfactual calculation).

# Predicted value at 25 years after the weather change
pred <- fitted(model_p10)[52]
# Then estimate the counterfactual at the same time point
cfac <- model_p10$coef[1] + model_p10$coef[2]*52
# Absolute change at 25 years
pred - cfac
# Relative change at 25 years
(pred - cfac) / cfac

Output:

> # Absolute change at 25 years
> pred - cfac
52
-1093.483

> # Relative change at 25 years
> (pred - cfac) / cfac
52
-0.2956351

Questions:

I see that the marginaleffects package's predictions( ) function will give me the pred values and 95% CIs around them. How would I then also get CIs for the counterfactual (cfac)?
And how would I specify the comparisons( ) function to get CIs for the absolute and relative (i.e. %) change between the predicted vs counterfactual (i.e. pred - cfac)? When I tried it out it kept giving me the same difference no matter the time point I input.

I did the following:

library(marginaleffects) 
comparisons(model_orig_q1, variables = "level",   newdata = datagrid(time = 52))

Also, this is a bit unrelated, but is there a way to get level and trend coefficients in the model, which currently give the absolute differences between the pre vs post interruption level and trend respectively, as a relative percentage with 95% CIs using the marginaleffects package? So instead of saying the trend in water flow changed by -14.2 (-10, -21), I could instead say it decreased by 10% (5%, 15%) or whatever it'd be.

R marginaleffects package: Estimate 95% CI for predicted minus counterfactual absolute/relative changes with an interrupted time series

Answers (1)

Related Questions