Deo Tuyisingize

Reputation: 1

How can I correlate rainfall in month t with food availability in month t+1 or t+2?

I am trying to run a time series analysis in R to correlate rainfall in month t with food availability in month t+1 or t+2. Could anyone help with that?

I tried autocorrelation and autoregression, but I would like to correlate rainfall and food availability using the following data set (just an example).

data:

structure(list(Month = c("Jan-17", "Feb-17", "Mar-17", "Apr-17", 
"May-17", "Jun-17", "Jul-17", "Aug-17", "Sep-17", "Oct-17", "Nov-17", 
"Dec-17"), Rain = c(43, 78, 144.9, 124.7, 86.8, 0, 25.1, 48.9, 
125.4, 185.4, 185.5, 62.2), fruits = c(NA, NA, 14.02439024, 28.65853659, 
32.31707317, 12.60162602, 16.46341463, 21.95121951, 9.146341463, 
5.487804878, 6.097560976, 10.97560976)), class = "data.frame", row.names = c(NA, 
-12L))

Upvotes: 0

Views: 43

Answers (1)

Jan

Reputation: 5254

You are looking for cross-correlations: ccf(x, y) (see ?ccf in the manual for details).

The following example excludes the first two values because ccf() does not accept NA by default. It also sets lag.max = 1, which limits the lags to -1:1; the lag of interest here is -1. The lag k value returned by ccf(x, y) estimates the correlation between x[t+k] and y[t], so with x = Rain and y = fruits, lag -1 pairs rainfall in one month with fruits in the following month.

Df <- structure(list(Month = c("Jan-17", "Feb-17", "Mar-17", "Apr-17",
                               "May-17", "Jun-17", "Jul-17", "Aug-17",
                               "Sep-17", "Oct-17", "Nov-17", "Dec-17"),
                     Rain = c(43, 78, 144.9, 124.7, 86.8, 0, 25.1, 48.9,
                              125.4, 185.4, 185.5, 62.2),
                     fruits = c(NA, NA, 14.02439024, 28.65853659,
                                32.31707317, 12.60162602, 16.46341463,
                                21.95121951, 9.146341463, 5.487804878,
                                6.097560976, 10.97560976)),
                class = "data.frame", row.names = c(NA, -12L))
Result <- ccf(Df$Rain[-(1:2)], Df$fruits[-(1:2)], lag.max = 1, plot = FALSE)

These are your cross-correlations. You could also set plot = TRUE to get a plot of the three values.

Result
#> 
#> Autocorrelations of series 'X', by lag
#> 
#>     -1      0      1 
#> -0.123 -0.322 -0.483
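
Since the question also asks about a two-month lag, here is a minimal sketch that raises lag.max to 2 and picks out the negative lags by value. The names Result2, lags, vals and lead_rain are made up for this illustration, and the data frame is rebuilt without the Month column so the snippet runs on its own.

```r
Df <- data.frame(
  Rain = c(43, 78, 144.9, 124.7, 86.8, 0, 25.1, 48.9,
           125.4, 185.4, 185.5, 62.2),
  fruits = c(NA, NA, 14.02439024, 28.65853659, 32.31707317, 12.60162602,
             16.46341463, 21.95121951, 9.146341463, 5.487804878,
             6.097560976, 10.97560976)
)

# Drop the two leading NA months, as before, and allow lags -2:2
Result2 <- ccf(Df$Rain[-(1:2)], Df$fruits[-(1:2)], lag.max = 2, plot = FALSE)

# Result2$lag and Result2$acf are arrays of matching shape; flatten them
lags <- as.vector(Result2$lag)
vals <- as.vector(Result2$acf)

# Negative lags: rain in month t paired with fruits in month t + 1 and t + 2
keep <- lags %in% c(-1, -2)
lead_rain <- setNames(vals[keep], lags[keep])
lead_rain
```

With only ten complete months, any of these estimates rests on very few pairs, so treat the values as descriptive rather than conclusive.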

Of course, you can also compute this yourself. All you have to do is subset the two vectors so that the corresponding positions are offset by one month. This way you can cross-check the ccf() result:

cor(Df$Rain[3:11], Df$fruits[4:12])
#> [1] -0.1276439

And this way you are also able to use all available values:

cor(Df$Rain[2:11], Df$fruits[3:12])
#> [1] -0.1181127
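
The same subsetting idea can be wrapped in a small function so the lag becomes a parameter. This is only a sketch: lagged_cor is a hypothetical helper name, not something from base R, and it relies on use = "complete.obs" to drop pairs that contain an NA.

```r
Df <- data.frame(
  Rain = c(43, 78, 144.9, 124.7, 86.8, 0, 25.1, 48.9,
           125.4, 185.4, 185.5, 62.2),
  fruits = c(NA, NA, 14.02439024, 28.65853659, 32.31707317, 12.60162602,
             16.46341463, 21.95121951, 9.146341463, 5.487804878,
             6.097560976, 10.97560976)
)

# Hypothetical helper: correlation between x in month t and y in month t + k
lagged_cor <- function(x, y, k) {
  stopifnot(length(x) == length(y), k >= 1, k < length(x))
  n <- length(x)
  # x[1:(n - k)] lines up with y[(k + 1):n]; incomplete pairs are dropped
  cor(x[1:(n - k)], y[(k + 1):n], use = "complete.obs")
}

lagged_cor(Df$Rain, Df$fruits, 1)  # same pairs as Rain[2:11] vs fruits[3:12]
lagged_cor(Df$Rain, Df$fruits, 2)  # Rain[1:10] vs fruits[3:12]
```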

Please note that the ccf() and cor() results differ slightly. The cross-correlation is computed differently: its estimator assumes stationarity, so it uses the mean and variance of each full series and divides by the full series length, whereas cor() recomputes everything from the overlapping pairs only. The difference becomes more pronounced at lags larger than 1.
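
To make that difference concrete, here is a sketch that rebuilds the lag -1 value by hand, assuming the usual sample cross-correlation estimator (full-series means, sums divided by n). The variable names x, y, cov_k, sd0 and manual are illustrative only.

```r
Rain <- c(43, 78, 144.9, 124.7, 86.8, 0, 25.1, 48.9,
          125.4, 185.4, 185.5, 62.2)
fruits <- c(NA, NA, 14.02439024, 28.65853659, 32.31707317, 12.60162602,
            16.46341463, 21.95121951, 9.146341463, 5.487804878,
            6.097560976, 10.97560976)

# Same input as the ccf() call: drop the two leading NA months
x <- Rain[-(1:2)]
y <- fruits[-(1:2)]
n <- length(x)
k <- 1

# Cross-covariance for lag -1: x[t] paired with y[t + 1],
# centred on the FULL-series means and divided by n (not n - k)
cov_k <- sum((x[1:(n - k)] - mean(x)) * (y[(1 + k):n] - mean(y))) / n

# Normalise by the lag-0 standard deviations, also computed with divisor n
sd0 <- sqrt(sum((x - mean(x))^2) / n) * sqrt(sum((y - mean(y))^2) / n)

manual <- cov_k / sd0
round(manual, 3)  # agrees with the lag -1 entry printed by ccf(): -0.123
```

cor(), by contrast, centres and scales using only the nine overlapping pairs, which is why it returns -0.1276439 for the same months.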

Created on 2021-03-14 by the reprex package (v1.0.0)

Upvotes: 1
