Sharif Amlani
Sharif Amlani

Reputation: 1288

Lagging a variable by adding up the previous 5 years?

I am working with data that look like this:

Country Year       Aid
Angola 1995 416420000
Angola 1996 459310000
Angola 1997 354660000
Angola 1998 335270000
Angola 1999 387540000
Angola 2000 302210000

I want to create a lagged variable by adding up the previous five years in the data

So that the observation for 2000 looks like this:

Country Year       Aid  Lagged5
Angola 2000  416420000  1953200000

Which was derived by adding the Aid observations from 1995 to 1999 together:

416420000 + 459310000 + 354660000 + 335270000 + 387540000 = 1953200000

Also, I will need to group by country as well.

Thank You!

Upvotes: 0

Views: 50

Answers (2)

G. Grothendieck
G. Grothendieck

Reputation: 269734

Using the input DF shown reproducibly in the Note at the end define a roll function which sums the prior 5 rows and use ave to run it for each Country. The width argument list(-seq(5)) to rollapplyr means use offsets -1, -2, -3, -4, -5 in summing, i.e. the values in the prior 5 rows.

The question did not discuss what to do with the initial rows in each country so we put in NA values but if you want partial sums add the partial = TRUE argument to rollapplyr. You can also change the fill=NA to some other value if you wish so it is quite flexible.

library(zoo)

roll <- function(x) rollapplyr(x, list(-seq(5)), sum, fill = NA)
transform(DF, Lag5 = ave(Aid, Country, FUN = roll))

Note

The input was assumed to be the following. We added a second country.

Lines <- "Country Year       Aid
Angola 1995 416420000
Angola 1996 459310000
Angola 1997 354660000
Angola 1998 335270000
Angola 1999 387540000
Angola 2000 302210000"
DF <- read.table(text = Lines, header = TRUE, strip.white = TRUE, 
  colClasses = c("character", "integer", "numeric"))
DF <- rbind(DF, transform(DF, Country = "Belize"))

Upvotes: 0

arg0naut91
arg0naut91

Reputation: 14764

You could do:

library(dplyr)

df %>%
  group_by(Country) %>%
  mutate(Lagged5 = sapply(Year, function(x) sum(Aid[between(Year, x - 5, x - 1)])))

Output:

# A tibble: 6 x 4
# Groups:   Country [1]
  Country  Year       Aid    Lagged5
  <chr>   <int>     <int>      <int>
1 Angola   1995 416420000          0
2 Angola   1996 459310000  416420000
3 Angola   1997 354660000  875730000
4 Angola   1998 335270000 1230390000
5 Angola   1999 387540000 1565660000
6 Angola   2000 302210000 1953200000

Upvotes: 2

Related Questions