Reputation: 141
Let's say I have panel data arranged as follows:
| ID | Year | Var1 |
|----|------|------|
| 1 | 2010 | 0 |
| 1 | 2012 | 1 |
| 2 | 2010 | 3 |
| 2 | 2012 | 2 |
| 3 | 2010 | 1 |
| 3 | 2012 | 3 |
Or, in R:
ID <- c(1, 1, 2, 2, 3, 3)
Year <- c(2010, 2012, 2010, 2012, 2010, 2012)
Var1 <- c(0, 1, 3, 2, 1, 3)
df <- data.frame(ID, Year, Var1)
I now want to create Var2, which is 1 when Var1 in the second time period within each panel is larger than Var1 in the previous time period, and 0 otherwise.
In the example table, whenever Var1 in 2012 is greater than Var1 in 2010 for a given ID, the 2012 row gets a 1 in the new Var2; every other row gets a 0.
It would look like this:
| ID | Year | Var1 | Var2 |
|----|------|------|------|
| 1 | 2010 | 0 | 0 |
| 1 | 2012 | 1 | 1 |
| 2 | 2010 | 3 | 0 |
| 2 | 2012 | 2 | 0 |
| 3 | 2010 | 1 | 0 |
| 3 | 2012 | 3 | 1 |
What would the R code look like to create Var2? I imagine there is an easy tidyverse method.
Upvotes: 1
Views: 37
Reputation: 50738
You can use diff.
In the tidyverse with group_by:
library(tidyverse)
df %>%
    group_by(ID) %>%
    mutate(Var2 = c(0, +(diff(Var1) > 0)))
## A tibble: 6 x 4
## Groups: ID [3]
# ID Year Var1 Var2
# <dbl> <dbl> <dbl> <dbl>
#1 1. 2010. 0. 0.
#2 1. 2012. 1. 1.
#3 2. 2010. 3. 0.
#4 2. 2012. 2. 0.
#5 3. 2010. 1. 0.
#6 3. 2012. 3. 1.
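To see why the leading 0 is needed, here is a step-by-step sketch of the same idea on a single panel (the values for ID 3). The intermediate names x, d, flag and var2 are only for illustration and do not appear in the answer's code.

```r
# diff() returns consecutive differences, so its result is one element
# shorter than its input.
x <- c(1, 3)          # Var1 for ID 3 in 2010 and 2012
d <- diff(x)          # 3 - 1 = 2, a single value
flag <- +(d > 0)      # unary plus turns TRUE/FALSE into 1/0
var2 <- c(0, flag)    # pad the first period with 0 to restore the length
var2
# [1] 0 1
```

Because the padded vector has the same length as the group, it can be assigned directly inside mutate().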
Or in base R using ave:
transform(df, Var2 = ave(Var1, ID, FUN = function(x) c(0, +(diff(x) > 0))))
# ID Year Var1 Var2
#1 1 2010 0 0
#2 1 2012 1 1
#3 2 2010 3 0
#4 2 2012 2 0
#5 3 2010 1 0
#6 3 2012 3 1
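Both versions above rely on the rows already being sorted by Year within each ID. As a hedged alternative, not part of the original answer, the sketch below uses dplyr's lag() and sorts explicitly first. The name result is hypothetical, and the sketch assumes the first period of each panel should always get 0.

```r
library(dplyr)

df <- data.frame(
  ID   = c(1, 1, 2, 2, 3, 3),
  Year = c(2010, 2012, 2010, 2012, 2010, 2012),
  Var1 = c(0, 1, 3, 2, 1, 3)
)

result <- df %>%
  arrange(ID, Year) %>%   # make sure periods are in order within each panel
  group_by(ID) %>%
  # Compare each value with the previous one in the same panel; the first
  # row is compared with itself, so it always yields 0.
  mutate(Var2 = as.integer(Var1 > lag(Var1, default = first(Var1)))) %>%
  ungroup()

result$Var2
# [1] 0 1 0 0 0 1
```

This form also extends to panels with more than two periods, since every row is compared with the row immediately before it.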
Upvotes: 1