firebird17139

Reputation: 141

R conditional value assignment within panel

Let's say I have panel data arranged as follows:

| ID | Year | Var1 |
|----|------|------|
| 1  | 2010 |  0   |
| 1  | 2012 |  1   |
| 2  | 2010 |  3   |
| 2  | 2012 |  2   |
| 3  | 2010 |  1   |
| 3  | 2012 |  3   |

Or, in R:

ID <- c(1, 1, 2, 2, 3, 3)
Year <- c(2010, 2012, 2010, 2012, 2010, 2012)
Var1 <- c(0, 1, 3, 2, 1, 3)
df <- data.frame(ID, Year, Var1)

I now want to create Var2, which is 1 when Var1 in the second time period within a panel is larger than Var1 in the previous time period, and 0 otherwise.

In the example table, Var2 would be 1 in the 2012 row of each ID whose Var1 is greater in 2012 than in 2010, and 0 in every other row.

It would look like this:

| ID | Year | Var1 | Var2 |
|----|------|------|------|
| 1  | 2010 |  0   |  0   |
| 1  | 2012 |  1   |  1   |
| 2  | 2010 |  3   |  0   |
| 2  | 2012 |  2   |  0   |
| 3  | 2010 |  1   |  0   |
| 3  | 2012 |  3   |  1   |

What would the R code look like to create Var2? I imagine there is an easy tidyverse method.

Upvotes: 1

Views: 37

Answers (1)

Maurits Evers

Reputation: 50738

You can use diff:

In the tidyverse with group_by

library(tidyverse)
df %>%
    group_by(ID) %>%
    mutate(Var2 = c(0, +(diff(Var1) > 0)))
# A tibble: 6 x 4
# Groups:   ID [3]
#     ID  Year  Var1  Var2
#  <dbl> <dbl> <dbl> <dbl>
#1    1. 2010.    0.    0.
#2    1. 2012.    1.    1.
#3    2. 2010.    3.    0.
#4    2. 2012.    2.    0.
#5    3. 2010.    1.    0.
#6    3. 2012.    3.    1.
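
As a possible variant, the same comparison can be written with dplyr's lag instead of diff. This is only a sketch, not part of the original answer: it assumes dplyr is installed, sorts the rows by ID and Year first (the diff approach silently relies on that order), and uses the name `res` purely for illustration.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  ID   = c(1, 1, 2, 2, 3, 3),
  Year = c(2010, 2012, 2010, 2012, 2010, 2012),
  Var1 = c(0, 1, 3, 2, 1, 3)
)

res <- df %>%
  arrange(ID, Year) %>%   # make sure periods are in order within each panel
  group_by(ID) %>%
  # Compare each value with the previous one in the same panel.
  # For the first period, the default makes the comparison Var1 > Var1,
  # which is FALSE, so the first row of every panel gets 0.
  mutate(Var2 = as.integer(Var1 > lag(Var1, default = first(Var1)))) %>%
  ungroup()
```

Unlike the diff version, this needs no manual padding with a leading 0, which may be easier to read when panels have more than two periods.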

Or in base R using ave

transform(df, Var2 = ave(Var1, ID, FUN = function(x) c(0, +(diff(x) > 0))))
#  ID Year Var1 Var2
#1  1 2010    0    0
#2  1 2012    1    1
#3  2 2010    3    0
#4  2 2012    2    0
#5  3 2010    1    0
#6  3 2012    3    1
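
To see why the diff approach works, it may help to break it into steps for a single panel. The sketch below uses base R only; the vector `x` is a made-up example standing in for one ID's Var1 values, and the explicit sort is an added assumption that rows should be ordered by Year within each ID before differencing.

```r
# Hypothetical single panel: Var1 values for one ID, ordered by Year
x <- c(0, 1)

step_diff <- diff(x)            # change from each period to the next (length n - 1)
step_up   <- step_diff > 0      # TRUE where the value increased
step_var2 <- c(0, +step_up)     # unary + turns TRUE/FALSE into 1/0; leading 0 pads the first period

# Applying the same logic per ID on the example data from the question
df <- data.frame(
  ID   = c(1, 1, 2, 2, 3, 3),
  Year = c(2010, 2012, 2010, 2012, 2010, 2012),
  Var1 = c(0, 1, 3, 2, 1, 3)
)
df <- df[order(df$ID, df$Year), ]   # diff depends on row order within each panel
df$Var2 <- ave(df$Var1, df$ID, FUN = function(x) c(0, +(diff(x) > 0)))
```

Because diff returns one fewer element than its input, the leading 0 is what keeps the result the same length as each group.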

Upvotes: 1
