Reputation: 55
I have panel data consisting of two waves, 18 and 21, and an employment status variable with 4 values.
I want to create a dummy that takes the value 1 if the person is employed in both waves and 0 otherwise. However, my code produces a dummy that contains only zeros:
df$dummy <- df %>%
group_by(NEW_id) %>%
arrange(New_id, WAVE_NO) %>%
mutate(dummy = case_when(WAVE_NO==18 & WAVE_NO==21 & EMPLOYMENT_STATUS=="Employed" ~ 1, TRUE ~ 0))
Upvotes: 0
Views: 438
Reputation: 1456
We may use split() to split the data frame by id. As split() returns a list, we can use lapply() to perform some operation on each element of that list (here: creating the dummy variable). The output of lapply() will be a list as well. However, we want a data.frame, so we call do.call(), which passes all elements of the list as arguments to a single function call (here: rbind) and thereby stacks the pieces back into one data frame.
set.seed(1)
n <- 10L  # number of persons
K <- 2L   # number of waves per person
df <- data.frame(
  id = rep(1L:n, each = K),
  wave = rep(c(18L, 21L), n),
  employment = sample(c('Employed', 'Unemployed'), n * K, replace = TRUE)
)
# add dummy to data frame
df <- do.call(rbind, lapply(split(df, df$id), function(x) {
  # row-level indicator: 1 if employed in that wave
  x$dummy <- ifelse(x$employment %in% 'Employed', 1L, 0L)
  # person-level indicator: 1 only if employed in both waves
  x$dummy <- ifelse(sum(x$dummy) == 2L, 1L, 0L)
  return(x)
}))
rownames(df) <- NULL
Output
> head(df)
id wave employment dummy
1 1 18 Employed 0
2 1 21 Unemployed 0
3 2 18 Employed 1
4 2 21 Employed 1
5 3 18 Unemployed 0
6 3 21 Employed 0
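As a possible alternative, the same per-id logic can be sketched without the split/rbind round trip by using base R's ave(), which applies a function within each group and returns a vector aligned with the original rows. This is only a sketch: the toy data below is made up for illustration (it mirrors the first six rows shown above), and it assumes exactly one row per person per wave.

```r
# Hypothetical toy data with the same columns as the example above
df <- data.frame(
  id = c(1L, 1L, 2L, 2L, 3L, 3L),
  wave = c(18L, 21L, 18L, 21L, 18L, 21L),
  employment = c("Employed", "Unemployed", "Employed", "Employed",
                 "Unemployed", "Employed")
)

# Row-level indicator: 1 if employed in that wave, 0 otherwise
employed <- as.integer(df$employment == "Employed")

# Within each id, check whether the person is employed in both rows;
# ave() recycles the single result over all rows of that id
df$dummy <- ave(employed, df$id, FUN = function(e) as.integer(sum(e) == 2L))

df$dummy
# 0 0 1 1 0 0
```

Because ave() keeps the original row order, there is no need to reset the row names afterwards.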
Upvotes: 1
Reputation: 8880
df <- data.frame(
  stringsAsFactors = FALSE,
  id = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L),
  # note: id 4 has wave 10 instead of 21
  wave = c(18L, 21L, 18L, 21L, 18L, 21L, 18L, 10L, 18L, 21L),
  EMPLOYMENT_STATUS = c(
    "Employed",
    "Employed",
    "unemployed",
    "Employed",
    "unemployed",
    "Employed",
    "Employed",
    "Employed",
    "unemployed",
    "unemployed"
  )
)
library(tidyverse)
df %>%
  group_by(id) %>%
  # unary + converts the logical result to an integer 0/1
  mutate(dummy = +(all(wave %in% c(18, 21)) &
                     all(EMPLOYMENT_STATUS == "Employed"))) %>%
  ungroup()
#> # A tibble: 10 x 4
#> id wave EMPLOYMENT_STATUS dummy
#> <int> <int> <chr> <int>
#> 1 1 18 Employed 1
#> 2 1 21 Employed 1
#> 3 2 18 unemployed 0
#> 4 2 21 Employed 0
#> 5 3 18 unemployed 0
#> 6 3 21 Employed 0
#> 7 4 18 Employed 0
#> 8 4 10 Employed 0
#> 9 5 18 unemployed 0
#> 10 5 21 unemployed 0
Created on 2022-01-23 by the reprex package (v2.0.1)
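For context, the attempt in the question most likely yields only zeros because WAVE_NO==18 & WAVE_NO==21 is evaluated row by row, and no single row can belong to both waves at once. The question's code also mixes NEW_id and New_id and assigns the whole pipeline result to df$dummy. A minimal sketch of a corrected version, using the question's column names on made-up toy data and assuming one row per person per wave, could look like this:

```r
library(dplyr)

# Hypothetical toy data using the column names from the question
df <- data.frame(
  NEW_id = c(1L, 1L, 2L, 2L, 3L),
  WAVE_NO = c(18L, 21L, 18L, 21L, 18L),
  EMPLOYMENT_STATUS = c("Employed", "Employed", "Employed",
                        "Unemployed", "Employed")
)

df <- df %>%
  group_by(NEW_id) %>%
  mutate(
    # 1 only if the person has an "Employed" row in wave 18 AND in wave 21
    dummy = as.integer(
      any(WAVE_NO == 18 & EMPLOYMENT_STATUS == "Employed") &&
        any(WAVE_NO == 21 & EMPLOYMENT_STATUS == "Employed")
    )
  ) %>%
  ungroup()

df$dummy
# 1 1 0 0 0
```

Checking each wave with any() means a person observed in only one wave (id 3 here) gets 0, whereas the all() based check above would return 1 for such a person as long as that single row is "Employed".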
Upvotes: 0