Reputation: 189
I want to lag a value to two waves that have a
This is the data:
df <- data.frame(wave = c(16, 20, 20, 36, 40, 40),
area = c(1,1,1,1,1,1),
party = c("A", "A", "A", "B", "B", "B"),
loyal = c(NA, 0, 1, NA, 0, 1),
ipref_1 = c(0.25, 0.25, 0.5, -0.25, 0.5, -0.25),
ipref_2 = c(0.5, 0.25, -0.25, -0.75, 0.5, -0.25),
ipref_3 = c(-0.5, -0.25, 0, 0.5, -0.25, 0))
Which looks like:
wave area party loyal ipref_1 ipref_2 ipref_3
16 1 A NA 0.25 0.50 -0.50
20 1 A 0 0.25 0.25 -0.25
20 1 A 1 0.50 -0.25 0.00
36 1 B NA -0.25 -0.75 0.50
40 1 B 0 0.50 0.50 -0.25
40 1 B 1 -0.25 -0.25 0.00
I want to lag ipref_1, ipref_2, and ipref_3 of waves 16 and 36 to waves 20 and 40, respectively. But, there is an additional variable loyal
with scores 0
and 1
in the later wave. I want to lag the value of wave 16 (and 36) to both rows of wave 20 (and 40).
I tried:
df <- df %>% group_by(party, area) %>%
mutate(l_ipref_1 = lag(ipref_1, 1, order_by = wave, na.rm =T),
l_ipref_2 = lag(ipref_2, 1, order_by = wave, na.rm =T),
l_ipref_3 = lag(ipref_3, 1, order_by = wave, na.rm =T))
But it gives me:
wave area party loyal ipref_1 ipref_2 ipref_3 l_ipref_1 l_ipref_2 l_ipref_3
<dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
16 1 A NA 0.25 0.5 -0.5 NA NA NA
20 1 A 0 0.25 0.25 -0.25 0.25 0.5 -0.5
20 1 A 1 0.5 -0.25 0 0.25 0.25 -0.25
36 1 B NA -0.25 -0.75 0.5 NA NA NA
40 1 B 0 0.5 0.5 -0.25 -0.25 -0.75 0.5
40 1 B 1 -0.25 -0.25 0 0.5 0.5 -0.25
Here, in wave 20 (and 40) the score of loyal = 0
is lagged on loyal = 1
, but what I need is:
wave area party loyal ipref_1 ipref_2 ipref_3 l_ipref_1 l_ipref_2 l_ipref_3
<dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
16 1 A NA 0.25 0.5 -0.5 NA NA NA
20 1 A 0 0.25 0.25 -0.25 0.25 0.5 -0.5
20 1 A 1 0.5 -0.25 0 0.25 0.5 -0.5
36 1 B NA -0.25 -0.75 0.5 NA NA NA
40 1 B 0 0.5 0.5 -0.25 -0.25 -0.75 0.5
40 1 B 1 -0.25 -0.25 0 -0.25 -0.75 0.5
Any suggestions? Many thanks
Upvotes: 0
Views: 86
Reputation: 27792
here is a data.table
approach
the core of the apporach is the ifelse
-statement; ifelse(is.na(loyal), NA, x[1])
. Which means: if is.na(loyal) == TRUE, then return value NA, if not, return the first value of the ipref-column (from the current group).
Solution words for any number of columns starting with ipref
library(data.table)
cols <- grep("^ipref", names(df), value = TRUE)
setDT(df)[, paste0("l_", cols) :=
lapply(.SD, function(x) ifelse(is.na(loyal), NA, x[1])),
by = .(area, party), .SDcols = cols][]
# wave area party loyal ipref_1 ipref_2 ipref_3 l_ipref_1 l_ipref_2 l_ipref_3
# 1: 16 1 A NA 0.25 0.50 -0.50 NA NA NA
# 2: 20 1 A 0 0.25 0.25 -0.25 0.25 0.50 -0.5
# 3: 20 1 A 1 0.50 -0.25 0.00 0.25 0.50 -0.5
# 4: 36 1 B NA -0.25 -0.75 0.50 NA NA NA
# 5: 40 1 B 0 0.50 0.50 -0.25 -0.25 -0.75 0.5
# 6: 40 1 B 1 -0.25 -0.25 0.00 -0.25 -0.75 0.5
Upvotes: 2