Reputation: 916
I am trying to create a set of dummy variables, each defined by two conditions that depend on the name of the variable, but I am not sure how to proceed.
I have the following dataset "dat":
ID Entry Exit y2000 y2001 y2002 y2003 ....
1 1999 2010 0 0 0 0
2 2000 2001 0 ......
3 2002 2003 0 ........
4 1999 2002
5 .....
At the moment all the y"i" variables are equal to 0. What I want is to assign the value 1 to variable y2000 if Entry is lower than or equal to 2000 and Exit is higher than or equal to 2000. Similarly, for variable y2001 I want to assign the value 1 if Entry is lower than or equal to 2001 and Exit is higher than or equal to 2001, and so on.
I can do it for a single variable as follows:
dat$y2000[dat$Exit >= 2000 & dat$Entry <= 2000] <- 1
but I'd like to do this in a loop for each variable of the type y"i". How can I do that?
Thank you in advance for your help.
Upvotes: 1
Views: 174
Reputation: 887501
We can do this with Map. Get the vector of 'y' column names with grep ('nm1'), extract the numeric part from the names ('nm2'), then use Map to replace the values in each 'y' column based on the logical condition built from the 'Entry' and 'Exit' columns, and assign the result back to the 'y' columns of the original dataset.
nm1 <- grep("^y\\d{4}$", names(dat), value = TRUE)
nm2 <- as.integer(sub("y", "", nm1))
dat[nm1] <- Map(function(x, y) replace(dat[[x]],
dat$Exit >= y & dat$Entry <= y, 1), nm1, nm2)
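The same year-by-year comparison can also be written without an explicit loop. The following is only a sketch, not part of the original answer: it assumes the columns are named Entry and Exit as in the sample data, builds a small three-row copy of the data for illustration, and introduces the name `yrs` for the years parsed from the column names. `outer()` compares every row against every year at once, giving one logical matrix per condition.

```r
# Small illustrative copy of the data (three rows taken from the question)
dat <- data.frame(ID = 1:3,
                  Entry = c(1999L, 2000L, 2002L),
                  Exit  = c(2010L, 2001L, 2003L),
                  y2000 = 0L, y2001 = 0L, y2002 = 0L, y2003 = 0L)

# Names of the year columns and the years they encode
nm1 <- grep("^y\\d{4}$", names(dat), value = TRUE)
yrs <- as.integer(sub("y", "", nm1))

# Rows are observations, columns are years; TRUE where Entry <= year <= Exit
active <- outer(dat$Entry, yrs, "<=") & outer(dat$Exit, yrs, ">=")

# Unary + turns the logical matrix into 0/1 integers, one column per year
dat[nm1] <- +active
```

With this sample, the row with Entry 2000 and Exit 2001 ends up with 1 in y2000 and y2001 and 0 in y2002 and y2003. Unlike the `replace()` version, this overwrites the 'y' columns entirely, so it does not depend on them starting at 0.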
Or using tidyverse
library(tidyverse)
dat %>%
gather(key, val, matches("^y")) %>%
mutate(colNum = readr::parse_number(key),
val = +(Exit >= colNum & Entry <= colNum)) %>%
select(-colNum) %>%
spread(key, val)
dat <- structure(list(ID = c(1L, 2L, 3L, 5L), Entry = c(1999L, 2000L,
2002L, 1999L), Exit = c(2010L, 2001L, 2003L, 2002L), y2000 = c(0L,
0L, 0L, 0L), y2001 = c(0L, 0L, 0L, 0L), y2002 = c(0L, 0L, 0L,
0L), y2003 = c(0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA,
-4L))
Upvotes: 1