Reputation: 317
I am using an administrative dataset for a welfare program that provides a wage subsidy to workers. I am trying to create a binary Y variable that equals 1 for a person who no longer receives the subsidy and 0 for a person who is currently receiving it (that is, where end_date is NA). I plan to build it from two variables: start_date and end_date.
I have tried the following code, but I am getting an error message:
train_worker_subsidy5_categorical_y = train_worker_subsidy5 %>%
mutate(left_welfare = numeric(is.na(end_date)))
test_worker_subsidy5_categorical_y = test_worker_subsidy5 %>%
mutate(left_welfare = numeric(is.na(end_date)))
The error message is:
Error in numeric(is.na(end_date)) : invalid 'length' argument
Upvotes: 0
Views: 31
Reputation: 121
If I am understanding your question correctly, I would use this approach.
library(dplyr)

df <- data.frame(start_date = as.Date(c('2018-01-01', '2019-02-01',
                                        '2019-03-01', '2019-04-01')),
                 end_date = as.Date(c('2019-01-01', NA, '2019-08-01',
                                      '2020-01-01')))
today <- Sys.Date()
# 0 = still receiving (end_date is NA or in the future), 1 = subsidy has ended
df %>% mutate(receiving = if_else(is.na(end_date), 0,
                                  if_else(end_date > today, 0, 1)))
  start_date   end_date receiving
1 2018-01-01 2019-01-01         1
2 2019-02-01       <NA>         0
3 2019-03-01 2019-08-01         1
4 2019-04-01 2020-01-01         0
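As for the error itself: numeric() is a constructor that expects a single length (numeric(3) returns three zeros), so handing it a logical vector with more than one element fails with "invalid 'length' argument". The conversion function is as.numeric(). Below is a minimal sketch, assuming left_welfare should be 1 whenever an end_date is recorded and 0 when it is NA. The helper name left_welfare_flag and the toy data are made up for illustration and are not from your dataset.

```r
# Hypothetical helper: 1 if an end date is recorded, 0 if it is still NA.
left_welfare_flag <- function(end_date) {
  as.numeric(!is.na(end_date))
}

# Toy data for illustration only, not the asker's dataset.
toy <- data.frame(
  start_date = as.Date(c("2018-01-01", "2019-02-01", "2019-03-01")),
  end_date   = as.Date(c("2019-01-01", NA, "2019-08-01"))
)
toy$left_welfare <- left_welfare_flag(toy$end_date)
print(toy)

# Inside the original pipeline, the equivalent call would presumably be:
# train_worker_subsidy5 %>% mutate(left_welfare = as.numeric(!is.na(end_date)))
```

Note that this version treats any recorded end_date as "left", even one in the future. If future end dates can occur in your data, the if_else comparison against today shown above covers that case.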
It is hard to fully understand the question without a reproducible example. Hope this helps.
Upvotes: 1