Reputation: 139
I would like to create two different datasets based on this dataset with text data.
id <- c(24, 24, 56, 56, 56, 56, 92, 92, 92)
visit_id <- c(1, 2, 1, 2, 3, 4, 1, 2, 3)
location <- c('Hospital', 'Hospital', 'Clinic', 'Hospital', 'Hospital',
'Hospital', 'Clinic', 'Hospital', 'Clinic')
data <- data.frame(id, visit_id, location)
For the first data set, my aim is to create a dummy variable that identifies patients whose first visit was at a clinic and assigns '1' to those who meet this criterion. It would look something like this:
id <- c(24, 56, 92)
exclude <- c(0, 1, 1)
data1 <- data.frame(id, exclude)
For the second dataset, I would like to identify those who have records of visiting both the hospital and the clinic and assign them '0'; patients who were only ever seen at one type of location get '1'.
id <- c(24, 56, 92)
exclude <- c(1, 0, 0)
data2 <- data.frame(id, exclude)
I am not familiar with loops, but I have some experience using conditional operators on numerical data.
Upvotes: 1
Views: 148
Reputation: 28955
You can use the dplyr package:
library(dplyr)
# First data set: keep each patient's first visit and flag it if it was at a clinic
data %>%
  filter(visit_id == 1) %>%
  mutate(exclude = if_else(location == "Clinic", 1, 0)) %>%
  select(id, exclude)
# id exclude
# 1 24 0
# 2 56 1
# 3 92 1
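The filter above relies on every patient's first visit being numbered 1. If that is not guaranteed, one variant is to take the location at the smallest visit_id per patient. This is only a minimal sketch, assuming the lowest visit_id marks the first visit; the example data is rebuilt so the snippet runs on its own.

```r
library(dplyr)

# Rebuild the example data from the question so the snippet is self-contained
data <- data.frame(
  id = c(24, 24, 56, 56, 56, 56, 92, 92, 92),
  visit_id = c(1, 2, 1, 2, 3, 4, 1, 2, 3),
  location = c("Hospital", "Hospital", "Clinic", "Hospital", "Hospital",
               "Hospital", "Clinic", "Hospital", "Clinic")
)

# Assumption: the smallest visit_id within each id is that patient's first visit
data1 <- data %>%
  group_by(id) %>%
  summarise(exclude = as.integer(location[which.min(visit_id)] == "Clinic"))
```

Using summarise() returns one row per id directly, so no separate select() or row filter is needed.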
# Second data set: 1 if a patient was seen at only one type of location, else 0
data %>%
  group_by(id) %>%
  mutate(exclude = ifelse(length(unique(location)) == 1, 1, 0)) %>%
  select(id, exclude) %>%
  filter(row_number() == 1)
# # A tibble: 3 x 2
# id exclude
# <dbl> <dbl>
# 1 24 1
# 2 56 0
# 3 92 0
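If you would rather avoid an extra package, both data sets can probably be built in base R without writing an explicit loop. The following is a rough sketch rather than the only way to do it; it assumes the smallest visit_id per patient is the first visit, and the helper names `ordered`, `first_visit`, and `n_loc` are made up for illustration.

```r
# Rebuild the example data from the question so the snippet is self-contained
data <- data.frame(
  id = c(24, 24, 56, 56, 56, 56, 92, 92, 92),
  visit_id = c(1, 2, 1, 2, 3, 4, 1, 2, 3),
  location = c("Hospital", "Hospital", "Clinic", "Hospital", "Hospital",
               "Hospital", "Clinic", "Hospital", "Clinic")
)

# First data set: sort by patient and visit, then keep each patient's first row
# (assumption: the smallest visit_id is the first visit)
ordered <- data[order(data$id, data$visit_id), ]
first_visit <- ordered[!duplicated(ordered$id), ]
data1 <- data.frame(
  id = first_visit$id,
  exclude = as.integer(first_visit$location == "Clinic")
)

# Second data set: count distinct locations per patient;
# exclude is 1 when only one type of location was ever visited
n_loc <- tapply(data$location, data$id, function(x) length(unique(x)))
data2 <- data.frame(
  id = as.numeric(names(n_loc)),
  exclude = as.integer(n_loc == 1)
)
```

Here `!duplicated()` keeps the first row per id after sorting, and `tapply()` applies the distinct-count function to each patient's locations.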
Upvotes: 1