Reputation: 1
I was wondering if there is a way to construct a function to do the following:
Original dataframe, df:
Obs Col1 Col2
1 Y NA
2 NA Y
3 Y Y
Modified dataframe, df:
Obs Col1 Col2 Col1_YN Col2_YN
1 Y NA "Yes" "No"
2 NA Y "No" "Yes"
3 Y Y "Yes" "Yes"
The following code works fine for creating the new variables, but I have many original columns with this structure, so repeating it for each one is impractical. The “Yes”/“No” format works better when constructing tables.
df$Col1_YN <- as.factor(ifelse(is.na(df$Col1), "No", "Yes"))
df$Col2_YN <- as.factor(ifelse(is.na(df$Col2), "No", "Yes"))
I was thinking along the lines of defining lists of input and output columns to be passed to a function, or using lapply, but I haven’t figured out how to do this.
Upvotes: 0
Views: 38
Reputation: 887901
We can use across() to loop over the columns and create the new columns, naming them through the .names argument:
library(dplyr)
df1 <- df %>%
  mutate(across(-Obs,
                ~ case_when(. %in% "Y" ~ "Yes", TRUE ~ "No"),
                .names = "{.col}_YN"))
-output
df1
Obs Col1 Col2 Col1_YN Col2_YN
1 1 Y <NA> Yes No
2 2 <NA> Y No Yes
3 3 Y Y Yes Yes
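If only some columns have this structure, the same idea should work with an explicit vector of column names. The following is a sketch rather than a definitive version: the vector name cols is an assumption for illustration, and the result is wrapped in factor() because the question's original code produced factors.

```r
library(dplyr)

# Example data with the same structure as in the question
df <- data.frame(Obs = 1:3,
                 Col1 = c("Y", NA, "Y"),
                 Col2 = c(NA, "Y", "Y"))

# Hypothetical vector naming the input columns to convert
cols <- c("Col1", "Col2")

df1 <- df %>%
  mutate(across(all_of(cols),
                # NA becomes "No", anything else becomes "Yes"
                ~ factor(if_else(is.na(.x), "No", "Yes"),
                         levels = c("No", "Yes")),
                .names = "{.col}_YN"))
```

Fixing the levels to c("No", "Yes") keeps both categories in a table even when a column happens to contain only one of them.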
If we want to use lapply, loop over the columns of interest, apply ifelse, and assign the result back to new columns by creating a vector of new names with paste0:
df[paste0(names(df)[-1], "_YN")] <-
  lapply(df[-1], \(x) ifelse(is.na(x), "No", "Yes"))
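Since the question asks for a function that takes the input columns, the lapply approach can be wrapped roughly as follows. This is only a sketch: add_yn_cols, cols and suffix are hypothetical names chosen for illustration, and the output is converted to a factor to match the code in the question.

```r
# Example data with the same structure as in the question
df <- data.frame(Obs = 1:3,
                 Col1 = c("Y", NA, "Y"),
                 Col2 = c(NA, "Y", "Y"))

# add_yn_cols() is a hypothetical helper, not part of any package.
#   data:   a data frame
#   cols:   character vector of input column names
#   suffix: string appended to each input name to form the output name
add_yn_cols <- function(data, cols, suffix = "_YN") {
  data[paste0(cols, suffix)] <- lapply(
    data[cols],
    # NA becomes "No", anything else becomes "Yes"
    function(x) factor(ifelse(is.na(x), "No", "Yes"), levels = c("No", "Yes"))
  )
  data
}

df2 <- add_yn_cols(df, c("Col1", "Col2"))
```

The function returns a modified copy, so the original df is left unchanged unless the result is assigned back to it.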
df <- structure(list(Obs = 1:3,
                     Col1 = c("Y", NA, "Y"),
                     Col2 = c(NA, "Y", "Y")),
                class = "data.frame", row.names = c(NA, -3L))
Upvotes: 4