B. Koch

Reputation: 1

Create new variables in a dataframe from existing variables using lists of names

I was wondering if there is a way to construct a function to do the following:

Original dataframe, df:

Obs Col1    Col2
1   Y       NA
2   NA      Y
3   Y       Y

Modified dataframe, df:

Obs Col1 Col2 Col1_YN Col2_YN
1   Y    NA   "Yes"   "No"
2   NA   Y    "No"    "Yes"
3   Y    Y    "Yes"   "Yes"

The following code works fine for creating the new variables, but I have many original columns with this structure. (The "Yes"/"No" format works better when constructing tables.)

df$Col1_YN <- as.factor(ifelse(is.na(df$Col1), "No", "Yes"))
df$Col2_YN <- as.factor(ifelse(is.na(df$Col2), "No", "Yes"))

I was thinking along the lines of defining lists of input and output columns to be passed to a function, or using lapply, but I haven't figured out how to do this.

Upvotes: 0

Views: 38

Answers (1)

akrun

Reputation: 887901

We can use across to loop over the columns and create the new columns by setting the .names argument.

library(dplyr)
df1 <- df %>% 
   mutate(across(-Obs,  
   ~ case_when(. %in% "Y" ~ "Yes", TRUE ~ "No"), .names = "{.col}_YN"))

-output

df1
  Obs Col1 Col2 Col1_YN Col2_YN
1   1    Y <NA>     Yes      No
2   2 <NA>    Y      No     Yes
3   3    Y    Y     Yes     Yes

If we want to use lapply instead, loop over the columns of interest, apply ifelse to each, and assign the results back to new columns whose names are built with paste0.

df[paste0(names(df)[-1], "_YN")] <- 
   lapply(df[-1], \(x) ifelse(is.na(x), "No", "Yes"))
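
If the columns are supplied as a character vector of names, the same lapply idea can be wrapped in a small function. This is only a minimal sketch: the function name add_yn and its arguments cols and suffix are hypothetical, not something from the question, and it returns factors because the original code used as.factor.

```r
# Hypothetical helper (illustrative names, base R only):
#   data   - a data.frame
#   cols   - character vector of existing column names to recode
#   suffix - text appended to each name to form the new column name
add_yn <- function(data, cols, suffix = "_YN") {
  new_cols <- paste0(cols, suffix)
  # NA becomes "No", anything else becomes "Yes"; stored as a factor
  data[new_cols] <- lapply(data[cols], function(x) {
    factor(ifelse(is.na(x), "No", "Yes"), levels = c("No", "Yes"))
  })
  data
}

# Same example data as in the question
df <- data.frame(Obs = 1:3,
                 Col1 = c("Y", NA, "Y"),
                 Col2 = c(NA, "Y", "Y"))

df2 <- add_yn(df, c("Col1", "Col2"))
```

Fixing the levels to c("No", "Yes") keeps both levels present even when a column has no NA values, which may be convenient when building tables.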

data

df <- structure(list(Obs = 1:3, Col1 = c("Y", NA, "Y"), Col2 = c(NA, 
"Y", "Y")), class = "data.frame", row.names = c(NA, -3L))

Upvotes: 4
