Combining Multiple Columns with Tidyr's Unite by Referencing Similar Column Names

Question

library(tidyr)
library(dplyr)
library(tidyverse)

Below is the code for a simple dataframe. I have some messy data that was exported with column factor categories spread out in different columns.

Client<-c("Client1","Client2","Client3","Client4","Client5")
Sex_M<-c("Male","NA","Male","NA","Male")
Sex_F<-c(" ","Female"," ","Female"," ")
Satisfaction_Satisfied<-c("Satisfied"," "," ","Satisfied","Satisfied")
Satisfaction_VerySatisfied<-c(" ","VerySatisfied","VerySatisfied"," "," ")
CommunicationType_Email<-c("Email"," "," ","Email","Email")
CommunicationType_Phone<-c(" ","Phone ","Phone "," "," ")
DF<-tibble(Client,Sex_M,Sex_F,Satisfaction_Satisfied,Satisfaction_VerySatisfied,CommunicationType_Email,CommunicationType_Phone)

I want to recombine the categories into single columns using tidyr's "unite".

DF<-DF%>%unite(Sat,Satisfaction_Satisfied,Satisfaction_VerySatisfied,sep=" ")%>%
unite(Sex,Sex_M,Sex_F,sep=" ")

However, I have to write multiple "unite" lines and I feel this violates the three times rule, so there must be a way to make this easier, especially since my real data contains dozens of columns that need to be combined. Is there a way to use "unite" once but somehow refer to matching column names so that all column names that are similar (For example, containing "Sex" for "Sex_M" and "Sex_F", and "CommunicationType" for "CommunicationType_Email" and "CommunicationType_Phone") are combined with the above formula?

I was also thinking about a function that allows me to enter column names, but this is too difficult for me since it involves complex standard evaluation.

akrun · Accepted Answer

We can use unite

library(tidyverse)
DF %>% 
    unite(Sat, matches("^Sat"))

For multiple cases, perhaps

gather(DF, Var, Val, -Client, na.rm = TRUE) %>%
        separate(Var, into = c("Var1", "Var2")) %>%
        group_by(Client, Var1) %>% 
        summarise(Val = paste(Val[!(is.na(Val)|Val=="")], collapse="_")) %>%
        spread(Var1, Val)
#  Client CommunicationType  Satisfaction    Sex
#*                           
#1 Client1             Email     Satisfied   Male
#2 Client2             Phone VerySatisfied Female
#3 Client3             Phone VerySatisfied   Male
#4 Client4             Email     Satisfied Female
#5 Client5             Email     Satisfied   Male

Combining Multiple Columns with Tidyr's Unite by Referencing Similar Column Names

Answers (2)

Related Questions

Combining Multiple Columns with Tidyr&#39;s Unite by Referencing Similar Column Names

Answers (2)

Related Questions

Combining Multiple Columns with Tidyr's Unite by Referencing Similar Column Names