Reputation: 1144
There are many posts about creating dummy variables, but in my case I have a set of columns similar to dummy variables which need recoding back into one column.
Given a set of categorical/string variables (counties in the USA):
a<-c(NA,NA,"Cameron","Luzerne");b<-c(NA,"Luzerne",NA,NA);c<-c("Chester",NA,NA,NA)
df<-as.data.frame(cbind(a,b,c))
How can I create a function that collapses them into a single column? The function should work for any contiguous set of string columns.
Result should look like this:
newcol a b c
Chester <NA> <NA> Chester
Luzerne <NA> Luzerne <NA>
Cameron Cameron <NA> <NA>
Luzerne Luzerne <NA> <NA>
I wrote this function, which takes three arguments:
cn<-function(df,s,f){
for(i in seq_along(df[ ,c(s:f)]) ) # for specified columns in a dataframe...
ifelse(is.na(df[,i]),NA,df[ ,i] ) # return value if not NA
}
But it doesn't work, and neither did a variety of similar attempts.
The idea is to take a data frame with some number of string columns and move their values, if not blank, to the new column.
Upvotes: 2
Views: 533
Reputation: 886938
We can use coalesce from dplyr, which returns the first non-NA value across its arguments:
library(dplyr)
df %>%
mutate_all(as.character) %>%
mutate(newcolumn = coalesce(!!! .)) %>%
select(newcolumn, everything())
# newcolumn a b c
#1 Chester <NA> <NA> Chester
#2 Luzerne <NA> Luzerne <NA>
#3 Cameron Cameron <NA> <NA>
#4 Luzerne Luzerne <NA> <NA>
In base R, an option is pmax
do.call(pmax, c(lapply(df, as.character), na.rm = TRUE))
#[1] "Chester" "Luzerne" "Cameron" "Luzerne"
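Note that pmax gives the right answer here only because each row holds at most one non-NA value; with several values in a row it would return the alphabetically largest one rather than the first. If a function taking a start and end column is wanted, as in the question, something like the following base R sketch might do. The name coalesce_cols and its arguments s and f (first and last column positions) are made up for illustration, and it assumes the columns can safely be converted to character.

```r
# Example data from the question; stringsAsFactors = FALSE keeps the columns as character
df <- data.frame(
  a = c(NA, NA, "Cameron", "Luzerne"),
  b = c(NA, "Luzerne", NA, NA),
  c = c("Chester", NA, NA, NA),
  stringsAsFactors = FALSE
)

# coalesce_cols() is a hypothetical helper name, not a base R function.
# s and f are the positions of the first and last columns to combine.
coalesce_cols <- function(df, s, f) {
  # Convert each selected column to character (also handles factor columns)
  cols <- lapply(df[s:f], as.character)
  # Walk the columns left to right: keep a value once found,
  # otherwise fill the NA from the next column
  Reduce(function(x, y) ifelse(is.na(x), y, x), cols)
}

df$newcol <- coalesce_cols(df, 1, 3)
df <- df[c("newcol", "a", "b", "c")]  # move the new column to the front
print(df)
```

Because Reduce works through the columns from left to right, a row with more than one non-NA value keeps the leftmost one, which matches what coalesce does.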
Upvotes: 2