R - create dynamic indicator columns from values in character columns

Question

I have data that looks like this:

library(dplyr)

d<-data.frame(ID=c(1,1,2,3,3,4), Quality=c("Good", "Bad", "Ugly", "Good", "Good", "Ugly"), Area=c("East", "North", "North", "South", "East", "North"))

What I'd like to do is create one new column for each unique value in Quality, populate it with 1 or 0 depending on whether the row matches that value, and then aggregate by ID. I want to do the same for Area.

This is what I have for when Quality == Good:

d$Quality.Good <- 0
d$Quality.Good[d$Quality=="Good"] <- 1

e <- d %>% 
      group_by(ID) %>%
      summarise(n=n(), MAX.Quality.Good = max(Quality.Good))
e

Output

# A tibble: 4 x 3
     ID     n MAX.Quality.Good
  <dbl> <int>            <dbl>
1     1     2                1
2     2     1                0
3     3     2                1
4     4     1                0

Is it possible to build a function that loops through each character column and builds an indicator column for Good, Bad, Ugly, North, East and South, instead of copying and pasting the above many more times?

Here's where I'm stuck:

library(stringr)

#vector of each Quality
e <-d %>% 
  group_by(Quality) %>%
  summarise(n=n()) %>%
  select(Quality)
e<-as.data.frame(e)

#create new column names
f <- str_c(names(e),".",e[,1]) 

#initialize list of new columns
d[f] <- 0

#I'm stuck after this...

Thank you!

akrun · Accepted Answer

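We can do this in base R using table: replicate the 'ID' column once for each remaining column (the number of columns minus 1, which is 2 here) and paste each column name onto the unlisted values (excluding the 'ID' column). Note that table returns counts rather than 0/1 flags, so ID 3 shows 2 under QualityGood.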

table(rep(d$ID, 2), paste0(names(d)[-1][col(d[-1])], unlist(d[-1])))
#       AreaEast AreaNorth AreaSouth QualityBad QualityGood QualityUgly
#  1        1         1         0          1           1           0
#  2        0         1         0          0           0           1
#  3        1         0         1          0           2           0
#  4        0         1         0          0           0           1
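If strict 0/1 indicators are wanted rather than counts, the table can be collapsed afterwards. The following is a minimal sketch under that assumption; the names vals, keys, counts and ind are illustrative and not part of the original answer, and ncol(vals) stands in for the hard-coded 2 so the same lines should work with more value columns.

```r
# Sample data from the question
d <- data.frame(
  ID = c(1, 1, 2, 3, 3, 4),
  Quality = c("Good", "Bad", "Ugly", "Good", "Good", "Ugly"),
  Area = c("East", "North", "North", "South", "East", "North"),
  stringsAsFactors = FALSE
)

vals <- d[-1]                                   # every column except ID
keys <- paste0(names(vals)[col(vals)], unlist(vals))

# One copy of ID per value column, so the lengths line up with keys
counts <- table(rep(d$ID, ncol(vals)), keys)

# Drop the table class, then turn any positive count into 1
ind <- 1 * (unclass(counts) > 0)
ind
```

Here counts still holds 2 for ID 3 under QualityGood, while ind holds 1 in the same cell.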

Or, with tidyverse: gather into 'long' format, unite the 'key' and 'val' columns into a single column, keep the distinct rows, create a column of 1s, and spread back into 'wide' format.

library(tidyverse)
gather(d, key, val, -ID) %>%
   unite(kv, key, val) %>% 
   distinct %>%
   mutate(n = 1) %>% 
   spread(kv, n, fill = 0)
#  ID Area_East Area_North Area_South Quality_Bad Quality_Good Quality_Ugly
#1  1         1          1          0           1            1            0
#2  2         0          1          0           0            0            1
#3  3         1          0          1           0            1            0
#4  4         0          1          0           0            0            1
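In more recent tidyr releases, gather and spread are superseded by pivot_longer and pivot_wider. The same pipeline could be sketched as follows, assuming a tidyr version that provides both functions (1.0.0 or later); the result name wide is illustrative. The new columns come out in order of first appearance rather than alphabetically, so their order may differ from the output above.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
d <- data.frame(
  ID = c(1, 1, 2, 3, 3, 4),
  Quality = c("Good", "Bad", "Ugly", "Good", "Good", "Ugly"),
  Area = c("East", "North", "North", "South", "East", "North"),
  stringsAsFactors = FALSE
)

wide <- d %>%
  pivot_longer(-ID, names_to = "key", values_to = "val") %>%  # long format
  unite("kv", key, val) %>%                                   # e.g. "Quality_Good"
  distinct() %>%                                              # one row per ID/value pair
  mutate(n = 1) %>%                                           # indicator value
  pivot_wider(names_from = kv, values_from = n, values_fill = 0)

wide
```

Because distinct() runs before the reshape, repeated values such as the two "Good" rows for ID 3 yield a single 1 rather than a count.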
