Adeline Morisot
Adeline Morisot

Reputation: 47

separate and count a column with multiple values

a, b, and c are sentences

Column
a,b,c
b,c
a,c
c

I want to separate and count each value, to obtain:

column a   column b   column c
yes         yes        yes
no          yes        yes
yes         no         yes
no          no         yes

Upvotes: 1

Views: 107

Answers (3)

ThomasIsCoding
ThomasIsCoding

Reputation: 102710

A base R option

u <- strsplit(df$Column, ",")
res <- t(sapply(u, function(x) table(factor(x, levels = unique(unlist(u))))))
res[] <- c("no", "yes")[res + 1]

gives

> res
     a     b     c
[1,] "yes" "yes" "yes"
[2,] "no"  "yes" "yes"
[3,] "yes" "no"  "yes"
[4,] "no"  "no"  "yes"

Upvotes: 2

crestor
crestor

Reputation: 1486

library(tidyverse)

df <- tibble::tribble(
  ~Column,
  "a,b,c",
  "b,c",
  "a,c",
  "c"
)

df %>%
  rownames_to_column() %>%
  separate_rows(Column, sep = ",") %>%
  pivot_wider(
    names_from = Column,
    values_from = Column,
    values_fn = function(x) if_else(!is.na(x), "yes", "no"),
    values_fill = "no"
  ) %>%
  select(-rowname)

#> # A tibble: 4 x 3
#>   a     b     c    
#>   <chr> <chr> <chr>
#> 1 yes   yes   yes  
#> 2 no    yes   yes  
#> 3 yes   no    yes  
#> 4 no    no    yes

Upvotes: 1

akrun
akrun

Reputation: 887901

We can split the column with strsplit, use mtabulate to get the frequency count and change the values of binary to 'yes', 'no

library(qdapTools)
out <- mtabulate(strsplit(df1$Column, ","))
out[] <-  c("no", "yes")[as.matrix(out) + 1]

-output

out
#    a   b   c
#1 yes yes yes
#2  no yes yes
#3 yes  no yes
#4  no  no yes

data

df1 <- structure(list(Column = c("a,b,c", "b,c", "a,c", "c")), 
class = "data.frame", row.names = c(NA, 
-4L))

Upvotes: 5

Related Questions