Reputation: 47
a, b, and c are sentences
Column
a,b,c
b,c
a,c
c
I want to separate and count each value, to obtain:
column a column b column c
yes yes yes
no yes yes
yes no yes
no no yes
Upvotes: 1
Views: 107
Reputation: 102710
A base R option
u <- strsplit(df$Column, ",")
res <- t(sapply(u, function(x) table(factor(x, levels = unique(unlist(u))))))
res[] <- c("no", "yes")[res + 1]
gives
> res
a b c
[1,] "yes" "yes" "yes"
[2,] "no" "yes" "yes"
[3,] "yes" "no" "yes"
[4,] "no" "no" "yes"
Upvotes: 2
Reputation: 1486
library(tidyverse)
df <- tibble::tribble(
~Column,
"a,b,c",
"b,c",
"a,c",
"c"
)
df %>%
rownames_to_column() %>%
separate_rows(Column, sep = ",") %>%
pivot_wider(
names_from = Column,
values_from = Column,
values_fn = function(x) if_else(!is.na(x), "yes", "no"),
values_fill = "no"
) %>%
select(-rowname)
#> # A tibble: 4 x 3
#> a b c
#> <chr> <chr> <chr>
#> 1 yes yes yes
#> 2 no yes yes
#> 3 yes no yes
#> 4 no no yes
Upvotes: 1
Reputation: 887901
We can split the column with strsplit
, use mtabulate
to get the frequency count and change the values of binary to 'yes', 'no
library(qdapTools)
out <- mtabulate(strsplit(df1$Column, ","))
out[] <- c("no", "yes")[as.matrix(out) + 1]
-output
out
# a b c
#1 yes yes yes
#2 no yes yes
#3 yes no yes
#4 no no yes
df1 <- structure(list(Column = c("a,b,c", "b,c", "a,c", "c")),
class = "data.frame", row.names = c(NA,
-4L))
Upvotes: 5