Hackel

Reputation: 133

Get unique values from a csv cell containing comma separated text

I have a CSV file in which some cells contain multiple comma-separated values:

test <- c("a, b", "c", "d", "e, f", "g")
data.frame(test)

  test
1 a, b
2    c
3    d
4 e, f
5    g

When I used unique() to get all the unique values from the data frame, it returned each cell as a single string:

[1] "a, b" "c"    "d"    "e, f" "g"  

However, I want the result to look like this:

[1] "a" "b" "c" "d" "e" "f" "g"  

Upvotes: 1

Views: 65

Answers (1)

akrun

Reputation: 886948

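We can split 'test' on each comma (plus any whitespace that follows it) with strsplit, flatten the resulting list with unlist, and then take the unique values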

unique(unlist(strsplit(test, ",\\s*")))
#[1] "a" "b" "c" "d" "e" "f" "g"
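
The same idea carries over when the values sit in a data frame column, for instance after reading the file with read.csv. A minimal sketch, assuming the column is called test as in the question; df and vals are illustrative names, and trimws() is used here only as an alternative way to drop the space after each comma:

```r
# Hypothetical data frame standing in for the contents of the CSV file
df <- data.frame(test = c("a, b", "c", "d", "e, f", "g"),
                 stringsAsFactors = FALSE)

# Split each cell on literal commas, flatten the list into one vector,
# trim surrounding whitespace, then drop duplicates
vals <- unique(trimws(unlist(strsplit(df$test, ",", fixed = TRUE))))
vals
```

Splitting on a fixed comma and trimming afterwards also copes with cells that have stray spaces before a comma, which the ",\\s*" pattern alone would leave in place.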

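In the tidyverse, we can split the column into one row per value and then keep the distinct rows

library(tibble)
library(dplyr)
library(tidyr)
tibble(col1 = test) %>%
    # one row per comma-separated value
    separate_rows(col1, sep = ",\\s*") %>%
    # drop duplicate rows
    distinct()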
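
This pipeline returns a tibble. If a plain character vector is wanted, as in the question, one option is to finish with dplyr's pull(). A sketch, assuming the tibble, dplyr and tidyr packages are installed; vals is an illustrative name:

```r
library(tibble)
library(dplyr)
library(tidyr)

test <- c("a, b", "c", "d", "e, f", "g")

vals <- tibble(col1 = test) %>%
    separate_rows(col1, sep = ",\\s*") %>%  # one row per value
    distinct() %>%                          # drop duplicate rows
    pull(col1)                              # extract the column as a vector
vals
```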

Upvotes: 2
