alittleboy

Reputation: 10956

Separate characters concatenated with + in R

I have a factor vector that looks like this:

[1] A
[2] B+C
[3] A+D+E
[4] F
...

I would like a vector of the individual items obtained by splitting on the "+" sign; for the example above, that is A B C D E F. The "+" signs should be removed, and so should duplicate items. How can I do that in R? Thanks!

Upvotes: 0

Views: 117

Answers (2)

agstudy

Reputation: 121568

Another option is scan, which reads the text into a character vector of individual items; unique then removes the duplicates:

unique(scan(text = 'A
B+C
A+D+E
F', what = 'character', sep = '+'))
# Read 7 items
# [1] "A" "B" "C" "D" "E" "F"
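
The question's data is a factor rather than a literal string, so here is a minimal sketch of the same scan approach applied to a factor. The names f and items are hypothetical stand-ins, and the as.character conversion is an assumption about how the asker's data would be passed in; quiet = TRUE only suppresses the "Read 7 items" message.

```r
# Hypothetical factor standing in for the asker's data
f <- factor(c("A", "B+C", "A+D+E", "F"))

# Convert the factor to character before handing it to scan();
# each element is read as one line, and "+" separates fields within a line
items <- unique(scan(text = as.character(f), what = "character",
                     sep = "+", quiet = TRUE))
items
# [1] "A" "B" "C" "D" "E" "F"
```

unique keeps the items in order of first appearance; wrap the result in sort() if alphabetical order is needed regardless of input order.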

Upvotes: 5

Arun

Reputation: 118799

Split the vector at + using strsplit. Because + is a regular-expression metacharacter, it has to be escaped as "\\+". The result is a list in which each element holds the pieces of the corresponding input string. unlist flattens that list back into a single vector, and unique then removes the duplicates.

x <- c("A", "B+C", "B+D", "A+D+E", "F", "F+C")
unique(unlist(strsplit(x, "\\+")))
# [1] "A" "B" "C" "D" "E" "F"
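
Since the question mentions a factor, and strsplit only accepts character input, a variant along these lines may be closer to the asker's case. This is a sketch under assumptions: f, parts, and items are hypothetical names, and fixed = TRUE is swapped in as an alternative to escaping, so that "+" is matched literally instead of as a regular expression.

```r
# Hypothetical factor standing in for the asker's data
f <- factor(c("A", "B+C", "B+D", "A+D+E", "F", "F+C"))

# strsplit() needs a character vector, so convert the factor first;
# fixed = TRUE treats "+" as a literal string, so no escaping is needed
parts <- strsplit(as.character(f), "+", fixed = TRUE)

# Flatten the list and drop duplicates, keeping first-appearance order
items <- unique(unlist(parts))
items
# [1] "A" "B" "C" "D" "E" "F"
```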

Upvotes: 5
