Reputation: 376
I'm trying to create some simple, easy-to-write content clusters using multiple regexes.
Imagine a vector of strings: c("a","b","ac"). The groups I need to define are "all a's" and "all b's", so the values "a" and "ac" belong to "A" and "b" belongs to "B".
myDF$contentGroup <- sub(".*a.*", "A", myDF$stringList)
However, this results in a "contentGroup" column in my dataframe that still contains the original value of "stringList" wherever no match occurred. So if I run the same line of code for "B", it overwrites the "A"s.
myDF$contentGroup <- sub(".*b.*", "B", myDF$stringList)
I just can't figure out how to do this kind of simple clustering in a single line of code. I'd like to keep it as simple as possible.
Upvotes: 0
Views: 123
Reputation: 51592
You can use grep to match 'a' and 'b', and replace the matching elements as follows:
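x <- c("a", "b", "ac")  # example vector from the question
# grep() returns the indices of matching elements; fixed = TRUE matches literally
x[grep('a', x, fixed = TRUE)] <- 'A'
x[grep('b', x, fixed = TRUE)] <- 'B'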
x
#[1] "A" "B" "A"
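If you would rather do this in a single assignment on the data frame, one possible sketch is a nested ifelse() around grepl(). This is a swapped-in alternative to the grep indexing above, not something taken from the question. The data frame construction is an assumption built from the question's names (myDF, stringList, contentGroup), and returning NA for unmatched rows is an illustrative choice.

```r
# Assumed example data, mirroring the names used in the question.
myDF <- data.frame(stringList = c("a", "b", "ac"), stringsAsFactors = FALSE)

# The first pattern that matches wins; rows matching neither pattern get NA
# instead of keeping their original string.
myDF$contentGroup <- ifelse(grepl("a", myDF$stringList, fixed = TRUE), "A",
                     ifelse(grepl("b", myDF$stringList, fixed = TRUE), "B",
                            NA_character_))

myDF$contentGroup
#[1] "A" "B" "A"
```

Because the result goes into a new column and stringList is left untouched, the second pattern cannot overwrite the first group. Note that a string containing both 'a' and 'b' would land in "A", since the conditions are checked in order.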
Upvotes: 1