Reputation: 55
df <- data.frame(
  sample = c("A", "B", "C", "D"),
  class = c("A31", "A3", "A31,C", "A3,B"),
  value = c(5, 1, 6, 8)
)
In this df, I would like to replace every string that consists of A3 followed by something else (e.g. A3,B) with just A3, without affecting A31 and A31,C. How can I do this?
I’ve tried df$class <- gsub("A3.*", "A3", df$class), but this also changes A31 and A31,C.
Upvotes: 0
Views: 123
Reputation: 886948
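We could capture an upper-case letter ([A-Z]) followed by a digit as a group ((...)), where the group precedes one or more digits (\\d+) running to the end ($) of the string, and replace the match with the backreference (\\1) to the captured group. With this pattern A31 becomes A3, while strings containing a comma, such as A31,C and A3,B, do not match and are left unchanged.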
library(dplyr)
library(stringr)
df <- df %>%
  mutate(class = str_replace(class, "^([A-Z]\\d)\\d+$", "\\1"))
-output
df
  sample class value
1      A    A3     5
2      B    A3     1
3      C A31,C     6
4      D  A3,B     8
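For the replacement as worded in the question (A3,B becomes A3, while A31 and A31,C stay as they are), a minimal base R sketch could use a negative lookahead. This is not the approach from the answer above. It assumes that the code always sits at the start of the string and that everything after A3 should be dropped unless the next character is a digit.

```r
df <- data.frame(
  sample = c("A", "B", "C", "D"),
  class = c("A31", "A3", "A31,C", "A3,B"),
  value = c(5, 1, 6, 8)
)

# "^A3" anchors the match at the start of the string.
# "(?!\\d)" is a negative lookahead (requires perl = TRUE): it rejects
# values where A3 is followed by another digit, such as A31.
# ".*$" consumes the rest of the string so that it is dropped.
df$class <- sub("^A3(?!\\d).*$", "A3", df$class, perl = TRUE)

df$class
```

Under these assumptions A3,B is reduced to A3, and A31 and A31,C are not matched because a digit follows A3.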
Upvotes: 1