Reputation: 599
test.dat <- c("abcde", "abcXe", "abcdY", "abcXY", "abYcXY", "abcYX")
test.want <- c("abcde", "abc1Xe", "abcd1Y", "abc1XY", "abYc1XY", "abcY1X")
Suppose I wish to add "1" before "X" or "Y", and only before "X" if both "X" and "Y" exist.
library(tidyverse)
case_when(
  str_detect(test.dat, "X") ~ str_replace(test.dat, "X", "1X"),
  str_detect(test.dat, "Y") ~ str_replace(test.dat, "Y", "1Y"),
  TRUE ~ as.character(test.dat)
)
This works, but is there a more concise way to do it, perhaps with a single str_replace?
What about a second scenario, where the "1" goes before either "X" or "Y", whichever comes first?
test.dat <- c("abcde", "abcXe", "abcdY", "abcXY", "abYcXY", "abcYX")
test.want <- c("abcde", "abc1Xe", "abcd1Y", "abc1XY", "ab1YcXY", "abc1YX")
stringr is preferable but I welcome any other methods. Thank you.
Upvotes: 3
Views: 108
Reputation: 39707
You can use a lookahead, (?=X) for X and (?=Y) for Y, and decide whether there is an X with ifelse and grepl.
test.dat <- c("abcde", "abcXe", "abcdY", "abcXY", "abYcXY", "abcYX", "YXXdY")
ifelse(grepl("X", test.dat)
, sub("(?=X)", "1", test.dat, perl=TRUE)
, sub("(?=Y)", "1", test.dat, perl=TRUE))
#[1] "abcde" "abc1Xe" "abcd1Y" "abc1XY" "abYc1XY" "abcY1X" "Y1XXdY"
or
sub("(?=X)|(?=Y(?!.*X))", "1", test.dat, perl=TRUE)
#[1] "abcde" "abc1Xe" "abcd1Y" "abc1XY" "abYc1XY" "abcY1X" "Y1XXdY"
Here (?=X) matches the position before an X, and (?=Y(?!.*X)) matches the position before a Y that has no X anywhere after it.
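To see why replacing a lookahead match inserts text rather than overwriting anything, it can help to inspect the match itself. The following is a minimal base R sketch (the variable names x, m, pos and len are only illustrative) showing that the pattern matches a position with a length of zero:

```r
# Illustrative inputs taken from the question's test data
x <- c("abcXe", "abYcXY", "abcdY")

# regexpr() reports where the first match starts and how long it is
m <- regexpr("(?=X)|(?=Y(?!.*X))", x, perl = TRUE)

pos <- as.integer(m)            # start position of the first match
len <- attr(m, "match.length")  # number of characters consumed by the match

print(pos)  # 4 5 5
print(len)  # 0 0 0
```

Because the match consumes no characters, sub() has nothing to remove and the replacement "1" is simply placed at that position.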
If every hit should be replaced, not only the first:
ifelse(grepl("X", test.dat)
, gsub("(?=X)", "1", test.dat, perl=TRUE)
, gsub("(?=Y)", "1", test.dat, perl=TRUE))
#[1] "abcde" "abc1Xe" "abcd1Y" "abc1XY" "abYc1XY" "abcY1X" "Y1X1XdY"
or
gsub("(?=X)|(^[^X]*)(?=Y(?!.*X))", "\\11", test.dat, perl=TRUE)
#[1] "abcde" "abc1Xe" "abcd1Y" "abc1XY" "abYc1XY" "abcY1X" "Y1X1XdY"
And to match X or Y, whichever comes first:
sub("(?=X)|(?=Y)", "1", test.dat, perl=TRUE)
#sub("(?=X|Y)", "1", test.dat, perl=TRUE) #Alternative
#sub("(?=[XY])", "1", test.dat, perl=TRUE) #Alternative
#[1] "abcde" "abc1Xe" "abcd1Y" "abc1XY" "ab1YcXY" "abc1YX" "1YXXdY"
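As a quick sanity check, the two single-call patterns can be compared against the expected vectors from the question. This is only a sketch: want.x.first and want.first.hit are names made up here for the asker's two test.want vectors.

```r
test.dat <- c("abcde", "abcXe", "abcdY", "abcXY", "abYcXY", "abcYX")

# Expected results copied from the question (names are hypothetical)
want.x.first   <- c("abcde", "abc1Xe", "abcd1Y", "abc1XY", "abYc1XY", "abcY1X")
want.first.hit <- c("abcde", "abc1Xe", "abcd1Y", "abc1XY", "ab1YcXY", "abc1YX")

# Scenario 1: prefer X, fall back to Y only when no X follows
got.x.first <- sub("(?=X)|(?=Y(?!.*X))", "1", test.dat, perl = TRUE)

# Scenario 2: whichever of X or Y comes first
got.first.hit <- sub("(?=[XY])", "1", test.dat, perl = TRUE)

# Both comparisons should hold; stopifnot() raises an error otherwise
stopifnot(identical(got.x.first, want.x.first))
stopifnot(identical(got.first.hit, want.first.hit))
```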
Upvotes: 5
Reputation: 627100
You can use
test.dat <- c("abcde", "abcXe", "abcdY", "abcXY", "abYcXY", "abcYX")
sub("^([^XY]*)(Y)([^X]*)$|(.*)(X)", "\\1\\41\\3\\5\\2", test.dat)
# => [1] "abcde" "abc1Xe" "abcd1Y" "abc1XY" "abYc1XY" "abcY1X"
stringr::str_replace(test.dat, "^([^XY]*)(Y)([^X]*)$|(.*)(X)", "\\1\\41\\3\\5\\2")
# => [1] "abcde" "abc1Xe" "abcd1Y" "abc1XY" "abYc1XY" "abcY1X"
Here,
- ^([^XY]*)(Y)([^X]*)$ - start of string (^), Group 1: any zero or more chars other than X and Y (([^XY]*)), Group 2: Y ((Y)), Group 3: any zero or more chars other than X (([^X]*)), end of string ($)
- | - or
- (.*) - Group 4: any zero or more chars, as many as possible
- (X) - Group 5: an X char.
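The group breakdown above can be made visible by swapping in a replacement that prints every group between separators. This is a small base R sketch; the names pat and shown and the bracketed template are only for illustration. Groups from the alternative that did not match come back as empty strings, which is what lets the single replacement "\\1\\41\\3\\5\\2" serve both branches.

```r
pat <- "^([^XY]*)(Y)([^X]*)$|(.*)(X)"

# Print all five groups, separated by "|", in place of the match
shown <- sub(pat, "[\\1|\\2|\\3|\\4|\\5]", c("abcdY", "abYcXY"))
print(shown)
# "abcdY"  uses the first alternative:  "[abcd|Y|||]"
# "abYcXY" uses the second alternative: "[|||abYc|X]Y"

# Because (.*) is greedy, a string with several X gets the 1 before the LAST X
last.x <- sub(pat, "\\1\\41\\3\\5\\2", "YXXdY")
print(last.x)  # "YX1XdY"
```

Note that this differs from the lookahead approach in the other answer, which places the 1 before the first X; for the question's test data, where each string has at most one X, both give the same result.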
If you need to add 1 to the end of strings not having X or Y:
test.dat <- c("abcde", "abcXe", "abcdY", "abcXY", "abYcXY", "abcYX")
sub("^([^XY]*)$", "\\11", sub("^([^XY]*)(Y)([^X]*)$|(.*)(X)", "\\1\\41\\3\\5\\2", test.dat))
library(stringr)
str_replace(str_replace(test.dat, "^([^XY]*)(Y)([^X]*)$|(.*)(X)", "\\1\\41\\3\\5\\2"), "^([^XY]*)$", "\\11")
Output:
[1] "abcde1" "abc1Xe" "abcd1Y" "abc1XY" "abYc1XY" "abcY1X"
Upvotes: 2