Reputation: 35
I'm writing code that uses strsplit to separate letters from integers, as part of my exam practice (it's ungraded, and I haven't quite grasped the concept yet).
I tried:
unlist(strsplit(s, "(?<=[a-zA-Z])(?=[0-9])"))
but this doesn't work.
I also tried
unlist(strsplit(s, ""))
but this just gives me a vector of single characters rather than separating letters from integers.
For example, instead of "w17u2" becoming "w", "1", "7", "u", "2", I need it to be "w", "17", "u", "2".
The input won't follow any specific pattern, so the solution must separate letters from integers wherever they occur.
Upvotes: 1
Views: 79
Reputation: 588
You could also use strsplit twice, say:
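splitnums <- function(s) {
  v1 <- strsplit(s, '\\d+')[[1]]  # non-digit runs: "aa" "ss" "d" "f"
  v2 <- strsplit(s, '\\D+')[[1]]  # digit runs:     "" "2" "3" "22" "5"
  n <- max(length(v1), length(v2))
  length(v1) <- n; length(v2) <- n  # pad the shorter vector with NA
  # the vector with the leading "" goes first so the pieces interleave in order
  out <- if (v1[1] == "") c(rbind(v1, v2)) else c(rbind(v2, v1))
  out[!is.na(out) & out != ""]      # drop the padding and the empty piece
}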
splitnums('aa2ss3d22f5')
# [1] "aa" "2" "ss" "3" "d" "22" "f" "5"
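The interleaving step relies on how c(rbind(...)) flattens a matrix column by column. A minimal sketch of just that step, using made-up vectors (letters_part and digits_part are illustrative names, not part of the function above):

```r
# Two equal-length vectors to interleave (hypothetical example data)
letters_part <- c("aa", "ss")
digits_part  <- c("2", "3")

# rbind() stacks them as rows of a 2 x 2 matrix; c() then reads the
# matrix column by column, which alternates between the two vectors
interleaved <- c(rbind(letters_part, digits_part))
interleaved
# "aa" "2" "ss" "3"

# If the lengths differ, rbind() recycles the shorter vector,
# so the inputs need to be the same length for a clean interleave
recycled <- c(rbind(c("a", "b"), "1"))
recycled
# "a" "1" "b" "1"
```

This is why the two vectors passed to rbind have to line up one-to-one before flattening.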
Upvotes: 1
Reputation: 50668
An option is to use look-aheads/look-behinds
ss <- "w17u2"
unlist(strsplit(ss, "((?<=[a-z])(?![a-z])|(?<=\\d)(?!\\d))", perl = TRUE))
#[1] "w" "17" "u" "2"
Explanation: (?<=[a-z])(?![a-z]) splits the string at the position where the preceding character matches [a-z] and the following character does not match [a-z]. Similarly, (?<=\\d)(?!\\d) splits the string at the position where the preceding character is a digit and the following character is not a digit. The final regular expression is the alternation (|) of the two patterns. Note that perl = TRUE is required: look-arounds are only supported by the PCRE engine, not by R's default regex engine.
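To see what each half contributes, it may help to run the two look-around patterns separately and then combined. This is only a sketch: the variable names (letter_end, digit_end, both) are made up for illustration, and the outputs shown are for the lowercase example string from the question.

```r
ss <- "w17u2"

letter_end <- "(?<=[a-z])(?![a-z])"  # position right after a run of lowercase letters
digit_end  <- "(?<=\\d)(?!\\d)"      # position right after a run of digits

# Only the letter-run boundaries: digits stay glued to the letters that follow
by_letters <- strsplit(ss, letter_end, perl = TRUE)[[1]]
by_letters
# "w" "17u" "2"

# Only the digit-run boundaries: letters stay glued to the digits that follow
by_digits <- strsplit(ss, digit_end, perl = TRUE)[[1]]
by_digits
# "w17" "u2"

# Alternation of both patterns splits at every boundary
both <- paste0(letter_end, "|", digit_end)
by_both <- strsplit(ss, both, perl = TRUE)[[1]]
by_both
# "w" "17" "u" "2"
```

Since [a-z] only matches lowercase letters, an input with uppercase letters would presumably need [A-Za-z] (as in the question's attempt) in both places of the first pattern.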
Upvotes: 2