daddymaterial

Reputation: 35

strsplit: split strings from integers

I'm writing code that uses strsplit to separate letters from integers, as part of an exam practice / study session (it's ungraded, and I haven't quite grasped the concept yet).

I tried:

unlist(strsplit(s, "(?<=[a-zA-Z])(?=[0-9])"))

but this doesn't work.

I also tried:

unlist(strsplit(s, ""))

but this just gives me a vector of single characters; it doesn't separate the letters from the integers.

For example, instead of "w17u2" becoming "w", "1", "7", "u", "2", I need it to become "w", "17", "u", "2".

The input won't follow any specific pattern, so the code must separate letters from integers wherever they occur in the string.

Upvotes: 1

Views: 79

Answers (2)

Yosi Hammer

Reputation: 588

You could also use strsplit twice, say:

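splitnums <- function(s) {
  # Assumes s contains as many non-digit runs as digit runs (e.g. 'aa2ss3d22f5');
  # otherwise rbind() recycles the shorter vector and repeats elements.
  v1 <- strsplit(s, '\\d+')[[1]] # non-digit runs: "aa" "ss" "d"  "f"
  v2 <- strsplit(s, '\\D+')[[1]] # digit runs:     ""   "2"  "3"  "22" "5"
  # Drop the leading "" from whichever vector has it, then interleave.
  if (v1[1] == "") return(c(rbind(v2, v1[2:length(v1)])))
  else return(c(rbind(v1, v2[2:length(v2)])))
}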

splitnums('aa2ss3d22f5')
# [1] "aa" "2"  "ss" "3"  "d"  "22" "f"  "5" 
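
The two-pass split interleaves correctly when the string has equally many letter runs and digit runs; an input such as 'a2b' would make rbind() recycle the shorter vector. As a rough sketch of an alternative (the helper name split_runs is made up for illustration and is not part of the answer above), the runs can be extracted directly with gregexpr and regmatches instead of being split and re-interleaved:

```r
# Hypothetical helper: extract every run of digits or non-digits, in order.
split_runs <- function(s) {
  # "[0-9]+|[^0-9]+" matches either a run of digits or a run of non-digits.
  matches <- gregexpr("[0-9]+|[^0-9]+", s)
  regmatches(s, matches)[[1]]
}

split_runs("aa2ss3d22f5")
# [1] "aa" "2"  "ss" "3"  "d"  "22" "f"  "5"
split_runs("a2b")
# [1] "a" "2" "b"
```

Because this extracts matches rather than splitting, it should not matter whether the string starts or ends with a letter or a digit.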

Upvotes: 1

Maurits Evers

Reputation: 50668

An option is to use look-aheads/look-behinds:

ss <- "w17u2"

unlist(strsplit(ss, "((?<=[a-z])(?![a-z])|(?<=\\d)(?!\\d))", perl = TRUE))
#[1] "w"  "17" "u"  "2"

Explanation:

(?<=[a-z])(?![a-z]) splits the string at the position where the preceding character matches [a-z] and the following character does not match [a-z]. Similarly, (?<=\\d)(?!\\d) splits the string at the position where the preceding character is a digit and the following character is not a digit. The final regular expression is the alternation (|) of both patterns.
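
To make the two halves of the pattern easier to see, here is a small sketch that builds the same regular expression from named pieces and wraps it in a function. The names after_letter, after_digit and split_alnum are illustrative only and are not from the answer itself.

```r
# The two alternatives from the explanation above, as R string literals.
after_letter <- "(?<=[a-z])(?![a-z])"  # position right after a run of lowercase letters
after_digit  <- "(?<=\\d)(?!\\d)"      # position right after a run of digits
pattern <- paste0("(", after_letter, "|", after_digit, ")")

# Hypothetical wrapper; perl = TRUE is needed because R's default
# regex engine does not support look-ahead/look-behind.
split_alnum <- function(s) strsplit(s, pattern, perl = TRUE)[[1]]

split_alnum("w17u2")
#[1] "w"  "17" "u"  "2"
split_alnum("aa2ss3d22f5")
#[1] "aa" "2"  "ss" "3"  "d"  "22" "f"  "5"
```

Note that [a-z] covers lowercase letters only; for mixed-case input, the [a-zA-Z] class from the question could presumably be substituted in both places.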

Upvotes: 2
