Reputation: 13103
I would like to split strings between the last letter and first number:
dat <- read.table(text = "
x y
a1 0.1
a2 0.2
a3 0.3
a4 0.4
df1 0.1
df2 0.2
df13 0.3
df24 0.4
fcs111 0.1
fcs912 0.2
fcs113 0.3
fcsb8114 0.4",
header=TRUE, stringsAsFactors=FALSE)
desired.result <- read.table(text = "
x1 x2 y
a 1 0.1
a 2 0.2
a 3 0.3
a 4 0.4
df 1 0.1
df 2 0.2
df 13 0.3
df 24 0.4
fcs 111 0.1
fcs 912 0.2
fcs 113 0.3
fcsb 8114 0.4",
header=TRUE, stringsAsFactors=FALSE)
There are a number of similar questions on StackOverflow, but I cannot find this exact situation. I know this must be a basic question. If I put a couple of hours into it I could probably figure it out. Sorry. Thank you for any suggestions. I prefer base R. If this is a duplicate I can delete it.
Upvotes: 6
Views: 1481
Reputation: 121057
The stringr package makes this slightly more readable. In the following example, [[:alpha:]] and [[:digit:]] are locale-independent character classes for letters and digits respectively.
library(stringr)
matches <- str_match(dat$x, "([[:alpha:]]+)([[:digit:]]+)")
desired.result <- data.frame(
x1 = matches[, 2],
x2 = as.numeric(matches[, 3]),
y = dat$y
)
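Since the question asks for base R, the same capture-group idea can probably be written without stringr using regexec and regmatches. This is only a sketch on a three-row subset of the question's data; the names m, parts and result are mine, and it assumes every value of x is letters followed by digits (a non-matching value would be silently dropped by rbind).

```r
# Small subset of the question's data, rebuilt so the example is self-contained
dat <- data.frame(
  x = c("a1", "df13", "fcsb8114"),
  y = c(0.1, 0.3, 0.4),
  stringsAsFactors = FALSE
)

# regexec() finds the full match and both capture groups;
# regmatches() extracts them as one character vector per element of x
m <- regmatches(dat$x, regexec("^([[:alpha:]]+)([[:digit:]]+)$", dat$x))

# Stack into a matrix: column 1 = full match, 2 = letters, 3 = digits
parts <- do.call(rbind, m)

result <- data.frame(
  x1 = parts[, 2],
  x2 = as.numeric(parts[, 3]),
  y  = dat$y,
  stringsAsFactors = FALSE
)
result
```

The column layout mirrors what str_match returns, so the rest of the answer's code carries over unchanged.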
Upvotes: 1
Reputation: 109844
A method using gsub and strsplit:
data.frame(do.call(rbind, strsplit(gsub("([a-zA-Z])([0-9])", "\\1_\\2",
dat$x), "_")), y = dat$y)
## X1 X2 y
## 1 a 1 0.1
## 2 a 2 0.2
## 3 a 3 0.3
## 4 a 4 0.4
## 5 df 1 0.1
## 6 df 2 0.2
## 7 df 13 0.3
## 8 df 24 0.4
## 9 fcs 111 0.1
## 10 fcs 912 0.2
## 11 fcs 113 0.3
## 12 fcsb 8114 0.4
This shows what's happening at each stage:
(a <- gsub("([a-zA-Z])([0-9])", "\\1_\\2", dat$x))
(b <- strsplit(a, "_"))
(d <- do.call(rbind, b))
data.frame(d, y = dat$y)
Upvotes: 2
Reputation: 17189
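You can use the strsplit function and supply a regex pattern as the split argument. The pattern below uses a lookbehind and a lookahead to match the zero-width position between a letter and a digit: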
cbind(dat, do.call(rbind, strsplit(dat$x, split = "(?<=[a-zA-Z])(?=[0-9])", perl = TRUE)))
## x y 1 2
## 1 a1 0.1 a 1
## 2 a2 0.2 a 2
## 3 a3 0.3 a 3
## 4 a4 0.4 a 4
## 5 df1 0.1 df 1
## 6 df2 0.2 df 2
## 7 df13 0.3 df 13
## 8 df24 0.4 df 24
## 9 fcs111 0.1 fcs 111
## 10 fcs912 0.2 fcs 912
## 11 fcs113 0.3 fcs 113
## 12 fcsb8114 0.4 fcsb 8114
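To get closer to the layout in the question (columns x1, x2 and y, with x2 numeric), the split pieces can be named and converted explicitly. A minimal sketch on a subset of the data; pieces and named are names I made up for illustration.

```r
# Small subset of the question's data, rebuilt so the example is self-contained
dat <- data.frame(
  x = c("a1", "df13", "fcsb8114"),
  y = c(0.1, 0.3, 0.4),
  stringsAsFactors = FALSE
)

# Split at the zero-width boundary between a letter and a digit
pieces <- do.call(
  rbind,
  strsplit(dat$x, split = "(?<=[a-zA-Z])(?=[0-9])", perl = TRUE)
)

# Name the columns and convert the digit part to numeric
named <- data.frame(
  x1 = pieces[, 1],
  x2 = as.numeric(pieces[, 2]),
  y  = dat$y,
  stringsAsFactors = FALSE
)
named
```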
Upvotes: 4