Reputation: 343
Say we have a data frame that looks like this:
name_and_age
Wesley Wycombe27 y.o.
Sally Atkinson33 y.o.
Matthew Lee42 y.o.
I would like to separate this into two columns, so that the data frame looks like this:
name age
Wesley Wycombe 27
Sally Atkinson 33
Matthew Lee 42
So far I've tried both separate() and extract(), using [:digit:] and \\d as the regex, but all my attempts have been unsuccessful.
Upvotes: 1
Views: 42
Reputation: 887291
We can use extract to capture the characters as groups. Here, the pattern matches one or more characters that are not digits ([^0-9]+) from the start of the string (^) and captures them as the first group ((...)), then captures the one or more digits that follow (\\d+) as the second group. The rest of the characters (.*) are matched but not captured.
library(dplyr)
library(tidyr)
df1 %>%
  extract(name_and_age, into = c('name', 'age'),
          regex = '^([^0-9]+)(\\d+).*', convert = TRUE)
# name age
#1 Wesley Wycombe 27
#2 Sally Atkinson 33
#3 Matthew Lee 42
# data (define df1 before running the code above)
df1 <- structure(list(name_and_age = c("Wesley Wycombe27 y.o.",
    "Sally Atkinson33 y.o.", "Matthew Lee42 y.o.")),
    class = "data.frame", row.names = c(NA, -3L))
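For comparison, here is a minimal base R sketch of the same capture-group idea, in case tidyr is not available. It reuses the pattern with sub() and back-references; the names pattern and res are only illustrative and not part of the original code. It writes the digit class as [[:digit:]], since a POSIX class only works inside a bracket expression. A bare [:digit:] is read as a set of the literal characters ':', 'd', 'i', 'g' and 't', which may be why the attempts in the question did not match.

```r
# Same sample data as in the question
df1 <- data.frame(
  name_and_age = c("Wesley Wycombe27 y.o.",
                   "Sally Atkinson33 y.o.",
                   "Matthew Lee42 y.o."),
  stringsAsFactors = FALSE
)

# Group 1: leading non-digit characters; group 2: the digits that follow.
# The trailing .* consumes the rest of the string so sub() replaces all of it.
pattern <- "^([^0-9]+)([[:digit:]]+).*$"

# Keep only group 1 for the name and group 2 (as integer) for the age
res <- data.frame(
  name = sub(pattern, "\\1", df1$name_and_age),
  age = as.integer(sub(pattern, "\\2", df1$name_and_age)),
  stringsAsFactors = FALSE
)
res
```

Unlike extract with convert = TRUE, this converts age explicitly with as.integer(), and a row that does not match the pattern would be returned unchanged by sub() rather than as NA, so it is worth checking the input first.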
Upvotes: 2