teogj

Reputation: 343

How to separate a column by first digit

Say we have a data frame that looks like this:

name_and_age
Wesley Wycombe27 y.o.
Sally Atkinson33 y.o.
Matthew Lee42 y.o.

I would like to split this into two columns, so that the data frame looks like this:

name                   age
Wesley Wycombe         27
Sally Atkinson         33
Matthew Lee            42

So far I've tried both separate() and extract(), using [:digit:] and \\d as the regex, but all my attempts have been unsuccessful.

Upvotes: 1

Views: 42

Answers (1)

akrun

Reputation: 887291

We can use extract to capture the two pieces as groups. The pattern starts at the beginning of the string (^) and matches one or more non-digit characters ([^0-9]+) as the first capture group, then one or more digits (\\d+) as the second capture group. The rest of the string (.*) is matched but not captured, so it is dropped. With convert = TRUE, the age column is converted from character to integer.

library(dplyr)
library(tidyr)
df1 %>%
  extract(name_and_age, into = c('name', 'age'),
          regex = '^([^0-9]+)(\\d+).*', convert = TRUE)
#           name age
#1 Wesley Wycombe  27
#2 Sally Atkinson  33
#3    Matthew Lee  42
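
As a rough cross-check of how the two capture groups work, the same idea can be sketched in base R with sub() and backreferences, without any packages. This is only an illustrative alternative, not part of the tidyr approach above; the names pattern and result are made up for the example, and the sample data is re-created so the snippet runs on its own. Note that a POSIX class such as [:digit:] is only valid inside a bracket expression, so it has to be written as [[:digit:]] (or [^[:digit:]] when negated).

```r
# Sample data re-created here so the sketch is self-contained
df1 <- data.frame(
  name_and_age = c("Wesley Wycombe27 y.o.",
                   "Sally Atkinson33 y.o.",
                   "Matthew Lee42 y.o."),
  stringsAsFactors = FALSE
)

# Group 1: leading run of non-digits; group 2: the run of digits that follows.
# The trailing .* consumes the rest so sub() replaces the whole string.
pattern <- "^([^[:digit:]]+)([[:digit:]]+).*$"

result <- data.frame(
  name = sub(pattern, "\\1", df1$name_and_age),              # keep group 1
  age  = as.integer(sub(pattern, "\\2", df1$name_and_age)),  # keep group 2
  stringsAsFactors = FALSE
)

print(result)
```

If a string does not match the pattern at all, sub() returns it unchanged, so real data may need an extra check (for example with grepl()) before converting to integer.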

data

df1 <- structure(
  list(name_and_age = c("Wesley Wycombe27 y.o.",
                        "Sally Atkinson33 y.o.",
                        "Matthew Lee42 y.o.")),
  class = "data.frame", row.names = c(NA, -3L))

Upvotes: 2
