user1658170

Reputation: 858

Split a column into numeric and non-numeric components

I need to split one column into two, so that the resulting columns contain the numeric and the character portions of the original column.

df <- data.frame(myCol = c("24 hours", "36days", "1month", "2 months +"))

 myCol
 24 hours
 36days
 1month
 2 months +

The result should be:

alpha   numeric
hours      24
days       36
month      1
months +   2

Note the inconsistent formatting of the original dataframe (sometimes with spaces, sometimes without).

Tidyverse or base R solutions are fine.

Thanks

Upvotes: 0

Views: 586

Answers (2)

MKa

Reputation: 2318

You could do:

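library(stringr)
# Pull out the first run of digits and store it as a number
df$numeric <- as.numeric(str_extract(df$myCol, "[0-9]+"))
# Drop that run of digits, then trim the whitespace left behind
df$alpha <- str_trim(str_remove(df$myCol, "[0-9]+"))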

Or with base functions:

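# regmatches() drops values without a match, so this assumes every row contains digits
df$numeric <- as.numeric(regmatches(df$myCol, regexpr("[0-9]+", df$myCol)))
# Remove the digits, then trim the whitespace left behind
df$alpha <- trimws(gsub("[0-9]+", "", df$myCol))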
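
A variation on the base approach, as a minimal sketch: one pattern with two capture groups can produce both columns, which keeps the two pieces in step and absorbs the optional space between them. The pattern below is an assumption about the data (digits first, then an optional space, then everything else), and stringsAsFactors = FALSE is only there so the column is character on older R versions.

```r
df <- data.frame(myCol = c("24 hours", "36days", "1month", "2 months +"),
                 stringsAsFactors = FALSE)

# Group 1: leading run of digits; group 2: whatever follows the optional whitespace
pattern <- "^\\s*([0-9]+)\\s*(.*)$"

# sub() with a backreference keeps only the requested capture group
df$numeric <- as.numeric(sub(pattern, "\\1", df$myCol))
df$alpha   <- trimws(sub(pattern, "\\2", df$myCol))
```

A value that does not start with digits is returned unchanged by sub(), so as.numeric() would give NA with a warning for that row.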

Upvotes: 0

Matt

Reputation: 2987

One solution could be:

library(tidyverse)
df %>%
  separate(myCol,
           into = c("numeric", "alpha"),
           sep = "(?=[a-z +]+)(?<=[0-9])")

Which returns:

  numeric  alpha
1    24     hours
2    36      days
3     1     month
4     2  months +
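
Because the lookaround split is zero-width, the space in values like "24 hours" seems to stay at the start of alpha, and both columns remain character. A possible alternative, sketched here with tidyr::extract(), is to capture the two parts explicitly and let convert = TRUE turn the digits into a number. The regex is an assumption about the data (digits first, optional whitespace, then the rest), and res is just an illustrative name.

```r
library(tidyr)

df <- data.frame(myCol = c("24 hours", "36days", "1month", "2 months +"),
                 stringsAsFactors = FALSE)

# Each capture group becomes one output column; the whitespace between
# the groups is matched but not captured, so it is dropped
res <- tidyr::extract(df, myCol,
                      into = c("numeric", "alpha"),
                      regex = "([0-9]+)\\s*(.*)",
                      convert = TRUE)
```

Rows that do not match the regex come back as NA in both columns.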

Upvotes: 4
