chagag

Reputation: 65

Separate a column into 2 columns at the last specific character in R (appearing different times)

I have a data frame like this:

id <- c("1", "2", "3")
col <- c("hello, my 5 year old son is joe, 76", "hello world, 55", "can't say I didn't, 3")
df <- data.frame(id, col)

I want to split col into exactly two columns: one holding only the number after the last comma (and no other number), the other holding the rest of the response. So my desired output is:

id     text                             number
1     hello, my 5 year old son is joe   76
2     hello world                       55
3     can't say I didn't                3

I've tried:

separate(df, col, into = c("text", "number"), sep = ",(?=[^_]+$)")

but it splits at the first comma, so any text containing an earlier comma gets cut there instead of at the last one.

Any suggestions?

Upvotes: 0

Views: 518

Answers (2)

akrun

Reputation: 887511

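We can use separate with a regex lookahead. The pattern matches the comma and any spaces after it (\\s*), but only when they are followed by one or more digits running to the end of the string ([0-9]+$, inside the lookahead). The digits themselves are not consumed by the match, so they end up in the number column, and convert = TRUE turns that column into an integer.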

library(dplyr)
library(tidyr)
df %>%
    separate(col, into = c('text', 'number'), ',\\s*(?=[0-9]+$)', convert = TRUE)

-output

  id                            text number
1  1 hello, my 5 year old son is joe     76
2  2                     hello world     55
3  3              can't say I didn't      3
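To see why this pattern picks the last comma while the one in the question picks the first, the two can be compared in base R. This is only a rough sketch and not part of the answer above: the variable names are illustrative, and perl = TRUE is needed because the default regex engine does not support lookahead.

```r
# Base R sketch: compare where each pattern first matches.
x <- "hello, my 5 year old son is joe, 76"

# Pattern from the question: a comma followed by any non-underscore characters
first_try <- regexpr(",(?=[^_]+$)", x, perl = TRUE)

# Pattern from this answer: a comma only when digits run to the end of the string
last_comma <- regexpr(",\\s*(?=[0-9]+$)", x, perl = TRUE)

as.integer(first_try)   # 6: the comma right after "hello"
as.integer(last_comma)  # 32: the comma right before "76"

# Splitting on the second pattern leaves the earlier comma inside the text
parts <- strsplit(x, ",\\s*(?=[0-9]+$)", perl = TRUE)[[1]]
parts
```

Here parts should hold the text and the number as two strings, with the comma after "hello" left untouched.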

Upvotes: 1

Karthik S

Reputation: 11594

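Using extract with two capture groups, one for the text and one for the trailing digits:

library(dplyr)
library(tidyr)
df %>% extract(col = 'col', into = c("text", "number"), regex = '(.*),\\s(\\d+$)')
  id                            text number
1  1 hello, my 5 year old son is joe     76
2  2                     hello world     55
3  3              can't say I didn't      3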
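The split lands on the last comma because (.*) is greedy: it takes as much of the string as it can while still leaving a comma and trailing digits for the rest of the pattern. A rough base R equivalent, assuming every row ends in a comma followed by digits, might look like the sketch below. The names pattern and out are illustrative only, and \\s* is used instead of \\s so that a missing space is tolerated.

```r
id  <- c("1", "2", "3")
col <- c("hello, my 5 year old son is joe, 76", "hello world, 55", "can't say I didn't, 3")
df  <- data.frame(id, col)

# Same two capture groups as in the extract() call, with \\s* to allow no space
pattern <- "^(.*),\\s*(\\d+)$"

out <- data.frame(
  id     = df$id,
  # Greedy group 1: everything before the last comma
  text   = sub(pattern, "\\1", df$col, perl = TRUE),
  # Group 2: the trailing digits, converted to integer
  number = as.integer(sub(pattern, "\\2", df$col, perl = TRUE))
)
out
```

Rows that do not match the pattern are returned unchanged by sub, so number would become NA (with a warning) for them; that case does not occur in the sample data.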

Upvotes: 2
