R Separate column based on pattern

Question

My dataset looks like this:

dataset = data.frame(Comments=c('Wow... Loved this place.   1','Crust is not good.  0','Not tasty and the texture was just nasty.   0'))

I'm trying to split the dataset into two columns, so that the first column contains only the text and the second column contains only the number at the end of each string.

Here's my attempt:

library(dplyr)
library(tidyr)

dataset = dataset %>%
  separate(Comments, into = c("Comment", "Score"), sep = " (?=[^ ]+$)")

However, I'm not getting a clean separation. I've looked at other solutions online, but no luck yet.

Any help on this would be greatly appreciated.
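
One way to see why the split is not clean, sketched here with base R's strsplit() rather than separate() (so it assumes both interpret the pattern the same way): the sample strings have two or three spaces before the score, but the pattern consumes only the last of them, so the remaining spaces stay attached to the text. Matching the whole whitespace run with \\s+ is one possible adjustment. The variable names below are only for illustration.

```r
# First sample string from the question (three spaces before the score)
x <- "Wow... Loved this place.   1"

# Original pattern: consumes only the single space right before the last token
original <- strsplit(x, " (?=[^ ]+$)", perl = TRUE)[[1]]

# Adjusted pattern (an assumption, not from the original post):
# consume the whole run of whitespace that precedes the trailing digits
adjusted <- strsplit(x, "\\s+(?=\\d+$)", perl = TRUE)[[1]]

nchar(original[1])  # longer, because trailing spaces are still attached
nchar(adjusted[1])
```

If the behavior carries over, the adjusted pattern could be passed as sep to separate(), or the Comment column could be cleaned afterwards with trimws().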

bjorn2bewild · Accepted Answer

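Perhaps you could use substr and gsub:

dataset <- dataset %>%
  # data.frame() may have stored the column as a factor
  mutate(Comments = as.character(Comments)) %>%
  # the score is the last character of each string
  mutate(Score = substr(Comments, nchar(Comments), nchar(Comments))) %>%
  # drop the trailing whitespace and digit (backslashes must be doubled in R strings)
  mutate(Comment = gsub("\\s+\\d$", "", Comments))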
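
For comparison, here is a minimal base R sketch of the same idea that avoids dplyr and captures both parts with one regular expression. It assumes every string ends in whitespace followed by one or more digits. The pattern and the as.integer() conversion are illustrative choices, not part of the answer above.

```r
# Sample data from the question, kept as character rather than factor
dataset <- data.frame(
  Comments = c("Wow... Loved this place.   1",
               "Crust is not good.  0",
               "Not tasty and the texture was just nasty.   0"),
  stringsAsFactors = FALSE
)

# Group 1: the text (lazy, so it stops before the trailing whitespace)
# Group 2: the digits at the very end of the string
pattern <- "^(.*?)\\s+(\\d+)$"

# Keep only group 1 as the comment and group 2 as the numeric score
dataset$Comment <- sub(pattern, "\\1", dataset$Comments, perl = TRUE)
dataset$Score <- as.integer(sub(pattern, "\\2", dataset$Comments, perl = TRUE))
```

A string that does not match the pattern is returned unchanged by sub(), so as.integer() would then yield NA with a warning for that row, which makes malformed input easy to spot.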
