Sand

Reputation: 115

Extract string after first comma and store it in another column using R

I want to extract the text after the first comma and put it in another column of the same data frame.

s2 <- data.frame(text =c("Hi Prashant, As per the contract, employees can avail for various services like gym, recreation center, etc","Various dishes are available in canteen like pasta, rice dishes, etc"),stringsAsFactors = FALSE)

s2$new = gsub(".*,", "", s2)

But it's splitting after the last comma, which I don't want.

The expected output, after splitting the text at the first comma and storing the remainder in another column called 'new', should look like:

first row: As per the contract, employees can avail for various services like gym, recreation center, etc.

second row: rice dishes, etc.

Upvotes: 2

Views: 4207

Answers (2)

Kerry Jackson

Reputation: 1871

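One way to do this is with the stringr library and its str_split_fixed function, which splits a vector of strings into a matrix of substrings at the pattern match. Asking for 2 pieces splits only at the first comma, and the second column holds everything after it.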

library(stringr)

s2$new <- str_split_fixed(s2$text, ",", 2)[,2]
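To see what the matrix looks like, here is a small sketch using the data from the question plus one extra row without a comma (that third string is made up for illustration). The trimming step with str_trim is an optional addition, not part of the answer above; it just drops the space that follows the comma.

```r
library(stringr)

# Data from the question, plus a hypothetical third row with no comma
s2 <- data.frame(
  text = c(
    "Hi Prashant, As per the contract, employees can avail for various services like gym, recreation center, etc",
    "Various dishes are available in canteen like pasta, rice dishes, etc",
    "No comma here"
  ),
  stringsAsFactors = FALSE
)

# n = 2 means "at most two pieces", so only the first comma is used as a split point
parts <- str_split_fixed(s2$text, ",", 2)

# Column 1 is the text before the first comma, column 2 is everything after it
s2$new <- str_trim(parts[, 2])

# Rows with no comma get an empty string in column 2
print(s2$new)
```

One thing to keep in mind when choosing between this and sub: for a string with no comma, str_split_fixed pads the second column with an empty string, whereas sub leaves the original string unchanged because nothing matched.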

Upvotes: 0

Tim Biegeleisen

Reputation: 520928

Use sub and make the dot lazy:

s2$new <- sub("^.*?,", "", s2$text)

Or, another way:

s2$new <- sub("^[^,]*,", "", s2$text)

The problem with your current pattern is that .* by default is greedy, meaning it will consume everything up until the last comma. But in your case, you want it to stop matching at the first comma.
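The difference is easy to check side by side. The sketch below runs the greedy pattern and both fixes on the data from the question; the variable names greedy, lazy and negated are just for illustration, and the final trimws call is an optional extra to drop the space left after the comma. It also applies the pattern to the column s2$text, since the call in the question passes the whole data frame s2, which R coerces to a single deparsed string rather than processing row by row.

```r
# Sample data from the question
s2 <- data.frame(
  text = c(
    "Hi Prashant, As per the contract, employees can avail for various services like gym, recreation center, etc",
    "Various dishes are available in canteen like pasta, rice dishes, etc"
  ),
  stringsAsFactors = FALSE
)

# Greedy: ".*," runs to the LAST comma, so only the tail " etc" survives
greedy <- sub(".*,", "", s2$text)

# Lazy: ".*?," stops at the FIRST comma
lazy <- sub("^.*?,", "", s2$text)

# Negated class: "[^,]*," cannot cross a comma, so it also stops at the first one
negated <- sub("^[^,]*,", "", s2$text)

# Optional: remove the leading space that followed the comma
s2$new <- trimws(negated)

print(greedy)
print(s2$new)
```

The lazy and negated-class versions should give the same result here. The negated class is arguably the more portable of the two, since it does not depend on the regex engine supporting lazy quantifiers.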

Upvotes: 4
