llb1706

Reputation: 45

Remove whole sentences from dataset

I have a dataset that looks like this:

output
Others. Specify (separate by comma if there is more than one):
Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one):
Family upbringing
Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one):
Did not say

How can I remove the sentence "Others. Specify (separate by comma if there is more than one):" from the dataset? I've tried

gsub("Others. Specify (separate by comma if there is more than one):", "", datset$output)

and str_remove_all(), but neither worked.

Upvotes: 0

Views: 32

Answers (1)

stefan

Reputation: 124213

You can get the desired result by adding fixed = TRUE, which tells gsub() to match the pattern as a literal string instead of interpreting it as a regular expression:

gsub("Others. Specify (separate by comma if there is more than one):", 
     "", 
     datset$output, 
     fixed = TRUE)
#> [1] ""                             "Everyone cries/has feelings,"
#> [3] "Family upbringing"            "Everyone cries/has feelings,"
#> [5] "Did not say"
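Since the question also mentions str_remove_all(), here is a minimal sketch of the same literal matching in stringr, where the pattern is wrapped in fixed(). It assumes the stringr package is installed; the vector x and the name cleaned are made up for illustration.

```r
library(stringr)  # assumption: stringr is installed

# Small illustrative vector (not the asker's full data)
x <- c(
  "Others. Specify (separate by comma if there is more than one):",
  "Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one):",
  "Did not say"
)

# fixed() makes stringr treat the pattern as a literal string, not a regex
cleaned <- str_remove_all(
  x,
  fixed("Others. Specify (separate by comma if there is more than one):")
)
cleaned
```

As with gsub(), the comma before the removed sentence stays in place, so a follow-up step would be needed if it should go too.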

A second option is to escape the special characters, which in your case are the . and, in particular, the ( and ). In a regex, . matches any character and () creates a capturing group, so the parentheses in your pattern are never matched literally. To match a literal ( you have to write \\( in an R string:

gsub("Others\\. Specify \\(separate by comma if there is more than one\\):", "", datset$output)
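To see why the original call left the data unchanged, a quick check with grepl() can help. This is only a sketch on a single string; the names x, pat, unescaped, literal and escaped are made up for illustration.

```r
x <- "Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one):"
pat <- "Others. Specify (separate by comma if there is more than one):"

# Read as a regex, ( ) form a group, so the literal parentheses in x are not matched
unescaped <- grepl(pat, x)

# Literal matching finds the sentence
literal <- grepl(pat, x, fixed = TRUE)

# An escaped regex finds it as well
escaped <- grepl("Others\\. Specify \\(separate by comma if there is more than one\\):", x)

c(unescaped = unescaped, literal = literal, escaped = escaped)
```

The first check should come back FALSE and the other two TRUE, which is why gsub() with the unescaped pattern had nothing to replace.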

DATA

datset <- data.frame(
  output = c(
    "Others. Specify (separate by comma if there is more than one):",
    "Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one):",
    "Family upbringing",
    "Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one):",
    "Did not say"
  )
)

Upvotes: 1
