Adam

Reputation: 444

Strip words out of a character vector

I have a character vector, "vars".

vars=c("cogD", "relevel(cbsnivcat3f, \"Lower\")", "relevel(leidingf, \"geen\")", 
"relevel(ocdisf, \"Law\")")

I want to keep only the word between the "(" and the ",". For example, from relevel(cbsnivcat3f, \"Lower\") I want only "cbsnivcat3f". Elements without brackets, such as "cogD", should stay as they are.

My goal is: vars = c("cogD", "cbsnivcat3f", "leidingf", "ocdisf")

Upvotes: 0

Views: 62

Answers (2)

Tyler Rinker

Reputation: 109984

The rm_between function in the qdapRegex package, which I maintain, lets you grab substrings between a left and a right bound. That works nicely for this situation. Elements with no match come back as NA, so in the last step we replace any NAs with the original values.

library(qdapRegex)
out <- unlist(rm_between(vars, "(", ",", extract=TRUE))
out[is.na(out)] <- vars[is.na(out)]
out

## [1] "cogD"        "cbsnivcat3f" "leidingf"    "ocdisf" 

The regular expression behind the scenes is: "(\\().*?(,)", which can also be used with base, stringi or stringr approaches.
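For the base R route, here is a minimal sketch rather than a definitive implementation. It assumes a lookbehind/lookahead variant of the idea, so that the "(" and "," themselves are left out of the match; this needs perl = TRUE. The names m, hit and out are only illustrative.

```r
vars <- c("cogD", "relevel(cbsnivcat3f, \"Lower\")", "relevel(leidingf, \"geen\")",
          "relevel(ocdisf, \"Law\")")

# Match the text after a "(" and before the first ",", excluding both delimiters.
# Lookarounds are PCRE features, hence perl = TRUE.
m <- regexpr("(?<=\\().*?(?=,)", vars, perl = TRUE)

# regexpr() returns -1 for elements with no match (here "cogD").
hit <- m != -1

# Start from the originals so unmatched elements are kept unchanged.
out <- vars
out[hit] <- regmatches(vars, m)
out
```

regmatches() only returns the elements that matched, which is why the result is assigned into out[hit] rather than into out directly.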

Upvotes: 0

Andrie

Reputation: 179488

Try a regular expression:

gsub("relevel\\((.*?), .*", "\\1", vars)
[1] "cogD"        "cbsnivcat3f" "leidingf"    "ocdisf"   
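The pattern captures whatever follows "relevel(" up to the first ", " and replaces the whole string with that capture; strings that do not match, like "cogD", are returned unchanged. If the wrapping function is not always relevel, a variant along these lines might work. It is only a sketch, and it assumes each element contains at most one call with a comma after the first argument; the name stripped is illustrative.

```r
vars <- c("cogD", "relevel(cbsnivcat3f, \"Lower\")", "relevel(leidingf, \"geen\")",
          "relevel(ocdisf, \"Law\")")

# ^[^(]*   any function name (everything before the first "(")
# \\(      the literal opening bracket
# ([^,]*)  capture everything up to the first comma
# ,.*$     the comma and the rest of the string
stripped <- sub("^[^(]*\\(([^,]*),.*$", "\\1", vars)
stripped
```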

Upvotes: 2
