Reputation: 1489
I have a set of character vectors:
a <- "bmi + ch | study"
b <- "bmi * ch | study"
c <- "bmi * ch - 1 | study"
d <- "bmi * ch + 0 | study"
e <- "bmi:ch + 0 | study"
In this example, I want to extract the two strings "bmi" and "ch", i.e. the desired output is c("bmi", "ch").
The strings above are just examples; the elements to be extracted could be anything, not only ch and bmi. I'm looking for a general solution without hard-coding.
I have tried unlist(stringr::str_extract_all(a, "bmi|ch")). However, this requires me to define the pattern "bmi|ch" manually to get the desired output, so it is not a general solution.
Upvotes: 2
Views: 611
Reputation: 13319
This approach is a bit more complicated and less efficient, but I will leave it here in case someone finds it interesting.
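vecs <- list(a, b, c, d, e)
# Keep everything up to the last lowercase letter that is followed by a
# non-word character; for these inputs that drops "- 1", "+ 0" and "| study"
split_me <- Map(function(x) gsub("([a-z].*[a-z])(\\W.*)", "\\1", x, perl = TRUE), vecs)
# Remove whitespace, then split on the formula operators +, * and :
lapply(split_me, function(x) unlist(strsplit(gsub("\\s", "", x), "[+*:]")))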
Result
[[1]]
[1] "bmi" "ch"
[[2]]
[1] "bmi" "ch"
[[3]]
[1] "bmi" "ch"
[[4]]
[1] "bmi" "ch"
[[5]]
[1] "bmi" "ch"
Data
a <- "bmi + ch | study"
b <- "bmi * ch | study"
c <- "bmi * ch - 1 | study"
d <- "bmi * ch + 0 | study"
e <- "bmi:ch + 0 | study"
vecs<-list(a,b, c,d,e)
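For comparison, here is a minimal sketch of a shorter regex-only variant. It is not the code above: it assumes that everything before the first | is the part of interest and that variable names start with a letter and contain only letters, digits, dots, and underscores. The helper name extract_terms is hypothetical. Under these assumptions a function call such as log(bmi) would also return "log", so treat it as a rough illustration only.

```r
# Hypothetical helper: pull identifier-like tokens from the part before "|".
# Assumes names start with a letter; bare numbers such as 0 or 1 are skipped.
extract_terms <- function(x) {
  lhs <- sub("\\|.*", "", x)  # drop the grouping part, e.g. "| study"
  regmatches(lhs, gregexpr("[A-Za-z][A-Za-z0-9._]*", lhs))
}

v <- c("bmi + ch | study",
       "bmi * ch | study",
       "bmi * ch - 1 | study",
       "bmi * ch + 0 | study",
       "bmi:ch + 0 | study")

res <- extract_terms(v)  # a list with one character vector per input string
```

Because sub, gregexpr, and regmatches are all vectorised, the helper takes the whole vector at once and needs no explicit loop.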
Upvotes: 2
Reputation: 270248
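Assume the vector v defined in the Note at the end. We can then lapply over it using the function shown: sub strips everything from the | onward, parse turns the remaining text into an R expression, and all.vars returns the variable names it contains. If the number of variables is always the same, you could alternatively use sapply, which gives a matrix.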
lapply(sub("\\|.*", "", v), function(x) all.vars(parse(text = x)))
giving:
[[1]]
[1] "bmi" "ch"
[[2]]
[1] "bmi" "ch"
[[3]]
[1] "bmi" "ch"
[[4]]
[1] "bmi" "ch"
[[5]]
[1] "bmi" "ch"
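The sapply alternative mentioned above could look roughly like the following sketch. It assumes every string yields the same number of variables (two here); otherwise sapply falls back to returning a list. The names lhs and vars are illustrative only.

```r
v <- c("bmi + ch | study",
       "bmi * ch | study",
       "bmi * ch - 1 | study",
       "bmi * ch + 0 | study",
       "bmi:ch + 0 | study")

# Strip the grouping part, then collect the variable names of each expression.
lhs <- sub("\\|.*", "", v)
vars <- sapply(lhs, function(x) all.vars(parse(text = x)), USE.NAMES = FALSE)

# One column per input string; transpose to get one row per string instead.
vars_by_row <- t(vars)
```

USE.NAMES = FALSE keeps the input strings from being used as column names, which would otherwise make the printed matrix hard to read.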
Note
a <- "bmi + ch | study"
b <- "bmi * ch | study"
c <- "bmi * ch - 1 | study"
d <- "bmi * ch + 0 | study"
e <- "bmi:ch + 0 | study"
v <- c(a, b, c, d, e)
Upvotes: 5