I am trying to convert an expression such as [[a], [b]] into list(c(a), c(b)) (basically a Java dictionary into an R list). As a first step, I would like to convert each inner expression [a] into an equivalent c(a). According to "How to replace square brackets with curly brackets using R's regex?", I can use the regular expression "\\[(.*?)\\]" or "\\[([^]]*)\\]".
This works when there is only one pair of [] brackets, but not with nested ones like [[, because the match starts at the first opening bracket, resulting in "c([a), c(b])" instead of "[c(a), c(b)]". How can I make sure I only match the innermost brackets in a string that contains nested ones, such as [[], []]?
vec <- c("[a]", "[[a], [b]]")
gsub("\\[(.*?)\\]", "c(\\1)", vec)
#> [1] "c(a)" "c([a), c(b])"
gsub("\\[([^]]*)\\]", "c(\\1)", vec)
#> [1] "c(a)" "c([a), c(b)]"
Created on 2021-02-15 by the reprex package (v0.3.0)
While "Remove any text inside square brackets in r" suggests how to deal with the regex itself, it doesn't address the "nested" component of the problem.
You can run the replacement repeatedly until there are no more changes.
vec <- c("[a]", "[[a], [b]]")
(vec2 <- gsub("\\[([^][]*)\\]", "c(\\1)", vec))
# [1] "c(a)" "[c(a), c(b)]"
(vec3 <- gsub("\\[([^][]*)\\]", "c(\\1)", vec2))
# [1] "c(a)" "c(c(a), c(b))"
The change is to exclude both the opening [ and the closing ] bracket from the characters allowed between the brackets (the [^][] class), so the pattern only matches innermost pairs, that is, those with no brackets inside them.
It should be feasible to wrap the gsub() call in a while loop that exits as soon as no change is detected.
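A minimal sketch of such a loop, assuming the same pattern and replacement as above. The function name replace_brackets is hypothetical (not part of base R or any package), and it uses repeat with break rather than while, purely as a matter of taste:

```r
# Hypothetical helper: rewrite innermost [...] pairs as c(...) and
# repeat until a pass leaves the input unchanged.
replace_brackets <- function(x,
                             pattern = "\\[([^][]*)\\]",
                             replacement = "c(\\1)") {
  repeat {
    out <- gsub(pattern, replacement, x)
    # Stop as soon as no element was modified by this pass
    if (identical(out, x)) break
    x <- out
  }
  x
}

vec <- c("[a]", "[[a], [b]]")
replace_brackets(vec)
# expected elements: "c(a)" and "c(c(a), c(b))"
```

Since the replacement contains no square brackets, every pass removes at least one bracket pair, so the loop terminates; unbalanced brackets are simply left in place. Turning the outermost c(...) into list(...) would still need a separate step.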