Reputation: 920
I want to remove a closing parenthesis, ")", from a string when there is no matching opening parenthesis before it. Here are two examples and my desired output:
string1= "1548-(2), 1549)"
string2= "1401-(15~145), 153), 156"
desired_string1= "1548-(2), 1549"
desired_string2= "1401-(15~145), 153, 156"
Additional: Thank you for the answers. They work well for the examples above, but not for the following case:
additional = "1543-(2, 6), 1548-(2), 1549), 1334-(1~5), 1401-(15, 145)"
desired_additional = "1543-(2, 6), 1548-(2), 1549, 1334-(1~5), 1401-(15, 145)"
Upvotes: 1
Views: 120
Reputation: 21400
You can use lookbehind and lookahead:
library(stringr)
str_replace_all(str, "(?<=,[^)]{1,10})\\)(?=,|$)", "")
[1] "1548-(2), 1549" "1401-(15~145), 153, 156"
Here we remove only those ")" characters that (i) come after a "," followed by 1-10 characters that are not ")", and (ii) precede either a "," or the end of the string.
Data:
str = c("1548-(2), 1549)", "1401-(15~145), 153), 156")
EDIT:
If the data is more complicated, like this:
str = c("1548-(2), 1549))", "1401-(15~145), 153) ),) 156)")
you can use negative lookbehind:
str_replace_all(str, "(?<!\\([^)]{0,10})\\)\\)?", "")
[1] "1548-(2), 1549" "1401-(15~145), 153 , 156"
Here we remove one or two consecutive ")" characters if there is no opening parenthesis "(" within the preceding 0 to 10 characters (none of which may be a ")").
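As a quick check, here is a small sketch that applies the negative-lookbehind pattern from the EDIT to the `additional` string from the question, and to one made-up input where the "(" sits outside the 10-character window. The second input is purely illustrative and not from the question; it is only there to show that the window size is a limit you may need to raise for longer bracketed groups.

```r
library(stringr)

# The negative-lookbehind pattern from the EDIT above
pat <- "(?<!\\([^)]{0,10})\\)\\)?"

# 'additional' as given in the question
additional <- "1543-(2, 6), 1548-(2), 1549), 1334-(1~5), 1401-(15, 145)"
res_additional <- str_replace_all(additional, pat, "")
res_additional
#[1] "1543-(2, 6), 1548-(2), 1549, 1334-(1~5), 1401-(15, 145)"

# Made-up input: the "(" is 12 characters before the ")", outside the window,
# so this balanced ")" is removed as well
res_window <- str_replace_all("(abcdefghijkl)", pat, "")
res_window
#[1] "(abcdefghijkl"
```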
Upvotes: 1
Reputation: 887118
We could write a function that gets the locations of the opening and closing parentheses. If there are more closing than opening parentheses, the surplus closing ones, counted from the end of the string, are treated as unbalanced, and their positions are passed to str_sub (or substr) to replace them with "".
library(stringr)
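f1 <- function(str1) {
  # start positions of every "(" and every ")"
  i1 <- str_locate_all(str1, '\\(')[[1]][, 1]
  i2 <- str_locate_all(str1, '\\)')[[1]][, 1]
  # flag the closing parentheses beyond the number of opening ones
  i3 <- seq_along(i2) > length(i1)
  if (any(i3)) {
    # delete from right to left so the earlier positions stay valid
    for (p in rev(i2[i3])) str_sub(str1, p, p) <- ''
  }
  return(str1)
}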
Testing:
f1(string1)
#[1] "1548-(2), 1549"
f1(string2)
#[1] "1401-(15~145), 153, 156"
f1("((a))")
#[1] "((a))"
f1("(a))")
#[1] "(a)"
For the updated 'additional' string, split on each comma that follows a ")" and apply f1 to every piece:
paste(sapply(strsplit(additional, "(?<=\\)),", perl = TRUE)[[1]],
f1), collapse=",")
#[1] "1543-(2, 6), 1548-(2), 1549, 1334-(1~5), 1401-(15, 145)"
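As an alternative to splitting the string first, a single left-to-right scan with a depth counter should also handle the 'additional' case. This is only a minimal base R sketch, not part of the approach above; the name `drop_unmatched_close` is made up for illustration, and it assumes a single input string rather than a vector.

```r
# Hypothetical helper (name invented for this sketch): drop every ")" that
# has no unmatched "(" to its left, using a running depth counter.
drop_unmatched_close <- function(x) {
  chars <- strsplit(x, "")[[1]]      # split into single characters
  keep  <- rep(TRUE, length(chars))  # which characters to keep
  depth <- 0L                        # number of "(" still waiting for a ")"
  for (i in seq_along(chars)) {
    if (chars[i] == "(") {
      depth <- depth + 1L
    } else if (chars[i] == ")") {
      if (depth > 0L) {
        depth <- depth - 1L          # this ")" closes an open "("
      } else {
        keep[i] <- FALSE             # nothing to close: mark for removal
      }
    }
  }
  paste(chars[keep], collapse = "")
}

additional <- "1543-(2, 6), 1548-(2), 1549), 1334-(1~5), 1401-(15, 145)"
drop_unmatched_close(additional)
#[1] "1543-(2, 6), 1548-(2), 1549, 1334-(1~5), 1401-(15, 145)"
```

Because the counter tracks where each ")" sits relative to the opening parentheses, it does not depend on the unbalanced one being the last ")" in the string. For several strings, something like vapply(x, drop_unmatched_close, character(1)) could be used.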
Upvotes: 1
Reputation: 2096
Here is another solution using the stringr package:
library(stringr)
fun <- function(x) {
  open_cnt = str_count(x, "\\(") # counts opening parentheses
  close_cnt = str_count(x, "\\)") # counts closing parentheses
  if (close_cnt > open_cnt) {
    diff_i = (open_cnt + 1):close_cnt # indexes of the surplus closing parentheses
    diff_loc = str_locate_all(x, "\\)")[[1]][diff_i, 1] # locates them in the string
    paste0(unlist(str_split(x, ""))[-diff_loc], collapse = "") # rebuilds the string without them
  } else {
    x
  }
}
fun(string1)
#[1] "1548-(2), 1549"
fun(string2)
#[1] "1401-(15~145), 153, 156"
fun("((( 0__0 ))))))))))")
#[1] "((( 0__0 )))"
Upvotes: 1