John legend2

Reputation: 920

Remove a closing parenthesis in a string when no parenthesis has been opened, in R

I want to remove a closing parenthesis, ")", from a string when there is no matching opening parenthesis before it. Here are examples and my desired output.

string1= "1548-(2), 1549)"
string2= "1401-(15~145), 153), 156"

desired_string1= "1548-(2), 1549"
desired_string2= "1401-(15~145), 153, 156"

Additional: Thank you for the answers. They work well for the examples above, but not for the following case:

additional = "1543-(2, 6), 1548-(2), 1549), 1334-(1~5),  1401-(15, 145)"
desired_additional = "1543-(2, 6), 1548-(2), 1549, 1334-(1~5),  1401-(15, 145)"

Upvotes: 1

Views: 120

Answers (3)

Chris Ruehlemann

Reputation: 21400

You can use lookbehind and lookahead:

library(stringr)
str_replace_all(str, "(?<=,[^)]{1,10})\\)(?=,|$)", "")
[1] "1548-(2), 1549"          "1401-(15~145), 153, 156"

Here we remove only those )s that (i) follow a , and then 1-10 characters which are not ), and that (ii) precede either a , or the end of the string.

Data:

str = c("1548-(2), 1549)", "1401-(15~145), 153), 156")

EDIT:

If the data is more complicated, like this:

str = c("1548-(2), 1549))", "1401-(15~145), 153) ),) 156)")

you can use negative lookbehind:

str_replace_all(str, "(?<!\\([^)]{0,10})\\)\\)?", "")
[1] "1548-(2), 1549"           "1401-(15~145), 153 , 156"

Here we remove one or two consecutive )s if there is no opening round parenthesis ( to the left at any distance between 0 and 10 characters (which must not include any )).

Upvotes: 1

akrun

Reputation: 887118

We could create a function that locates the opening and closing parentheses. If the string is unbalanced, there are more closing than opening parentheses, and the positions of the surplus closing ones can be passed to substr or str_sub to replace them with ''.

library(stringr)
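f1 <- function(str1) {
   # positions of every opening and closing parenthesis
   i1 <- str_locate_all(str1, '\\(')[[1]][,1]
   i2 <- str_locate_all(str1, '\\)')[[1]][,1]
   # closing parentheses beyond the number of opening ones are surplus
   i3 <- seq_along(i2) > length(i1)
   # delete from right to left so the remaining positions stay valid
   for(i4 in rev(i2[i3])) {
      str_sub(str1, i4, i4) <- ''
   }
   return(str1)
}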

Testing:

f1(string1)
#[1] "1548-(2), 1549"
f1(string2)
#[1] "1401-(15~145), 153, 156"

f1("((a))")
#[1] "((a))"

f1("(a))")
#[1] "(a)"

Using the updated 'additional'

paste(sapply(strsplit(additional, "(?<=\\)),", perl = TRUE)[[1]], 
       f1), collapse=",")
#[1] "1543-(2, 6), 1548-(2), 1549, 1334-(1~5),  1401-(15, 145)"
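
The split-and-apply step above is needed because counting parentheses over the whole string removes the last surplus ")" rather than the one that is actually unopened. As an alternative, here is a minimal sketch in base R that tracks nesting depth character by character and drops a ")" only when nothing is open at that point. The name drop_unopened is hypothetical and not part of any package, and the sketch assumes non-NA character input.

```r
# Hypothetical helper (not from the answers above): drop every ")" that has
# no unmatched "(" to its left, by tracking nesting depth per character.
drop_unopened <- function(x) {
  vapply(x, function(s) {
    chars <- strsplit(s, "", fixed = TRUE)[[1]]
    keep <- rep(TRUE, length(chars))
    depth <- 0L                       # number of currently open "("
    for (i in seq_along(chars)) {
      if (chars[i] == "(") {
        depth <- depth + 1L
      } else if (chars[i] == ")") {
        if (depth > 0L) {
          depth <- depth - 1L         # closes an open "("
        } else {
          keep[i] <- FALSE            # nothing open: mark for removal
        }
      }
    }
    paste(chars[keep], collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

additional <- "1543-(2, 6), 1548-(2), 1549), 1334-(1~5),  1401-(15, 145)"
drop_unopened(additional)
#[1] "1543-(2, 6), 1548-(2), 1549, 1334-(1~5),  1401-(15, 145)"
drop_unopened(c("1548-(2), 1549)", "1401-(15~145), 153), 156"))
#[1] "1548-(2), 1549"          "1401-(15~145), 153, 156"
```

Because the scan is positional, it needs no limit on how far back the opening parenthesis may be, and it is vectorised over the input via vapply.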

Upvotes: 1

LC-datascientist

Reputation: 2096

Here is another solution using the stringr package:

library(stringr)
fun <- function(x) {
  open_cnt = str_count(x, "\\(")                      # counts opening parentheses
  close_cnt = str_count(x, "\\)")                     # counts closing parentheses
  if(close_cnt > open_cnt){ 
    diff_i = (open_cnt+1):close_cnt                         # indexes differences
    diff_loc = str_locate_all(x, "\\)")[[1]][diff_i,1]      # locates differences
    paste0(unlist(str_split(x,""))[-diff_loc], collapse="") # rebuilds string
  } else {x}
}
fun(string1)
#[1] "1548-(2), 1549"
fun(string2)
#[1] "1401-(15~145), 153, 156"
fun("((( 0__0 ))))))))))")
#[1] "((( 0__0 )))"

Upvotes: 1
