Dennis

Reputation: 124

Is there an easy way to remove unnecessary whitespace inside brackets that are in the middle of a string in Python?

I have strings of the form:

s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors."

and I would like to get a cleaned string in the form of:

s = "Wow that is really nice, (2.1) shows that according to the drawings in (1.1) and a) there are errors."

I tried to fix it with regex:

import re

regex = r" (?=[^(]*\))"
s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are some errors."
re.sub(regex, "", s)

But I get faulty results like this: Wow that is really nice, (2.1) shows that according to the drawings in (1.1)anda) there are some errors.

Does anyone know how to deal with this problem when you don't always have the same number of opening and closing brackets?

Upvotes: 0

Views: 95

Answers (5)

The fourth bird

Reputation: 163362

If you also want to match balanced parentheses and remove the spaces, you can use the regex module from PyPI with a recursive pattern:

\([^)(]*+(?:(?R)[^)(]*)*+\)

Note that this removes all spaces inside each matched pair of parentheses.

import regex

pattern = r"\([^)(]*+(?:(?R)[^)(]*)*+\)"

s = ("Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors.\n"
"Wow that is really nice, ( 2.1 (2.1 ( 1,3 ) ) )shows that according to the drawings in ( 1. 1) and a) there are errors.")

print(regex.sub(pattern, lambda m: m[0].replace(" ", ""), s))

Output

Wow that is really nice, (2.1) shows that according to the drawings in (1.1) and a) there are errors.
Wow that is really nice, (2.1(2.1(1,3)))shows that according to the drawings in (1.1) and a) there are errors.

To remove only the spaces directly after a ( and directly before a ):

import regex

pattern = r"\([^)(]*+(?:(?R)[^)(]*)*+\)"

s = "Wow that is really nice, ( test in 2.1 (2.1 test( 1,3 test ) ) )shows that according to the drawings in ( 1. 1) and a) there are errors."

print(regex.sub(pattern, lambda m: regex.sub(r"(?<=\() +| +(?=\))", "", m[0]), s))

Output

Wow that is really nice, (test in 2.1 (2.1 test(1,3 test)))shows that according to the drawings in (1. 1) and a) there are errors.

Upvotes: 0

Sanjay Manohar

Reputation: 7026

Try:

 r" (?=[^()]*\))"

This additionally excludes ')' from the characters that may appear between the space and the closing parenthesis.

Whether this works depends on whether you have nested brackets in your text.

Nested brackets cannot be handled by a plain regular expression; you need a parser (it may need to count the brackets).
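The bracket-counting idea can be sketched roughly as follows. This is only an illustration, not the answerer's code: the function name strip_spaces_in_brackets is hypothetical, and it assumes that only '(' and ')' count as brackets and that only plain spaces should be removed. It also covers a case a lookahead alone cannot: a lookahead cannot tell whether a closing bracket has a matching opener, so the pattern above still strips the spaces before an unmatched "a)".

```python
def strip_spaces_in_brackets(text: str) -> str:
    """Remove spaces inside matched (...) pairs; leave unmatched brackets alone."""
    open_positions = []  # stack of indices of '(' that are not yet closed
    spans = []           # (start, end) indices of every matched pair
    for i, ch in enumerate(text):
        if ch == "(":
            open_positions.append(i)
        elif ch == ")" and open_positions:
            spans.append((open_positions.pop(), i))
        # A ')' seen while the stack is empty has no opener and is ignored.

    # Mark every index that lies inside at least one matched pair.
    inside = [False] * len(text)
    for start, end in spans:
        for i in range(start, end + 1):
            inside[i] = True

    # Drop spaces only where they fall inside a matched pair.
    return "".join(
        ch for i, ch in enumerate(text) if not (inside[i] and ch == " ")
    )


s = ("Wow that is really nice, ( 2.1 ) shows that according to the "
     "drawings in ( 1. 1) and a) there are errors.")
print(strip_spaces_in_brackets(s))
```

Because pairs are found with a stack, nested brackets are handled as well, and an opening bracket that is never closed leaves the text after it untouched.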

Upvotes: 0

matszwecja

Reputation: 7970

You can match all the innermost parentheses with a simple regex, and then perform a substitution on the matches to remove the spaces.

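import re

s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors."
# Match an innermost pair: "(", then anything except parentheses, then ")".
regex = r"\([^()]*\)"
# Strip the spaces from each match only; m is the match object.
res = re.sub(regex, lambda m: m[0].replace(" ", ""), s)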

print(res)

Upvotes: 1

Unknown _

Reputation: 17

I am not sure this covers every case, but you can try the following:

s = s.replace('( ','(')
s = s.replace(' )',')')

Here replace(old, new) is a standard string method that replaces every occurrence of the old substring with the new one. I hope it helps.

Upvotes: 2

user3468054

Reputation: 610

If the only whitespace you want to remove is the whitespace directly after an opening bracket (or directly before a closing one), then a simple string replace might work:

>>> s.replace("( ", "(").replace(" )", ")")
'Wow that is really nice, (2.1) shows that according to the drawings in (1. 1) and a) there are errors.'
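If there can be more than one space, or tabs, next to a bracket, the same idea can be written with re.sub. This is a hedged sketch rather than part of the answer above: the helper name tighten_brackets is made up for illustration, and like the plain replace it does not check whether a bracket is matched.

```python
import re


def tighten_brackets(text: str) -> str:
    """Remove runs of whitespace directly after '(' and directly before ')'."""
    text = re.sub(r"\(\s+", "(", text)   # "(   x" -> "(x"
    text = re.sub(r"\s+\)", ")", text)   # "x   )" -> "x)"
    return text


print(tighten_brackets("see (  2.1 ) and ( 1. 1\t)"))
```

Whitespace in the middle of the brackets, such as the space in "1. 1", is left as it is.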

Upvotes: 1
