Reputation: 8582
I am trying to remove a string that is in parentheses from a list in Python without success.
See the following code:
import re
full = ['webb', 'ellis', '(sportswear)']
regex = re.compile(r'\b\(.*\)\b')
filtered = [i for i in full if not regex.search(i)]
Returns:
['webb', 'ellis', '(sportswear)']
Could somebody point out my mistake?
Upvotes: 13
Views: 32715
Reputation: 499
For my use case, this worked. Maybe it will be useful to someone with the same problem.
import re
doc_list = dir(obj)  # obj is whatever object you want to inspect
regex = re.compile(r'^__\w*__$')
filtered = [ele for ele in doc_list if not regex.match(ele)]
Upvotes: 3
Reputation: 1862
>>> import re
>>> full = ['webb', 'ellis', '(sportswear)']
>>> # list() is needed on Python 3, where filter() returns an iterator
>>> x = list(filter(None, [re.sub(r".*\(.*\).*", r"", i) for i in full]))
>>> x
['webb', 'ellis']
Upvotes: 3
Reputation: 627488
The \b word boundary makes it impossible to match ( at the beginning of a string since there is no word there (i.e. \b requires a letter, digit or underscore to be right before ( in your pattern, and that is not the case).
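To see the boundary rule in action, here is a minimal sketch. The second sample string, 'a(sportswear)b', is a made-up value used only to show a case where word characters do surround the parentheses; it is not from the question.

```python
import re

# Original pattern from the question: \b on both sides of the parentheses.
with_boundaries = re.compile(r'\b\(.*\)\b')

# No word character before "(" or after ")", so neither \b can match.
no_word_chars = bool(with_boundaries.search('(sportswear)'))

# Hypothetical string: "a" precedes "(" and "b" follows ")", so both \b match.
surrounded = bool(with_boundaries.search('a(sportswear)b'))

print(no_word_chars)  # False
print(surrounded)     # True
```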
As you confirmed that you need to match values that are fully enclosed in (...), you need regex = re.compile(r'\(.*\)$') with re.match.
Use
import re
full = ['webb', 'ellis', '(sportswear)']
regex = re.compile(r'\(.*\)$')
filtered = [i for i in full if not regex.match(i)]
print(filtered)
See the IDEONE demo
re.match will anchor the match at the start of the string, and the $ will anchor the match at the end of the string.
Note that if your string has newlines in it, use flags=re.DOTALL when compiling the regex (so that . can also match newline characters).
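The effect of re.DOTALL can be sketched as follows. The multiline value is a hypothetical list item containing a newline, added only for illustration.

```python
import re

# Hypothetical parenthesized value that spans two lines.
multiline = '(sports\nwear)'

plain = re.compile(r'\(.*\)$')
dotall = re.compile(r'\(.*\)$', flags=re.DOTALL)

# Without DOTALL, "." stops at the newline, so the pattern cannot reach ")".
plain_hit = bool(plain.match(multiline))
# With DOTALL, "." also consumes the newline and the whole value matches.
dotall_hit = bool(dotall.match(multiline))

full = ['webb', 'ellis', multiline]
filtered = [i for i in full if not dotall.match(i)]

print(plain_hit)   # False
print(dotall_hit)  # True
print(filtered)    # ['webb', 'ellis']
```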
Upvotes: 12