Reputation: 2177
I have a string that contains a lot of noise, and I only want to remove the symbols at the left and right ends of the string.
|«_ Date: 23/12/18 16:41 ($123) :}‘'
My current approach removes all of them, but it also removes the symbols in the middle of the string, which I do not want.
re.sub(r"[^a-zA-Z0-9,./$ ' ' -]", "", s)
My ideal result would look like this:
Date: 23/12/18 16:41 ($123)
Upvotes: 0
Views: 1291
Reputation: 2177
The solution from @potato works great for my use case.
The relevant regex call is:
re.sub(r"(?i)^[^a-z\d()]*|[^a-z\d()]+$", "", s)
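For anyone who wants to try it, here is a minimal sketch that wraps the call above in a function and runs it on the sample string from the question. The names `EDGE_NOISE` and `strip_edge_noise` are my own and purely illustrative, and I pass `re.IGNORECASE` as a flag instead of the inline `(?i)`, which should behave the same way.

```python
import re

# Same pattern as above: the first alternative eats junk at the start,
# the second eats junk at the end. Letters, digits and parentheses stop it.
EDGE_NOISE = re.compile(r"^[^a-z\d()]*|[^a-z\d()]+$", re.IGNORECASE)


def strip_edge_noise(s: str) -> str:
    """Remove leading/trailing symbols; leave everything in between alone."""
    return EDGE_NOISE.sub("", s)


noisy = "|«_ Date: 23/12/18 16:41 ($123) :}‘'"
print(strip_edge_noise(noisy))
```

Because both alternatives are anchored (`^` and `$`), symbols such as `:`, `/` and `$` in the middle of the string are never touched. Note that a string whose valid part starts or ends with some other symbol (for example a leading `$`) would lose that symbol, so the character class may need adjusting for other inputs.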
Upvotes: 0
Reputation: 1842
Here's a very rough pattern for now, with quite a lot of caveats. It works on your example string, but I expect it will produce false matches on other inputs. It's difficult to refine further without knowing what your strings look like in general.
^(.*?)(?:[0-9a-zA-Z].*?\))(.*?)$
The pattern above captures all leading characters as group 1 and all trailing characters after the closing parenthesis as group 2. However, if there's another pair of parentheses within the valid string before the final ($123), the match stops at the first closing parenthesis and the result will be wrong. Please see the example below.
Example: https://regex101.com/r/JikTHo/1
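In case it helps, here is a rough Python sketch of how the two groups could be used to keep only the middle part, and of the parenthesis caveat. The helper name `keep_middle` and the second test string are made up for illustration; they are not from the question.

```python
import re

# Group 1 = leading junk, group 2 = everything after the first ")" that
# follows the first alphanumeric character.
PATTERN = re.compile(r"^(.*?)(?:[0-9a-zA-Z].*?\))(.*?)$")


def keep_middle(s: str) -> str:
    """Return the text between group 1 and group 2, or s if nothing matches."""
    m = PATTERN.match(s)
    if m is None:
        return s  # e.g. no closing parenthesis in the string
    return s[m.end(1):m.start(2)]


sample = "|«_ Date: 23/12/18 16:41 ($123) :}‘'"
print(keep_middle(sample))

# Caveat: an earlier pair of parentheses cuts the result short.
tricky = "|_ Date (UTC): 23/12/18 ($123) :}"
print(keep_middle(tricky))
```

The second call returns only `Date (UTC)`, because the lazy `.*?\)` stops at the first closing parenthesis it finds.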
Upvotes: 1