Rex Low

Reputation: 2177

Regex remove symbols only on left and right side of string

I have a string that contains a lot of noise, and I only want to remove the symbols on the left and right ends of the string.

|«_ Date: 23/12/18 16:41 ($123) :}‘'

My current approach removes all of them, but it also removes the symbols in the middle of the string, which I do not want.

re.sub(r"[^a-zA-z0-9,./$ ' ' -]", "", s)

My ideal result would look like this:

Date: 23/12/18 16:41 ($123)

Upvotes: 0

Views: 1291

Answers (2)

Rex Low

Reputation: 2177

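The solution from @potato works great for my use case.

The relevant regex command is below. The first alternative strips any leading run of characters that are not letters, digits, or parentheses; the second strips the same kind of run at the end of the string.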

re.sub(r"(?i)^[^a-z\d()]*|[^a-z\d()]+$", "", s)
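For anyone who wants to try it, here is a minimal runnable sketch of that call applied to the string from the question. The helper name `strip_edge_symbols` is made up for illustration, and the sketch assumes letters, digits, and parentheses are the only characters that may legitimately start or end the cleaned string.

```python
import re

# Same pattern as above, compiled once.
#   ^[^a-z\d()]*   leading run of characters that are not letters, digits or parentheses
#   [^a-z\d()]+$   trailing run of the same kind
# (?i) makes a-z match upper-case letters too.
EDGE_NOISE = re.compile(r"(?i)^[^a-z\d()]*|[^a-z\d()]+$")


def strip_edge_symbols(s):
    """Hypothetical helper: remove noise symbols from both ends of s only."""
    return EDGE_NOISE.sub("", s)


noisy = "|«_ Date: 23/12/18 16:41 ($123) :}‘'"
print(strip_edge_symbols(noisy))  # Date: 23/12/18 16:41 ($123)
```

Symbols in the middle, such as `:`, `/`, and `$`, are left alone because both alternatives are anchored to an end of the string.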

Upvotes: 0

kerwei

Reputation: 1842

Here's a very rough pattern for now, with quite a lot of caveats. It handles your example string, but I can foresee a lot of false matches on other inputs. It's difficult to refine further without knowing what the strings look like in general.

^(.*?)(?:[0-9a-zA-Z].*?\))(.*?)$

The pattern above captures all leading characters as group 1 and all trailing characters after the closing parenthesis as group 2. However, if there is another pair of parentheses inside the valid string before the final ($123), the lazy match stops at the first closing parenthesis and everything after it ends up in group 2. Please see the example below.

Example: https://regex101.com/r/JikTHo/1
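The same behaviour can be checked in Python with a short sketch. The helper name `split_noise` and the second input string (with an extra pair of parentheses) are made up for illustration; the pattern itself is the one given above. Since the middle part sits in a non-capturing group, the sketch recovers it by slicing between the two captured groups.

```python
import re

PATTERN = re.compile(r"^(.*?)(?:[0-9a-zA-Z].*?\))(.*?)$")


def split_noise(s):
    """Hypothetical helper: return (leading, core, trailing), or None if no match."""
    m = PATTERN.match(s)
    if m is None:
        return None
    # The core is whatever lies between the end of group 1 and the start of group 2.
    core = s[m.end(1):m.start(2)]
    return m.group(1), core, m.group(2)


# String from the question: the core is recovered as intended.
print(split_noise("|«_ Date: 23/12/18 16:41 ($123) :}‘'"))
# ('|«_ ', 'Date: 23/12/18 16:41 ($123)', " :}‘'")

# Made-up input with an earlier pair of parentheses: the core is cut short.
print(split_noise("|«_ Date (UTC): 23/12/18 ($123) :}"))
# ('|«_ ', 'Date (UTC)', ': 23/12/18 ($123) :}')
```

The second call shows the caveat: the match ends at the first `)`, so the rest of the valid text is treated as trailing noise.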

Upvotes: 1
