Kjobber

Reputation: 184

Regex to replace complete string containing a given substring

I have the following string,

str = '''
```
echo hello
```
THis works just fine . I love it 
![](/media/k-products/bf57d173-c1d7-44c4-b1fe-400c5f7e4e8c/35b17c4b-7f41-4552-8f9b-e883ccdc2c00.png)


| Column 1 | Column 2 | Column 3 |
| -------- | -------- | -------- |
| Text     | Text      | ![](/media/k-products/bf57d173-c1d7-44c4-b1fe-400c5f7e4e8c/35b17c4b-7f41-4552-8f9b-e883ccdc2c00.png)    |
ss

Hello world!

living on love
'''

Given the substring 35b17c4b-7f41-4552-8f9b-e883ccdc2c00, I would like to replace every enclosing string that contains it.

The result should look something like this:

str = '''
```
echo hello
```
THis works just fine . I love it 


| Column 1 | Column 2 | Column 3 |
| -------- | -------- | -------- |
| Text     | Text      |    |
ss

Hello world!

living on love
'''

So the replacement should remove the complete string ![](/media/k-products/bf57d173-c1d7-44c4-b1fe-400c5f7e4e8c/35b17c4b-7f41-4552-8f9b-e883ccdc2c00.png) that contains the substring 35b17c4b-7f41-4552-8f9b-e883ccdc2c00. This is what I have tried:

print(re.sub(r'\b(35b17c4b-7f41-4552-8f9b-e883ccdc2c00\w*)', ' ', str)) 

Upvotes: 2

Views: 72

Answers (1)

Woody1193

Reputation: 7980

The issue with your regex is that it doesn't match the characters that come before or after the value you want:

re.sub(r'.*35b17c4b\-7f41\-4552\-8f9b\-e883ccdc2c00.*', ' ', str)

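In regular expressions, . matches any single character except a newline, and * means "zero or more of the preceding element". So .* on each side extends the match to the rest of the line, and this code replaces every line containing 35b17c4b-7f41-4552-8f9b-e883ccdc2c00 with ' '. You don't have to worry about the match running across lines, because . does not match newlines by default.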
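As a minimal sketch of that behaviour, the snippet below runs the same kind of pattern on a shortened stand-in for the question's text. The sample data and the names TARGET, text and line_pattern are made up for illustration, and text is used instead of str so the built-in is not shadowed.

```python
import re

TARGET = "35b17c4b-7f41-4552-8f9b-e883ccdc2c00"

# Shortened stand-in for the question's text (hypothetical sample data).
text = (
    "I love it\n"
    "![](/media/35b17c4b-7f41-4552-8f9b-e883ccdc2c00.png)\n"
    "| Text | ![](/media/35b17c4b-7f41-4552-8f9b-e883ccdc2c00.png) |\n"
    "Hello world!\n"
)

# re.escape() escapes the literal part; .* extends the match to the line edges.
line_pattern = ".*" + re.escape(TARGET) + ".*"
result = re.sub(line_pattern, " ", text)

print(repr(result))  # 'I love it\n \n \nHello world!\n'
```

Both lines that contain the value are replaced in full, including the whole table row, while the newlines themselves are left alone.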

However, this doesn't handle your second use case, where only the value between the | characters should be removed rather than the entire line. So, we need to modify this regex to handle this case:

re.sub(r'[^|\s]*35b17c4b\-7f41\-4552\-8f9b\-e883ccdc2c00[^|\s]*', ' ', str)

This regular expression modifies the previous one by replacing .* with [^|\s]*. Instead of matching any character, it matches only characters that are neither | nor whitespace. The \s is needed because [^|] on its own also matches spaces and newlines, which you want excluded for your use case.
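A small sketch of how this could be wrapped up, assuming a hypothetical helper named replace_tokens and a shortened sample text (neither is from the question). The replacement defaults to ' ' as above; passing '' drops the match without leaving a space behind.

```python
import re

def replace_tokens(text, target, replacement=" "):
    """Replace each run of non-pipe, non-whitespace characters containing target."""
    pattern = r"[^|\s]*" + re.escape(target) + r"[^|\s]*"
    return re.sub(pattern, replacement, text)

TARGET = "35b17c4b-7f41-4552-8f9b-e883ccdc2c00"
IMAGE = "![](/media/k-products/" + TARGET + ".png)"

# Hypothetical sample: one image on its own line, one inside a table cell.
text = (
    "I love it\n"
    + IMAGE + "\n"
    + "| Text | Text | " + IMAGE + " |\n"
    + "Hello world!\n"
)

cleaned = replace_tokens(text, TARGET)
print(cleaned)
```

Here the table row keeps its | separators and only the image reference inside the cell is replaced.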

Upvotes: 2
