eggman

Reputation: 423

Python regex, how to delete all matches from a string

I have a list of regex patterns.

rgx_list = ['pattern_1', 'pattern_2', 'pattern_3']

I am using a function that loops through the list, compiles each regex, and applies findall to grab the matched terms. I would then like a way of deleting those terms from the text.

def clean_text(rgx_list, text):
    matches = []
    for r in rgx_list:
        rgx = re.compile(r)
        found_matches = re.findall(rgx, text)
        matches.append(found_matches)

I want to do something like text.delete(matches) so that all of the matches will be deleted from the text and then I can return the cleansed text.

Does anyone know how to do this? My current code will only work for one match of each pattern, but the text may have more than one occurrence of the same pattern, and I would like to eliminate all matches.

Upvotes: 26

Views: 40250

Answers (2)

Matt S

Reputation: 15374

Use re.sub to replace matched patterns with an empty string. There is no need to find the matches separately first.

import re

def clean_text(rgx_list, text):
    new_text = text
    for rgx_match in rgx_list:
        # re.sub replaces every match of the pattern, not just the first
        new_text = re.sub(rgx_match, '', new_text)
    return new_text
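
As a quick check that this removes every occurrence rather than only the first, here is a small usage sketch. The two patterns and the sample string are made up for illustration and are not from the question.

```python
import re

def clean_text(rgx_list, text):
    new_text = text
    for rgx_match in rgx_list:
        # With the default count=0, re.sub replaces all non-overlapping matches
        new_text = re.sub(rgx_match, '', new_text)
    return new_text

# Hypothetical patterns: runs of digits, and the letter "b"
rgx_list = [r'\d+', r'b']
print(clean_text(rgx_list, 'a1b22c333b'))  # prints: ac
```

Note that the patterns are applied one after another, so a later pattern can also match text that only became adjacent after an earlier deletion.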

Upvotes: 37

fleaheap

Reputation: 166

For simple regexes you can OR the expressions together using "|". There are examples of combining regexes with OR on Stack Overflow.

For really complex regexes I would loop through the list instead. A single combined pattern built from complex regexes could be slow enough to cause timeouts.
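
A minimal sketch of the OR approach, assuming the patterns contain no numbered backreferences or inline flags that would change meaning once joined. The function name clean_text_combined and the sample patterns are hypothetical.

```python
import re

def clean_text_combined(rgx_list, text):
    # Wrap each pattern in a non-capturing group so "|" separates whole patterns
    combined = re.compile('|'.join('(?:{})'.format(p) for p in rgx_list))
    # One pass over the text removes every match of any pattern
    return combined.sub('', text)

# Hypothetical patterns: runs of digits, and the letter "b"
print(clean_text_combined([r'\d+', r'b'], 'a1b22c333b'))  # prints: ac
```

Because this makes a single pass, it can give a different result from the loop when deleting one match would create a new match for another pattern.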

Upvotes: 0
