MatGut

Reputation: 35

find substring from list - python

I have a list with elements I would like to remove from a string:

Example

substrings = ['345', 'DEF', 'QWERTY']
my_string = '12345XYZDEFABCQWERTY'

Is there a way to iterate over the list and find where its elements occur in the string? My final goal is to remove those elements from the string (I don't know if this is the proper way, since strings are immutable).

Upvotes: 0

Views: 126

Answers (1)

Eric Duminil

Reputation: 54213

You could use a regex union:

import re

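def delete_substrings_from_string(substrings, text):
    # Escape each substring, then join them into one alternation: a|b|c
    pattern = re.compile('|'.join(map(re.escape, substrings)))
    # Delete every match in a single pass over the text
    return pattern.sub('', text)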

print(delete_substrings_from_string(['345', 'DEF', 'QWERTY'], '12345XYZDEFABCQWERTY'))
# 12XYZABC
print(delete_substrings_from_string(['AA', 'ZZ'], 'ZAAZ'))
# ZZ

It uses re.escape so that any regex metacharacters in the substrings are matched literally instead of being interpreted as regex syntax.
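
To see why the escaping matters, here is a small sketch comparing an unescaped and an escaped pattern. The substring 'a.c' and the sample text are made up for illustration and do not come from the question.

```python
import re

# Hypothetical inputs chosen for illustration: the substring contains '.',
# which is a regex metacharacter meaning "any character".
substring = 'a.c'
text = 'abc a.c'

# Unescaped: '.' matches any character, so 'abc' is deleted as well.
unescaped_result = re.sub(substring, '', text)

# Escaped: '.' only matches a literal dot, so only 'a.c' is deleted.
escaped_result = re.sub(re.escape(substring), '', text)

print(repr(unescaped_result))  # ' '
print(repr(escaped_result))    # 'abc '
```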

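The substitution makes a single pass over the text, so it should be reasonably fast. It also leaves alone any text that only becomes a match after an earlier deletion: in the second example, removing 'AA' leaves 'ZZ', which is not removed in turn, so the result isn't an empty string.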
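
For comparison, here is a sketch of what happens if the substrings are removed one after another with str.replace. The helper names delete_sequentially and delete_in_one_pass are hypothetical, introduced only for this comparison.

```python
import re

def delete_sequentially(substrings, text):
    # Hypothetical helper: removes each substring in turn with str.replace,
    # so later replacements see the result of earlier ones.
    for substring in substrings:
        text = text.replace(substring, '')
    return text

def delete_in_one_pass(substrings, text):
    # Same idea as the regex union above: one scan over the original text.
    pattern = re.compile('|'.join(map(re.escape, substrings)))
    return pattern.sub('', text)

print(repr(delete_sequentially(['AA', 'ZZ'], 'ZAAZ')))  # ''
print(repr(delete_in_one_pass(['AA', 'ZZ'], 'ZAAZ')))   # 'ZZ'
```

In the sequential version, deleting 'AA' first turns 'ZAAZ' into 'ZZ', which the next replacement then deletes too.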

If you want a faster solution, you could build a Trie-based regex out of your substrings.
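
As a rough idea of what that could look like, here is a minimal sketch, not a tuned implementation. The names build_trie, trie_to_pattern and delete_substrings_with_trie are hypothetical. The idea is to factor out shared prefixes so that the regex engine has fewer alternatives to try at each position.

```python
import re

def build_trie(substrings):
    # Nested dicts keyed by character; the '' key marks the end of a substring.
    trie = {}
    for substring in substrings:
        node = trie
        for char in substring:
            node = node.setdefault(char, {})
        node[''] = True
    return trie

def trie_to_pattern(node):
    # Turn a trie node into a regex fragment matching every suffix below it.
    is_end = '' in node
    branches = [re.escape(char) + trie_to_pattern(node[char])
                for char in sorted(node) if char != '']
    if not branches:
        return ''
    if len(branches) == 1 and not is_end:
        return branches[0]
    group = '(?:' + '|'.join(branches) + ')'
    # A substring may end here while longer ones continue, so the rest is optional.
    return group + '?' if is_end else group

def delete_substrings_with_trie(substrings, text):
    if not substrings:
        return text
    pattern = re.compile(trie_to_pattern(build_trie(substrings)))
    return pattern.sub('', text)

print(trie_to_pattern(build_trie(['ABC', 'ABD'])))
# AB(?:C|D)
print(delete_substrings_with_trie(['345', 'DEF', 'QWERTY'], '12345XYZDEFABCQWERTY'))
# 12XYZABC
```

One difference from the plain union: when one substring is a prefix of another (say 'AB' and 'ABC'), this pattern prefers the longer match, whereas a plain alternation tries the alternatives in list order.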

Upvotes: 2
