Reputation: 119
I am working on a problem from Automate the Boring Stuff, trying to imitate the strip() method using regex. I have mostly figured it out: it works with whitespace and with a specific word I want removed. But when I remove a specific keyword from the end of a string, it always cuts off the last letter of what remains. Can anyone help me figure out why?
import re

def strip_func(string, *args):
    strip_regex = re.compile(r'^(\s+)(.*?)(\s+)$')
    mo = strip_regex.findall(string)
    if not mo:
        rem = args[0]
        remove_regex = re.compile(rf'({rem})+(.*)[^{rem}]')
        remove_mo = remove_regex.findall(string)
        print(remove_mo[0][1])
    else:
        print(mo[0][1])
So if no second argument is passed, the function deletes whitespace from either side of the string. I used this string to test that:
s = ' This is a string with whitespace on either side '
Otherwise it deletes the keyword, much like the strip() method. E.g.:
spam = 'SpamSpamBaconSpamEggsSpamSpam'
strip_func(spam, 'Spam')
Output:
BaconSpamEgg
So the 's' at the end of Eggs is missing, and the same thing happens with every string I try. Thanks in advance for the help.
Upvotes: 2
Views: 134
Reputation: 626950
You may use
import re
def strip_func(string, *args):
    return re.sub(rf'^(?:{re.escape(args[0])})+(.*?)(?:{re.escape(args[0])})+$', r'\1', string, flags=re.S)
spam = 'SpamSpamBaconSpamEggsSpamSpam'
print(strip_func(spam, 'Spam'))
See the Python demo. The ^(?:{re.escape(args[0])})+(.*?)(?:{re.escape(args[0])})+$ f-string produces a pattern like ^(?:Spam)+(.*?)(?:Spam)+$ (re.escape ensures any special characters in the keyword are matched literally), which matches:
^ - start of string
(?:Spam)+ - one or more occurrences of Spam at the start of the string
(.*?) - Group 1: any 0 or more chars, as few as possible
(?:Spam)+ - one or more occurrences of Spam at the end of the string
$ - end of string.
The flags=re.S argument makes . match line break chars, too.
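As for why the original code loses the final letter: [^Spam] is a character class, so it matches one character that is not S, p, a, or m, rather than "anything but the word Spam". The greedy (.*) backtracks until that class can match, and the first character it finds from the end is the lowercase s of Eggs, which is consumed outside the captured group. The sketch below reproduces this and then shows one possible variant, not the only way to do it. The name strip_chars_or_word and its word parameter are made up for illustration, and it assumes that omitting the keyword should strip whitespace, as str.strip() does.

```python
import re

# [^Spam] means "one char that is not S, p, a or m", so it eats the
# lowercase "s" of "Eggs" and leaves it out of the captured group.
original = re.compile(r'(Spam)+(.*)[^Spam]')
print(original.findall('SpamSpamBaconSpamEggsSpamSpam')[0][1])  # BaconSpamEgg


def strip_chars_or_word(string, word=None):
    """Hypothetical variant: strip whitespace by default, else repeats of `word`."""
    # Assumption: with no keyword, behave like str.strip() with no argument.
    unit = r'\s' if word is None else re.escape(word)
    # Both ends use * instead of +, so a keyword on only one side
    # (or on neither side) is still handled.
    pattern = rf'^(?:{unit})*(.*?)(?:{unit})*$'
    return re.sub(pattern, r'\1', string, flags=re.S)


print(strip_chars_or_word('SpamSpamBaconSpamEggsSpamSpam', 'Spam'))  # BaconSpamEggs
print(strip_chars_or_word('  padded text  '))  # padded text
```

Using * rather than + at both ends is a design choice: the pattern in the answer above only substitutes when the keyword appears on both sides, while this variant also trims a string such as 'EggsSpam'.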
Upvotes: 2