Reputation: 67
I need to replace a string containing a substring with another string. For example:
biography -> biography
biographical -> biography
biopic -> biography
bio-pic -> biography-pic
I watched a biographical movie -> I watched a biography movie
Here, all words on the left contain bio, so the whole word is replaced by biography. I am aware of the string.replace() function, but it doesn't seem to work well here. I looked up regular expressions, but I'm not sure if re is the right library to solve the problem.
Upvotes: 0
Views: 2709
Reputation: 1751
One possible solution:
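import re

def f(s, pat, replace):
    # Match every whole word that contains the substring; re.escape()
    # makes sure the substring is matched literally.
    pat = r'\w*%s\w*' % re.escape(pat)
    return re.sub(pat, replace, s)

# Named "text" rather than "input" to avoid shadowing the built-in.
text = """
biography -> biography
biographical -> biography
biopic -> biography
bio-pic -> biography-pic
I watched a biographical movie -> I watched a biography movie
"""
c = f(text, "bio", "biography")
print(c)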
Output:
biography -> biography
biography -> biography
biography -> biography
biography-pic -> biography-pic
I watched a biography movie -> I watched a biography movie
Upvotes: 0
Reputation: 472
Try a regular expression; it can solve this problem. You can adjust the pattern to fit your requirements. Here is some example code:
import re
s = "biography biographical biopic bio-pic I watched a biographical movie"
replaced = re.sub('(bio[A-Za-z]*)', 'biography', s)
print(replaced)
Upvotes: 0
Reputation: 624
import re
search_string = 'bio'
replace_string = 'biography'
vals = ['biography', 'biographical', 'biopic', 'bio-pic', 'something else', 'bio pic', 'I watched a biographical movie']
altered = [re.sub(re.escape(search_string)+r'\w*',replace_string,val) for val in vals]
print(altered)
Output:
['biography', 'biography', 'biography', 'biography-pic', 'something else', 'biography pic', 'I watched a biography movie']
For the regex part, re.escape() escapes any regex metacharacters in the search string so that it is matched literally. I assumed your 'bio' search string will not be constant. The rest of the pattern, \w*, matches zero or more (the *) of the preceding element, \w, which matches word characters (a-z, A-Z, 0-9, and _; for str patterns in Python 3 it also matches Unicode letters and digits by default). Since only word characters are matched, the match stops at the first non-word character, such as a space or a hyphen.
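To illustrate why re.escape() matters once the search string is not a fixed word like 'bio', here is a minimal sketch. The helper name replace_words_starting_with, its escape flag, and the sample text are hypothetical and only there for the comparison; they are not part of the answer's code.

```python
import re

def replace_words_starting_with(text, search, replacement, escape=True):
    """Replace `search` plus any trailing word characters with `replacement`.

    Hypothetical helper for illustration; `escape` toggles re.escape().
    """
    prefix = re.escape(search) if escape else search
    # A function replacement inserts `replacement` verbatim, so backslashes
    # in it are not interpreted as group references.
    return re.sub(prefix + r"\w*", lambda m: replacement, text)

text = "v1.0beta and v1x0beta"
escaped = replace_words_starting_with(text, "v1.0", "release")
unescaped = replace_words_starting_with(text, "v1.0", "release", escape=False)
print(escaped)    # release and v1x0beta
print(unescaped)  # release and release
```

Without escaping, the dot in "v1.0" acts as a wildcard and also matches "v1x0beta", which is probably not what you want when the search string comes from user input.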
Upvotes: 0
Reputation: 17156
Using Regex
import re
s = """
biography -> biography
biographical -> biography
biopic -> biography
bio-pic -> biography-pic
I watched a biographical movie -> I watched a biography movie
"""
x = re.sub(r'\b(bio\w*)', 'biography', s)
print(x)
Output
biography -> biography
biography -> biography
biography -> biography
biography-pic -> biography-pic
I watched a biography movie -> I watched a biography movie
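The patterns used in this thread differ in how they treat a word that has "bio" in the middle rather than at the start. A small comparison, assuming a sample word ("symbiosis") that is not in the question, might look like this:

```python
import re

text = "symbiosis and biopic"

# No anchor: matching starts wherever "bio" occurs, even mid-word.
anywhere = re.sub(r"bio\w*", "biography", text)

# \b anchor: only words that begin with "bio" are replaced.
word_start = re.sub(r"\bbio\w*", "biography", text)

# \w* on both sides: any whole word containing "bio" is replaced.
whole_word = re.sub(r"\w*bio\w*", "biography", text)

print(anywhere)    # symbiography and biography
print(word_start)  # symbiosis and biography
print(whole_word)  # biography and biography
```

Which variant is right depends on whether "contains bio" in the question means anywhere in the word or only at the beginning; the examples given there do not distinguish the two.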
Upvotes: 1