Reputation: 147
I am trying to replace a regular expression match with a modified version of the matched text. The following is the column in my DataFrame.
df['newcolumn']
0 Ther was a quick brown appl_product_type in ("eds") where blah blan appl_Cust_type =("value","value")
1 Ther was a quick brown appl_product_type = ("EDS") where blah blan appl_Cust_type =("value","value")
2 Ther was a quick brown appl_product_type in ("eds") where blah b
3 Ther was a quick brown appl_product_type in = ("EDS") where blah blan appl_Cust_type = ("value")
4 Ther was a quick brown where blah blan appl_Cust_type
Name: newcolumn, dtype: object
I want to replace every occurrence of strings like 'appl_product_type = ("EDS")' with 'upper(appl_product_type) = ("EDS")'.
I am using the following code but I am getting an error:
newcolumn.replace(value='upper\[\w]+\s+[in=]+[\s+\([\"\w+\,+\s+]+\)', regex='[\w]+\s+[in=]+[\s+\([\"\w+\,+\s+]+\)')
error: bad escape \w at position 7
Is there a way to solve this? Please help.
Upvotes: 0
Views: 269
Reputation: 2802
A couple of things -
Use raw strings (r'') to make simpler regex strings.
I have a slightly clearer solution to what you attempted, but I am unsure whether it is exactly what you wanted, given the ambiguity in your question -
df['newcolumn'] = df['newcolumn'].replace({r'([\w_]+\s+(?:in|=|\s)+\(\"(?:\w+\"(?:\,)?(?:\s+)?)+\))' : r'upper(\1)'}, regex=True)
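As for the error itself: the `value` argument is a replacement template, not a pattern, so character classes such as `\w` have no meaning there and Python's `re` module rejects them, which is where "bad escape \w at position 7" comes from. Only backreferences like `\1` are useful in the replacement. If the goal is to wrap just the column name rather than the whole match, a minimal sketch with two capture groups could look like the following. The sample rows are shortened from the question, and the literal column name and the exact pattern are my assumptions about what is wanted, not a definitive solution.

```python
import re

import pandas as pd

# Sample rows shortened from the question (assumed representative).
df = pd.DataFrame({"newcolumn": [
    'Ther was a quick brown appl_product_type in ("eds") where blah',
    'Ther was a quick brown appl_product_type = ("EDS") where blah',
    'Ther was a quick brown appl_product_type in = ("EDS") where appl_Cust_type = ("value")',
    'Ther was a quick brown where blah blan appl_Cust_type',
]})

# A class like \w in the replacement template is rejected by re.
template_error = None
try:
    re.sub(r"\w+", r"upper\[\w]+", "abc")
except re.error as exc:
    template_error = str(exc)

# Group 1: the column name (hard-coded here as an assumption).
# Group 2: one or more "in" / "=" operators followed by a quoted, parenthesised list.
pattern = r'\b(appl_product_type)(\s*(?:(?:in|=)\s*)+\("[^)]*"\))'

# \1 and \2 in the replacement refer back to the two groups.
df["newcolumn"] = df["newcolumn"].str.replace(pattern, r"upper(\1)\2", regex=True)

print(df["newcolumn"].tolist())
```

Rows without `appl_product_type` followed by an operator and a parenthesised list are left unchanged, and `appl_Cust_type` is not touched because the name is matched literally.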
Upvotes: 1