Reputation: 1243
I would like to find expressions like "John and Jane Doe" and replace them with "John Doe and Jane Doe".
For example, given the sample string:
regextest = 'Heather Robinson, Jane and John Smith, Kiwan and Nichols Brady John, Jimmy Nichols, Melanie Carbone, and Nancy Brown'
I can find the expression and replace it with a fixed string, but I am not able to replace it with a modified version of the matched text.
re.sub(r'[a-zA-Z]+\s*and\s*[a-zA-Z]+.[^,]*', "kittens", regextest)
Output: 'Heather Robinson, kittens, kittens, Jimmy Nichols, Melanie Carbone, and Nancy Brown'
I think that instead of a string ("kittens") we can pass a function that builds the replacement, but I have not been able to write that function. My attempt below raises an error.
def re_couple_name_and(m):
    return f'*{m.group(0).split()[0]+m.group(0).split()[-1:]+ m.group(0).split()[1:]}'
re.sub(r'[a-zA-Z]+\s*and\s*[a-zA-Z]+.[^,]*', re_couple_name_and, regextest)
Upvotes: 0
Views: 105
Reputation: 2303
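You can use the regex below to capture the parts that need to be rearranged, then use re.sub() with backreferences to build the new string. Group 4 (the shared surname) is emitted twice, once after each given name.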
(\w+)( +and +)(\w+)( +[^,]*)
Example
import re
text = "Heather Robinson, Jane and John Smith, Kiwan and Nichols Brady John, Jimmy Nichols, Melanie Carbone, and Nancy Brown"
print(re.sub(r"(\w+)( +and +)(\w+)( +[^,]*)", r"\1\4\2\3\4", text))
Output
Heather Robinson, Jane Smith and John Smith, Kiwan Brady John and Nichols Brady John, Jimmy Nichols, Melanie Carbone, and Nancy Brown
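The numbered backreferences above can be hard to read at a glance. As a rough sketch of the same idea with named groups (the group names and the `expand_couples` helper are made up here for illustration, and a `\b` word boundary is added as an extra assumption), it might look like this:

```python
import re

# Named groups: first given name, the " and " joiner, second given name,
# and the shared surname (everything up to the next comma).
# The group names are illustrative, not part of the original answer.
COUPLE_RE = re.compile(
    r"\b(?P<first>\w+)(?P<joiner> +and +)(?P<second>\w+)(?P<surname> +[^,]*)"
)

def expand_couples(text):
    """Repeat the shared surname after the first given name."""
    return COUPLE_RE.sub(
        r"\g<first>\g<surname>\g<joiner>\g<second>\g<surname>", text
    )

print(expand_couples("Heather Robinson, Jane and John Smith, and Nancy Brown"))
```

Because the surname group requires at least one space followed by text, a pair with no trailing surname (for example "Jane and John") is left unchanged.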
Upvotes: 1
Reputation: 29742
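If I understand correctly, one way is to use capture groups and a replacement function:
def re_couple_name_and(m):
    # Group 3 is "<second given name> <surname>"; keep everything after the first space.
    family_name = m.group(3).split(" ", 1)[1]
    return "%s %s" % (m.group(1), family_name) + m.group(2) + m.group(3)
re.sub(r'([a-zA-Z]+)(\s*and\s*)([a-zA-Z]+.[^,]*)', re_couple_name_and, regextest)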
Output:
'Heather Robinson, Jane Smith and John Smith, Kiwan Brady John and Nichols Brady John, Jimmy Nichols, Melanie Carbone, and Nancy Brown'
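Two edge cases may be worth guarding against with the callback approach: because the pattern uses `\s*and\s*`, it can also match the letters "and" inside a single word such as "Sandy", and `split(" ", 1)[1]` raises an IndexError if group 3 contains no space. The following is only a sketch of one possible hardening, assuming surnames consist of letters only; `PATTERN`, `add_surname`, and `expand` are hypothetical names, not part of the answer above:

```python
import re

# Require real whitespace around "and", and capture the surname as its own
# (possibly empty) group instead of splitting it out of group 3.
# Assumption: names are made of ASCII letters only.
PATTERN = re.compile(r"([a-zA-Z]+)(\s+and\s+)([a-zA-Z]+)((?:\s+[a-zA-Z]+)*)")

def add_surname(m):
    first, joiner, second, surname = m.groups()
    if not surname:
        # No shared surname follows, so leave the match untouched.
        return m.group(0)
    return f"{first}{surname}{joiner}{second}{surname}"

def expand(text):
    return PATTERN.sub(add_surname, text)

print(expand("Jane and John Smith, Tom and Jerry, Sandy Brown"))
```

Capturing the surname separately keeps the callback free of string splitting, at the cost of a slightly longer pattern.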
Upvotes: 1