Reputation: 107
I have a list of names that have the title Dr. in the wrong place. Therefore I would like to remove Dr., or Dr. from the middle of each string and add Dr. to the start of the corresponding strings. My result is rather disappointing. Is re.sub() even the right approach?
import re

names = ['Johnson, Dr., PWE', 'Peterson, FDR', 'Gaber, Dr. GTZ']
for idx, item in enumerate(names):
    names[idx] = re.sub(r' Dr.(,)? ', ' Dr. ', item)
print(names)
['Johnson, Dr. PWE', 'Peterson, FDR', 'Gaber, Dr. GTZ']
desired_names = ['Dr. Johnson, PWE', 'Peterson, FDR', 'Dr. Gaber, GTZ']
Upvotes: 0
Views: 80
Reputation: 163467
You can use 2 capture groups and swap their order in the replacement, so that the title comes first.
([^,\n]+,\s*)(Dr\.),?\s*

- ([^,\n]+,\s*) Capture group 1: match one or more chars other than , or a newline, followed by a comma and optional whitespace chars
- (Dr\.) Capture group 2: match Dr. literally
- ,?\s* Match an optional comma and optional whitespace chars

Example
import re
names = ['Johnson, Dr., PWE', 'Peterson, FDR', 'Gaber, Dr. GTZ']
for idx, item in enumerate(names):
    names[idx] = re.sub(r'([^,\n]+,\s*)(Dr\.),?\s*', r'\2 \1', item)
print(names)
Output
['Dr. Johnson, PWE', 'Peterson, FDR', 'Dr. Gaber, GTZ']
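
If the same fix is needed in several places, the pattern above could be compiled once and wrapped in a small function. This is only a sketch: the names TITLE_PATTERN and move_title_to_front are made up for illustration, and it assumes Dr. is the only title that has to be moved.

```python
import re

# Same pattern as in the answer: group 1 is the surname plus comma,
# group 2 is the title. The names below are hypothetical helpers.
TITLE_PATTERN = re.compile(r'([^,\n]+,\s*)(Dr\.),?\s*')


def move_title_to_front(name):
    """Return name with a misplaced 'Dr.' moved to the start, if present."""
    # \2 (the title) is emitted before \1 (surname and comma).
    return TITLE_PATTERN.sub(r'\2 \1', name)


names = ['Johnson, Dr., PWE', 'Peterson, FDR', 'Gaber, Dr. GTZ']
# Build a new list instead of modifying the original in place.
fixed_names = [move_title_to_front(name) for name in names]
print(fixed_names)
```

Strings without a misplaced Dr. do not match the pattern, so re.sub() returns them unchanged.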
Upvotes: 1