Reputation: 99
I was trying to figure this out on my own, but I am now getting frustrated, so I wanted to reach out to StackXers. I am a beginner Python developer learning regular expressions through the Automate the Boring Stuff Udemy course.
As for my problem, I am trying to use regular expressions to create this target string:
target_string = '12 drummers drumming, 11 pipers piping, 10 lords a leaping, 9 ladies dancing, 8 maids a milking, 7 swans a swimming, 6 geese a laying, 5 golden rings, 4 calling birds, 3 french hens, 2 turtle doves, and a partridge in a pear tree'
The original string (copied from metrolyrics) is
original_string = '''12 Drummers Drumming 11 Pipers Piping 10 Lords a
Leaping 9 Ladies Dancing 8 Maids a Milking 7 Swans a Swimming 6 Geese a
Laying 5 Golden Rings 4 Calling Birds 3 French Hens 2 Turtle Doves and a
Partridge in a Pear Tree'''
My code is as follows
import re
strings = '''12 Drummers Drumming 11 Pipers Piping 10 Lords a Leaping 9
Ladies Dancing 8 Maids a Milking 7 Swans a Swimming 6 Geese a Laying 5
Golden Rings 4 Calling Birds 3 French Hens 2 Turtle Doves and a Partridge in
a Pear Tree'''
lyrics = strings.split()
xmasRegex = re.compile(r'\d+\s\D+\s([a-zA-Z]+)')
re.sub(r'\1,',strings)
This only returns the rhymed words (with the unintended inclusion of "Tree" and exclusion of "Doves") with commas at the end, but I am attempting to replace these words (including "Doves") and put them back in the string using this method as seen in the target string. Although it is possible to do this with a for loop and some tinkering, I wanted to do it the regex way.
What am I doing wrong with the sub method and/or regex object?
Upvotes: 1
Views: 85
Reputation: 698
This reproduces the whole target string, including the comma before the "and", in one pass.
In [34]: target_string
Out[34]: '12 drummers drumming, 11 pipers piping, 10 lords a leaping, 9 ladies dancing, 8 maids a milking, 7 swans a swimming, 6 geese a laying, 5 golden rings, 4 calling birds, 3 french hens, 2 turtle doves, and a partridge in a pear tree'
In [35]: original_strings
Out[35]: '12 Drummers Drumming 11 Pipers Piping 10 Lords a Leaping 9 Ladies Dancing 8 Maids a Milking 7 Swans a Swimming 6 Geese a Laying 5 Golden Rings 4 Calling Birds 3 French Hens 2 Turtle Doves and a Partridge in a Pear Tree'
In [36]: replaced_strings = re.sub(r'(\s\d+|\sand)', r',\1', original_strings).lower()
In [37]: target_string == replaced_strings
Out[37]: True
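As a standalone script, the same idea might look like the sketch below. It assumes the lyrics are on a single line separated by single spaces, as in the session above; the names `pattern` and `add_commas` are illustrative and not part of the original post. It also shows the argument order that trips up the call in the question: a compiled pattern's `sub` takes the replacement and the string, while module-level `re.sub` needs the pattern as its first argument, so `re.sub(r'\1,', strings)` is missing one argument.

```python
import re

original_string = (
    "12 Drummers Drumming 11 Pipers Piping 10 Lords a Leaping "
    "9 Ladies Dancing 8 Maids a Milking 7 Swans a Swimming "
    "6 Geese a Laying 5 Golden Rings 4 Calling Birds 3 French Hens "
    "2 Turtle Doves and a Partridge in a Pear Tree"
)

# Match whitespace followed by a number, or whitespace followed by "and".
pattern = re.compile(r'(\s\d+|\sand)')


def add_commas(text):
    """Insert a comma before each match, then lowercase the whole string."""
    # Compiled patterns use sub(repl, string);
    # the module-level function uses re.sub(pattern, repl, string).
    return pattern.sub(r',\1', text).lower()


result = add_commas(original_string)
print(result)
```

If the input keeps the line breaks from a triple-quoted string, those newlines stay in the output, so the text would need to be joined onto one line first.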
Upvotes: 2
Reputation: 12456
You can do it in 2 passes:
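1) Use this regex to detect numbers preceded by nothing or a space: r'((\s|^)\d+)'
and replace each match using a backreference to the first matching group: ',\1'
tested on https://regex101.com/r/7yxSaj/1/
2) Use this regex to detect the first uppercase letter of each word and convert it to lower case: r'\b([A-Z])'
with the replacement string '\L\1' (\L is a Perl-style case-conversion escape supported by some regex engines; Python's re module does not accept it in replacement strings, so in Python the lowercasing needs a replacement function or str.lower())
tested on https://regex101.com/r/dNuYhG/1/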
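A rough Python translation of the two passes could look like the following. Two things differ from the patterns above, both my own assumptions: the `^` alternative is dropped so the first number does not get a leading comma, and the `\L\1` replacement is swapped for a replacement function, because Python's `re` rejects `\L`. The shortened `text` is only for illustration, and this sketch does not add the comma before "and".

```python
import re

# Shortened sample of the lyrics, used only for illustration.
text = "12 Drummers Drumming 11 Pipers Piping 10 Lords a Leaping"

# Pass 1: put a comma before every number that follows whitespace.
with_commas = re.sub(r'(\s\d+)', r',\1', text)

# Pass 2: lowercase the first letter of each capitalised word.
# The lambda receives each match object and returns its replacement text.
lowered = re.sub(r'\b([A-Z])', lambda m: m.group(1).lower(), with_commas)

print(lowered)
```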
Upvotes: 1