Reputation: 121
I want to use the sub function to format the string "Ross McFluff: 0456-45324: 155 Elm Street\nRonald Heathmore: 5543-23464: 445 Finley Avenue".
For each person it should look like this:
Contact
Name: xx yy
Phone number: 0000-00000
Address: 000 zzz zzz
I tried to resolve the problem:
import re
line = """Ross McFluff: 0456-45324: 155 Elm Street \nRonald Heathmore: 5543-23464: 445 Finley Avenue"""
match = re.sub(r':', r'', line)
rematch = re.sub(r'([A-Z][a-z]+\s[A-Z][a-zA-Z]+)(.*?)(\d\d\d\d-\d\d\d\d\d)', r'Contact. Name: \1. Phone number: \3. Address:\2', match)
I got something like this:
"Contact. Name: Ross McFluff. Phone number: 0456-45324. Address: 155 Elm Street \nContact. Name: Ronald Heathmore. Phone number: 5543-23464. Address: 445 Finley Avenue"
How can I get this result instead:
Contact
Name: Ross McFluff
Phone number: 0456-45324
Address: 155 Elm Street
Contact
Name: Ronald Heathmore
Phone number: 5543-23464
Address: 445 Finley Avenue
Any ideas? Thanks. /Georges
Upvotes: 0
Views: 44
Reputation: 5274
I would toss a split in there like this:
import re
data = """Ross McFluff: 0456-45324: 155 Elm Street \nRonald Heathmore: 5543-23464: 445 Finley Avenue"""
linelist = data.split("\n")
for theline in linelist:
    # Three colon-separated fields per line: name, phone number, address
    rematch = re.sub(r'([^:]+): ([^:]+): (.*)', r'Contact\nName: \1\nPhone Number: \2\nAddress: \3', theline)
    print(rematch)
results:
Contact
Name: Ross McFluff
Phone Number: 0456-45324
Address: 155 Elm Street
Contact
Name: Ronald Heathmore
Phone Number: 5543-23464
Address: 445 Finley Avenue
That way you can easily process each "line". I really like using stuff like:
([^:]+)
That's a negated character class: it matches any character that is NOT in the class, which here means everything up to the next colon, and that's really what you are after. I suppose you could also just split on the colons, but a regex like this may give you more control. You may have to play around with strip() to make sure all the whitespace is cleaned up (the first address above still carries a trailing space); it really depends on what you are doing with the data.
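For comparison, here is a rough sketch of the split-on-colons route with strip() doing the cleanup. The helper name format_contacts is made up for this example, and it assumes every non-blank line has exactly the three fields name, phone, address; a line with fewer fields would raise a ValueError.

```python
def format_contacts(data):
    """Format 'name: phone: address' lines into labelled contact blocks."""
    blocks = []
    for line in data.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # maxsplit=2 keeps any colon inside the address in the last field;
        # strip() removes the padding around each field
        name, phone, address = (part.strip() for part in line.split(":", 2))
        blocks.append(
            f"Contact\nName: {name}\nPhone number: {phone}\nAddress: {address}"
        )
    return "\n".join(blocks)


data = "Ross McFluff: 0456-45324: 155 Elm Street \nRonald Heathmore: 5543-23464: 445 Finley Avenue"
print(format_contacts(data))
```

No regex is involved here, so you give up the validation a pattern could do (for instance checking the phone number format), but the whitespace handling is easier to reason about.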
If you need to go with a pure regex solution, it can be done by just fiddling around on here: https://regex101.com/
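As one possible shape for such a pure regex version, here is a sketch that does the whole string in a single re.sub call. It assumes each record sits on its own line and that neither the name nor the phone number contains a colon; the pattern is my own guess at a reasonable starting point, not the only way to write it.

```python
import re

data = "Ross McFluff: 0456-45324: 155 Elm Street \nRonald Heathmore: 5543-23464: 445 Finley Avenue"

# [^:\n]+?  a field with no colon and no newline, lazy so trailing blanks stay out
# [ \t]*    optional padding around the separators and at the end of a line
# re.MULTILINE makes ^ and $ match at every line, not only at the string ends
pattern = re.compile(
    r"^[ \t]*([^:\n]+?)[ \t]*:[ \t]*([^:\n]+?)[ \t]*:[ \t]*([^\n]*?)[ \t]*$",
    re.MULTILINE,
)

# \n in the replacement template is turned into a real newline by re.sub
result = pattern.sub(r"Contact\nName: \1\nPhone number: \2\nAddress: \3", data)
print(result)
```

Because the padding is matched outside the groups, no separate strip() pass is needed.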
Upvotes: 1
Reputation: 427
I tend to prefer the repetition quantifier ({4}, {5}) when I can. I am not sure how your first attempt came back correct; I am assuming that was just a fluke, since \2 in your pattern captures the text between the name and the number, not the address. Below is a pattern that should work. Your values will be \1, \3, and \5 for name, number, and address. It should read the address through to the end of each line. (I used a generic regex tester for this.)
([A-Z][a-z]+\s[A-Z][a-zA-Z]+)(.*?)(\d{4}-\d{5})(.*?)([\w+ ]+)
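To show how that pattern could be plugged into re.sub, here is a small sketch. The helper to_contact is a name invented for this example; it strips group 5 because that group picks up the space after the colon (and any trailing space). Note the assumptions baked into the pattern: names are exactly two capitalized words, and the address contains only word characters, plus signs, and spaces, so an address with a comma or period would be cut short.

```python
import re

data = "Ross McFluff: 0456-45324: 155 Elm Street \nRonald Heathmore: 5543-23464: 445 Finley Avenue"

# Groups: 1 = name, 3 = phone number, 5 = address (2 and 4 soak up the separators)
pattern = re.compile(
    r"([A-Z][a-z]+\s[A-Z][a-zA-Z]+)(.*?)(\d{4}-\d{5})(.*?)([\w+ ]+)"
)


def to_contact(match):
    """Build one contact block from a match; hypothetical helper."""
    return (
        f"Contact\nName: {match.group(1)}\n"
        f"Phone number: {match.group(3)}\n"
        f"Address: {match.group(5).strip()}"
    )


result = pattern.sub(to_contact, data)
print(result)
```

Passing a function instead of a template string is what makes the strip() possible; with a plain r'...\5' template the address would keep its leading space.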
Upvotes: 0