Reputation: 51
I am extremely new to Python. I often get text files that have phone numbers in various formats. I am trying to create a Python script that takes such a text file and normalizes the numbers into a format I can use.
I am trying to remove all symbols and spaces and leave just the digits, and then add +1 to the beginning and a comma (,) at the end of each number.
import re
with open("test_numbers.txt") as file:
    dirty = file.read()
# Removes every non-digit character, including the newlines between numbers
clean = re.sub(r'[^0-9]', '', dirty)
print(clean)
I'm trying to use regex, but it puts everything on a single line. Maybe I am going about this all wrong. I have not worked out a way to add the +1 to the beginning of each number or a comma at the end. I would appreciate any advice.
Upvotes: 5
Views: 1907
Reputation: 1870
This might help you:
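import re
with open('test_numbers.txt') as f:
    dirty = f.readlines()  # one list entry per line of the file
clean = []
for l in dirty:
    # Strip every non-digit, then wrap the digits in '+1' and ','
    clean.append('+1{},\n'.format(re.sub(r'[^0-9]', '', l)))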
clean will be a list of lines with +1 at the beginning and , at the end. You may then save it to a text file with:
with open('formatted_numbers.txt', 'w') as f:
    f.writelines(clean)
You can also do the same thing in one line with a list comprehension:
clean = ['+1{},\n'.format(re.sub(r'[^0-9]', '', l)) for l in dirty]
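One thing to watch for: processing the file line by line keeps the numbers separate, but a blank line would still produce a bare +1, entry, and a number that already starts with the country code would end up as +11.... Below is a minimal sketch of one way to handle both cases. The helper name normalize_numbers and the sample data are made up for illustration, and the rule "11 digits starting with 1 means the country code is already there" is an assumption about your data, so adjust it to what your files actually contain.

```python
import re


def normalize_numbers(lines):
    """Return a list of '+1<digits>,' strings, one per line that contains digits.

    Hypothetical helper for illustration only.
    """
    cleaned = []
    for line in lines:
        digits = re.sub(r'[^0-9]', '', line)  # keep only the digits
        if not digits:
            continue  # skip blank lines and lines without any digits
        # Assumption: an 11-digit number starting with 1 already
        # includes the country code, so drop that leading 1.
        if len(digits) == 11 and digits.startswith('1'):
            digits = digits[1:]
        cleaned.append('+1{},'.format(digits))
    return cleaned


# Made-up sample input standing in for the lines of test_numbers.txt
sample = ['(555) 123-4567\n', '\n', '1-555-987-6543\n', '555.222.3333']
result = normalize_numbers(sample)
print('\n'.join(result))
```

To use it on a real file, pass the list returned by f.readlines() and write the result back out with '\n'.join(...), since this version does not append the newline itself.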
Upvotes: 2