Reputation: 317
I am merging a few datasets that cover over 200 countries. While cleaning the data, I need to convert each country's three-letter code into its full name.
The three-letter codes and country full names come from a separate CSV file, which shows a slightly different set of countries.
My question is: Is there a better way to write this?
str.replace("USA", "United States of America")
str.replace("CAN", "Canada")
str.replace("BHM", "Bahamas")
str.replace("CUB", "Cuba")
str.replace("HAI", "Haiti")
str.replace("DOM", "Dominican Republic")
str.replace("JAM", "Jamaica")
and so on. It goes on for another 200 rows. Thank you!
Upvotes: 0
Views: 137
Reputation: 4224
Since the number of substitutions is high, I would instead iterate over the words in the string and replace each one via a dictionary lookup.
mapofcodes = {'USA': 'United States of America', ....}
# Look up each word; words that are not codes are kept unchanged
words = [mapofcodes.get(word, word) for word in mystring.split()]
finalstr = ' '.join(words)
Upvotes: 1
Reputation: 1499
Here's a regular-expression solution:
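import re

COUNTRIES = {'USA': 'United States of America', 'CAN': 'Canada'}

def repl(m):
    # Fall back to the matched text when the code is not in the dictionary
    country_code = m.group(1)
    return COUNTRIES.get(country_code, country_code)

# \b keeps the pattern from matching inside longer uppercase words
p = re.compile(r'\b([A-Z]{3})\b')
my_string = p.sub(repl, my_string)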
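A possible variant, sketched under the assumption that only codes actually present in the dictionary should ever match: build the pattern from the dictionary keys instead of matching any three capital letters. The sample mapping, the `expand_codes` helper, and the test sentence are made up for illustration.

```python
import re

# Hypothetical sample mapping; the real one would hold all of the codes
COUNTRIES = {"USA": "United States of America", "CAN": "Canada", "CUB": "Cuba"}

# Alternation of the escaped keys, wrapped in word boundaries so that
# a code is not matched inside a longer word such as "CANADA"
pattern = re.compile(r"\b(" + "|".join(map(re.escape, COUNTRIES)) + r")\b")

def expand_codes(text):
    """Replace every known country code in text with its full name."""
    return pattern.sub(lambda m: COUNTRIES[m.group(1)], text)

result = expand_codes("USA and CAN signed; NGO and CANADA untouched")
print(result)
```

Because the pattern can only match known keys, the replacement function can index the dictionary directly, and other three-letter uppercase words pass through untouched.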
Upvotes: 0
Reputation: 3996
Try reading the CSV file into a dictionary or a 2D array; you can then look up whichever value you want.
That is, if I understand your question correctly.
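A minimal sketch of the dictionary option, assuming the CSV has a header row with columns named `code` and `name`. Those column names, the `load_country_map` helper, and the in-memory sample data are all hypothetical; adjust them to the real file.

```python
import csv
import io

def load_country_map(csv_file, code_col="code", name_col="name"):
    """Build a {code: full name} dict from an open CSV file object."""
    reader = csv.DictReader(csv_file)
    return {row[code_col].strip(): row[name_col].strip() for row in reader}

# Stand-in for the real file; in practice pass open(path, newline="")
sample = io.StringIO("code,name\nUSA,United States of America\nCAN,Canada\n")
country_map = load_country_map(sample)

codes = ["USA", "CAN", "XYZ"]
# Codes missing from the CSV are left unchanged
names = [country_map.get(c, c) for c in codes]
print(names)
```

Using `.get(c, c)` matters here because the CSV covers a slightly different set of countries, so some codes will have no entry.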
Upvotes: 0