Reputation: 505
change all city name into one universal name.
City b c
0 New york 1 1
1 New York 2 2
2 N.Y. 3 3
3 NY 4 4
They call refer to the city New york however python sees them as separate entity therefore I've changed all into one.
df["City"] = df["City"].replace({"N.Y.":"New york", "New York": "New york", "NY": "New york"})
After this I need to check if all variation of new york is covered, to do that I've created a function
def universal_ok(universal_name):
count = 0
for c in df.City:
if c == universal_name:
count += 1
# This only works when column consists of only one type of city
if count == len(df.City):
return "Yes all names are formatted correctly"
else:
return f"there are {len(df.City) - count} names that need to be changed"
universal_ok("New york")
but the problem is what about when there are more than one city in a column
City b c
0 New york 1 1
1 New York 2 2
2 N.Y. 3 3
3 NY 4 4
4 Toronto 3 2
5 TO 3 2
6 toronto 3 2
is there a way to change each city to universal name?
Upvotes: 0
Views: 742
Reputation: 62513
City b c
New york 1 1
New York 2 2
N.Y. 3 3
NY 4 4
Toronto 3 2
TO 3 2
toronto 3 2
pandas.Series.str.lower
df.City = df.City.str.lower()
City b c
new york 1 1
new york 2 2
n.y. 3 3
ny 4 4
toronto 3 2
to 3 2
toronto 3 2
pandas.Series.unique
df.City.unique()
array(['new york', 'n.y.', 'ny', 'toronto', 'to'], dtype=object)
pandas.Series.map
cities_tup = (('New York', ['ny', 'n.y.', 'new york']),
('Toronto', ['toronto', 'to']))
cities_map = {y:x[0] for x in cities_tup for y in x[1]}
{'ny': 'New York',
'n.y.': 'New York',
'new york': 'New York',
'toronto': 'Toronto',
'to': 'Toronto'}
df.City = df.City.map(cities_map)
City b c
New York 1 1
New York 2 2
New York 3 3
New York 4 4
Toronto 3 2
Toronto 3 2
Toronto 3 2
pandas.Series.value_counts
df.City.value_counts()
New York 4
Toronto 3
Name: City, dtype: int64
Upvotes: 1
Reputation: 67879
You need a specific column with some sort of city id, otherwise you won’t be able to distinguish between Paris, France and Paris, Texas, nor will you be able to group Istanbul and Constantinople.
Upvotes: 0