Reputation: 922
I am trying to find the averages for the values of a dictionary by city. For the purposes of this exercise I cannot use numpy or pandas.
Here is some example data:
d = {
('Chicago', 2006): 23.4,
('Chicago', 2007): 73.4,
('Dallas', 2008): 70.8,
('Paris', 2010): 5.6,
('Paris', 2011): 63.3
}
Here is the ideal output:
city_averages = {
'Chicago': 48.4,
'Dallas': 70.8,
'Paris': 34.45
}
Here is the code I tried.
city_averages = {}
total = 0
for k, v in d.items():
    total += float(v)
    city_averages[k[0]] = total
Upvotes: 2
Views: 166
Reputation: 8508
You can do something simpler, like this:
d = {
('Chicago', 2006): 23.4,
('Chicago', 2007): 73.4,
('Dallas', 2008): 70.8,
('Paris', 2010): 5.6,
('Paris', 2011): 63.3,
('Paris', 2012): 100.4
}
dnew = {}
for k, v in d.items():
    if k[0] in dnew:
        dnew[k[0]] += v
    else:
        dnew[k[0]] = v
print(dnew)
You will get output as follows:
{'Chicago': 96.80, 'Dallas': 70.8, 'Paris': 169.3}
You will need to format the values before you print them.
I will leave you to figure out the logic for finding the average. This should get you closer to the full answer.
Answer with average calculation:
Here is the code with the average calculation included. It keeps a running sum and a count for each city, then divides the sum by the count.
dnew = {}
dcnt = {}
for k, v in d.items():
    dnew[k[0]] = dnew.get(k[0], 0) + v
    dcnt[k[0]] = dcnt.get(k[0], 0) + 1
for k in dnew:
    dnew[k] /= dcnt[k]
print(dnew)
The output will be as follows:
{'Chicago': 48.400000000000006, 'Dallas': 70.8, 'Paris': 56.43333333333334}
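To trim the long floating-point tails in that output, one option is to round each average as the result is built. This is only a minimal sketch: the helper name `average_by_city` and the default of two decimal places are assumptions made for illustration, and the data is the five-entry dictionary from the question.

```python
def average_by_city(data, ndigits=2):
    """Return {city: mean of its values}, rounded to ndigits decimals.

    Hypothetical helper: keys are assumed to be (city, year) tuples.
    """
    totals = {}
    counts = {}
    for (city, _year), value in data.items():
        totals[city] = totals.get(city, 0) + value   # running sum per city
        counts[city] = counts.get(city, 0) + 1       # number of entries per city
    return {city: round(totals[city] / counts[city], ndigits) for city in totals}


d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3,
}
print(average_by_city(d))
```

Rounding only affects how the stored value looks; if you need the unrounded averages for further arithmetic, keep them as they are and format them only when printing.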
Upvotes: 2
Reputation: 16747
Below are two one-liner versions: a simple one using itertools.groupby,
and a more complex one that uses no extra modules.
import itertools
d = {
('Chicago', 2006): 23.4,
('Chicago', 2007): 73.4,
('Dallas', 2008): 70.8,
('Paris', 2010): 5.6,
('Paris', 2011): 63.3,
}
print({k : sum(e[1] for e in lg) / len(lg) for k, g in itertools.groupby(sorted(d.items()), lambda e: e[0][0]) for lg in (list(g),)})
The next one-liner uses no modules at all (not even itertools), just plain Python. It has the same time complexity as the itertools.groupby version above, since both start by sorting the items. It is meant for recreation, or for cases where you really need a one-liner with no imports:
d = {
('Chicago', 2006): 23.4,
('Chicago', 2007): 73.4,
('Dallas', 2008): 70.8,
('Paris', 2010): 5.6,
('Paris', 2011): 63.3,
}
print({k : sm / cnt for sd in (sorted(d.items()),) for i in range(len(sd)) for k, cnt, sm in ((sd[i][0][0] if i + 1 >= len(sd) or sd[i][0][0] != sd[i + 1][0][0] else None,) + ((1, sd[i][1]) if i == 0 or sd[i - 1][0][0] != sd[i][0][0] else (cnt + 1, sm + sd[i][1])),) if k is not None})
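If either one-liner is hard to follow, the groupby version can be unrolled into a few ordinary statements. The sketch below is meant to be equivalent to it, not a replacement; the names `by_city` and `values` are made up here for readability.

```python
import itertools

d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3,
}

city_averages = {}
# groupby only merges adjacent items, so sort by city first.
by_city = sorted(d.items(), key=lambda item: item[0][0])
for city, group in itertools.groupby(by_city, key=lambda item: item[0][0]):
    # group is a one-shot iterator of ((city, year), value) pairs.
    values = [value for _key, value in group]
    city_averages[city] = sum(values) / len(values)

print(city_averages)
```

The sort is what makes groupby safe here: without it, a city whose entries were not adjacent would show up as several separate groups.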
Upvotes: 0
Reputation: 324
There is a very similar question on this site.
In your case, the code would be as follows:
from collections import defaultdict
import statistics
d = {
('Chicago', 2006): 23.4,
('Chicago', 2007): 73.4,
('Dallas', 2008): 70.8,
('Paris', 2010): 5.6,
('Paris', 2011): 63.3
}
grouper = defaultdict(list)
for k, v in d.items():
    grouper[k[0]].append(v)
city_averages = {k: statistics.mean(v) for k,v in grouper.items()}
print(city_averages)
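The question only rules out numpy and pandas, so the standard-library imports above should be acceptable. If imports of any kind are off limits, the same group-then-average idea can be sketched with plain dict methods; the name `grouped` is an assumption for this example.

```python
d = {
    ('Chicago', 2006): 23.4,
    ('Chicago', 2007): 73.4,
    ('Dallas', 2008): 70.8,
    ('Paris', 2010): 5.6,
    ('Paris', 2011): 63.3,
}

grouped = {}
for (city, _year), value in d.items():
    # setdefault creates the list the first time a city is seen.
    grouped.setdefault(city, []).append(value)

# Every list has at least one value, so len(values) is never zero.
city_averages = {city: sum(values) / len(values) for city, values in grouped.items()}
print(city_averages)
```

Here `setdefault` plays the role of `defaultdict(list)`, and `sum(values) / len(values)` stands in for `statistics.mean`.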
Upvotes: 2