Reputation: 21
mylist = [(0.8132195134810816, 'A'), (0.79314903781799, 'B'), (0.3931539216409497, 'A'), (0.23487952756579994, 'B'), (0.06686513021322447, 'C'), (0.008103227303653366, 'C'), (0.007403104126575008, 'D'), (-0.0041128367759631496, 'D'), (-0.005739579154553378, 'D'), (-0.008074572907817046, 'B')]
# I've tried a few conversions. I can do this with a for loop, but I'd like to know whether
# there's a way to do it with a dictionary comprehension. Of course, I can build a regular
# dictionary, but I was hoping for a series of filter one-liners.
newdict = dict()
for symbol in ['A', 'B', 'C', 'D']:  # semesters
    values = [item for item, symbol_item in mylist if symbol_item == symbol]
    print(symbol, sum(values) / len(values))
    newdict[symbol] = sum(values) / len(values)
# I am hoping there is a way to do this without listing the symbols.
# I tried a defaultdict to turn each key's value into a list, but that didn't work.
from collections import defaultdict
mydict = defaultdict(list)
mydict.update({key: (mydict[key] + [value]) for value, key in mylist})
Upvotes: 2
Views: 42
Reputation: 96257
You can do this, but it's always going to be ugly. In Python 3.8+, you can use an assignment expression to assign to values:
>>> mylist = [(0.8132195134810816, 'A'), (0.79314903781799, 'B'), (0.3931539216409497, 'A'), (0.23487952756579994, 'B'), (0.06686513021322447, 'C'), (0.008103227303653366, 'C'), (0.007403104126575008, 'D'), (-0.0041128367759631496, 'D'), (-0.005739579154553378, 'D'), (-0.008074572907817046, 'B')]
>>> result = {
...     symbol: sum((values := [item for item, symbol_item in mylist if symbol_item == symbol])) / len(values)
...     for symbol in ['A', 'B', 'C', 'D']
... }
>>> result
{'A': 0.6031867175610156, 'B': 0.33998466415865763, 'C': 0.03748417875843892, 'D': -0.0008164372679805064}
But this is really confusing and unreadable. You shouldn't be striving to cram your code into one-liners; that's bad. You should instead try to write readable, efficient, and maintainable code.
Comprehension constructs sometimes make your code more readable, and that is their main advantage. If that isn't the case, as here, you shouldn't use them.
Note that without assignment expressions, you'd have to rely on another for-clause to assign to values:
>>> result = {
...     symbol: sum(values) / len(values)
...     for symbol in ['A', 'B', 'C', 'D']
...     for values in ([item for item, symbol_item in mylist if symbol_item == symbol],)
... }
>>> result
{'A': 0.6031867175610156, 'B': 0.33998466415865763, 'C': 0.03748417875843892, 'D': -0.0008164372679805064}
But really, that adds no clarity compared to a regular for-loop.
You could also iterate over:
[item for item, symbol_item in mylist if symbol_item == symbol]
twice, once to get the sum and again to get the length, but I won't even write out that insanity.
Now, the best way to do this, IMO, is the grouping idiom. Your code stays linear-time, and you don't even need to know the symbols ahead of time:
>>> from collections import defaultdict
>>> result = defaultdict(list)
>>> for value, symbol in mylist:
...     result[symbol].append(value)
...
>>> result = {symbol: sum(values)/len(values) for symbol, values in result.items()}
>>> result
{'A': 0.6031867175610156, 'B': 0.33998466415865763, 'C': 0.03748417875843892, 'D': -0.0008164372679805064}
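If you need this in more than one place, the grouping idiom above could be wrapped in a small helper. This is only a sketch: the function name mean_by_symbol and the sample data are made up for illustration, and statistics.mean is swapped in for sum(values)/len(values).

```python
from collections import defaultdict
from statistics import mean


def mean_by_symbol(pairs):
    """Group (value, symbol) pairs by symbol and return {symbol: mean of its values}.

    Hypothetical helper name; it is not part of the original question or answer.
    """
    groups = defaultdict(list)
    for value, symbol in pairs:
        groups[symbol].append(value)  # single pass over the input
    return {symbol: mean(values) for symbol, values in groups.items()}


# Made-up sample data with easy-to-check averages.
sample = [(1.0, 'A'), (3.0, 'A'), (10.0, 'B')]
averages = mean_by_symbol(sample)
print(averages)  # {'A': 2.0, 'B': 10.0}
```

Because the symbols are discovered while iterating, an empty input simply yields an empty dict instead of dividing by zero.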
Upvotes: 0
Reputation: 27515
You can use itertools.groupby and statistics.mean. Just make sure the input is sorted by the letters. Here I used operator.itemgetter to get the numbers and letters on the fly:
from itertools import groupby
from statistics import mean
from operator import itemgetter
mylist = [(0.8132195134810816, 'A'), (0.79314903781799, 'B'), (0.3931539216409497, 'A'), (0.23487952756579994, 'B'), (0.06686513021322447, 'C'), (0.008103227303653366, 'C'), (0.007403104126575008, 'D'), (-0.0041128367759631496, 'D'), (-0.005739579154553378, 'D'), (-0.008074572907817046, 'B')]
get_key = itemgetter(1)
get_value = itemgetter(0)
sorted_list = sorted(mylist, key=get_key)
newdict = {k: mean(map(get_value, g)) for k, g in groupby(sorted_list, get_key)}
print(newdict)
{'A': 0.6031867175610156, 'B': 0.33998466415865763, 'C': 0.03748417875843892, 'D': -0.0008164372679805064}
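To see why the sorting step matters, here is a small sketch on made-up data (the pairs list is an assumption, not the data from the question). groupby only merges adjacent items with equal keys, so on unsorted input a symbol can show up as several groups, and the dict comprehension silently keeps only the last one.

```python
from itertools import groupby
from operator import itemgetter
from statistics import mean

# Made-up data that is NOT sorted by symbol: 'A' appears in two separate runs.
pairs = [(1.0, 'A'), (10.0, 'B'), (3.0, 'A')]
get_key = itemgetter(1)
get_value = itemgetter(0)

# Without sorting, groupby yields 'A', 'B', 'A'; the second 'A' group overwrites the first.
unsorted_result = {k: mean(map(get_value, g)) for k, g in groupby(pairs, get_key)}

# Sorting by the key first puts all 'A' items into a single run.
sorted_pairs = sorted(pairs, key=get_key)
sorted_result = {k: mean(map(get_value, g)) for k, g in groupby(sorted_pairs, get_key)}

print(unsorted_result)  # {'A': 3.0, 'B': 10.0}
print(sorted_result)    # {'A': 2.0, 'B': 10.0}
```

Note that the sort makes this approach O(n log n), whereas grouping into a dict of lists needs no sorting.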
Upvotes: 2