Reputation: 2689
My dictionary looks like this:
docScores = {0:[{u'word':2.3},{u'the':8.7},{u'if':4.1},{u'Car':1.7}],
1:[{u'friend':1.2},{u'a':5.2},{u'you':3.8},{u'person':0.8}],
...
29:[{u'yard':1.5},{u'gardening':2.8},{u'paint':3.7},{u'brush':1.6}]
}
I want to sum the values of the inner dicts in each list and store the totals in a new dict keyed by document number, e.g. {0: 2.3+8.7+4.1+1.7, 1: 1.2+5.2+3.8+0.8, ...}
i.e.
docSum = {}
total = 0                    # running total for document 0
for d in docScores[0]:       # each one-entry dict, e.g. {u'word': 2.3}
    for x in d.values():     # the score, e.g. 2.3
        total = total + x
docSum[0] = total
# ...then repeat this loop for every document
Any variation that I have tried is giving me unexpected outputs. Can anyone give me the correct syntax for this?
Upvotes: 2
Views: 6087
Reputation: 103744
This dict comprehension works:
docScores = {0:[{u'word':2.3},{u'the':8.7},{u'if':4.1},{u'Car':1.7}],
1:[{u'friend':1.2},{u'a':5.2},{u'you':3.8},{u'person':0.8}],
29:[{u'yard':1.5},{u'gardening':2.8},{u'paint':3.7},{u'brush':1.6}]
}
sum_d={k:sum(d.values()[0] for d in v) for k,v in docScores.items()}
print sum_d
Prints:
{0: 16.8, 1: 11.0, 29: 9.6}
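Note that `d.values()[0]` and the `print` statement above are Python 2 syntax; on Python 3, `values()` returns a view that cannot be indexed. A minimal sketch of the same idea that should run on either version, assuming (as in the sample data) that every inner dict holds exactly one entry:

```python
docScores = {
    0: [{'word': 2.3}, {'the': 8.7}, {'if': 4.1}, {'Car': 1.7}],
    1: [{'friend': 1.2}, {'a': 5.2}, {'you': 3.8}, {'person': 0.8}],
    29: [{'yard': 1.5}, {'gardening': 2.8}, {'paint': 3.7}, {'brush': 1.6}],
}

# next(iter(...)) takes the first (here: only) value of each inner dict,
# mirroring d.values()[0] without indexing a view object.
sum_d = {k: sum(next(iter(d.values())) for d in v)
         for k, v in docScores.items()}
print(sum_d)
```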
However, changing your data structure may be easier. You could have a dict of dicts:
>>> NdocScores = {0:{u'word':2.3,u'the':8.7,u'if':4.1,u'Car':1.7},
... 1:{u'friend':1.2,u'a':5.2,u'you':3.8,u'person':0.8},
... 29:{u'yard':1.5,u'gardening':2.8,u'paint':3.7,u'brush':1.6}
... }
Which allows each doc's data to be accessed directly:
>>> NdocScores[0]
{u'Car': 1.7, u'the': 8.7, u'word': 2.3, u'if': 4.1}
>>> NdocScores[0][u'Car']
1.7
>>> sum(NdocScores[1].values())
11.0
>>> NdocScores[29]
{u'gardening': 2.8, u'yard': 1.5, u'brush': 1.6, u'paint': 3.7}
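If the data already exists in the original list-of-dicts form, it can be converted rather than retyped. A rough sketch of one way to do that, assuming no word appears twice within the same document (a repeated word would keep only its last score):

```python
docScores = {
    0: [{'word': 2.3}, {'the': 8.7}, {'if': 4.1}, {'Car': 1.7}],
    1: [{'friend': 1.2}, {'a': 5.2}, {'you': 3.8}, {'person': 0.8}],
    29: [{'yard': 1.5}, {'gardening': 2.8}, {'paint': 3.7}, {'brush': 1.6}],
}

# Merge each document's list of one-entry dicts into a single dict.
NdocScores = {}
for doc_id, word_dicts in docScores.items():
    merged = {}
    for d in word_dicts:
        merged.update(d)  # later duplicates would overwrite earlier ones
    NdocScores[doc_id] = merged

print(NdocScores[0]['Car'])  # 1.7
```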
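Or, if the doc keys are consecutive integers starting at 0, just have a list of dicts with the position in the list corresponding to the doc index. The items are sorted by key first, since dict order is not guaranteed; note that with the sample keys, doc 29 ends up at position 2:
>>> lofdicts=[v for k,v in sorted(NdocScores.items())]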
>>> lofdicts
[{u'Car': 1.7, u'the': 8.7, u'word': 2.3, u'if': 4.1}, {u'a': 5.2, u'person': 0.8, u'you': 3.8, u'friend': 1.2}, {u'gardening': 2.8, u'yard': 1.5, u'brush': 1.6, u'paint': 3.7}]
>>> lofdicts[0]
{u'Car': 1.7, u'the': 8.7, u'word': 2.3, u'if': 4.1}
>>> sum(lofdicts[1].values())
11.0
Upvotes: 3
Reputation: 309841
new_dict={}
docScores = {0:[{u'word':2.3},{u'the':8.7},{u'if':4.1},{u'Car':1.7}],
1:[{u'friend':1.2},{u'a':5.2},{u'you':3.8},{u'person':0.8}],
29:[{u'yard':1.5},{u'gardening':2.8},{u'paint':3.7},{u'brush':1.6}]
}
for k,v in docScores.items():
new_dict[k]=sum( sum(d.values()) for d in v )
print (new_dict) #{0: 16.8, 1: 11.0, 29: 9.6}
As others have mentioned, you could make this into a dictionary comprehension (Python 2.7+):
new_dict = {k : sum( sum(d.values()) for d in v ) for k,v in docScores.items() }
But at this point I think that the comprehension is getting very difficult to comprehend (and therefore I wouldn't do it).
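One middle ground is to move the inner sum into a small named function so the comprehension stays short. A sketch of that idea; the helper name `doc_total` is made up for illustration:

```python
docScores = {
    0: [{'word': 2.3}, {'the': 8.7}, {'if': 4.1}, {'Car': 1.7}],
    1: [{'friend': 1.2}, {'a': 5.2}, {'you': 3.8}, {'person': 0.8}],
    29: [{'yard': 1.5}, {'gardening': 2.8}, {'paint': 3.7}, {'brush': 1.6}],
}

def doc_total(word_dicts):
    """Return the sum of every score in one document's list of dicts."""
    return sum(score for d in word_dicts for score in d.values())

# The comprehension now reads as "key -> total for that document".
new_dict = {k: doc_total(v) for k, v in docScores.items()}
print(new_dict)
```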
Also, someone should probably point out that if all your dictionary keys are sequential integers from 0 to 29, you probably shouldn't be using a dictionary to store this data; a list may be more appropriate.
EDIT
using a list:
new_list = [sum( sum(d.values()) for d in v ) for _,v in sorted(docScores.items()) ]
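One caveat: a list only lines up with the document numbers if the keys really are 0, 1, ..., n-1. The sample data here uses keys 0, 1 and 29, so document 29 would land at index 2. A sketch of a guard for that case; the function name `totals_as_list` is hypothetical:

```python
docScores = {
    0: [{'word': 2.3}, {'the': 8.7}, {'if': 4.1}, {'Car': 1.7}],
    1: [{'friend': 1.2}, {'a': 5.2}, {'you': 3.8}, {'person': 0.8}],
    29: [{'yard': 1.5}, {'gardening': 2.8}, {'paint': 3.7}, {'brush': 1.6}],
}

def totals_as_list(scores):
    """Return per-document totals as a list; raise if the keys have gaps."""
    keys = sorted(scores)
    if keys != list(range(len(keys))):
        raise ValueError("keys are not 0..n-1; list positions would not match doc ids")
    return [sum(sum(d.values()) for d in scores[k]) for k in keys]

try:
    new_list = totals_as_list(docScores)
except ValueError as exc:
    print(exc)  # raised for the sample data because key 29 follows key 1
```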
Upvotes: 2
Reputation: 110208
docScores = {0:[{u'word':2.3},{u'the':8.7},{u'if':4.1},{u'Car':1.7}],
1:[{u'friend':1.2},{u'a':5.2},{u'you':3.8},{u'person':0.8}],
2:[{u'yard':1.5},{u'gardening':2.8},{u'paint':3.7},{u'brush':1.6}]
}
result = dict(enumerate(
    sum(sum(word.values()) for word in word_list)
    for _, word_list in sorted(docScores.items())
))
Upvotes: 0
Reputation: 133514
>>> doc_scores = {
...     0: [{u'word': 2.3}, {u'the': 8.7}, {u'if': 4.1}, {u'Car': 1.7}],
...     1: [{u'friend': 1.2}, {u'a': 5.2}, {u'you': 3.8}, {u'person': 0.8}],
...     29: [{u'yard': 1.5}, {u'gardening': 2.8}, {u'paint': 3.7}, {u'brush': 1.6}]
... }
>>> dict((k, sum(n for d in v for n in d.itervalues()))
...      for k, v in doc_scores.iteritems())
{0: 16.8, 1: 11.0, 29: 9.6}
If you only have one value in each of the dicts in the lists, you can shorten this:
>>> dict((k, sum(d.values()[0] for d in v)) for k, v in doc_scores.iteritems())
{0: 16.8, 1: 11.0, 29: 9.6}
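Because the scores are floats, a plain `sum` can carry tiny rounding errors that sometimes show up in the last printed digits. Whether that matters for these scores is an assumption on my part, but if it does, a sketch using the standard library's `math.fsum` (which tracks partial sums to reduce that error) might look like this:

```python
import math

doc_scores = {
    0: [{'word': 2.3}, {'the': 8.7}, {'if': 4.1}, {'Car': 1.7}],
    1: [{'friend': 1.2}, {'a': 5.2}, {'you': 3.8}, {'person': 0.8}],
    29: [{'yard': 1.5}, {'gardening': 2.8}, {'paint': 3.7}, {'brush': 1.6}],
}

# Flatten every score in a document into one stream and let fsum add it up.
totals = {k: math.fsum(n for d in v for n in d.values())
          for k, v in doc_scores.items()}
print(totals)
```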
Upvotes: 1
Reputation: 7343
One more one-line solution:
sum_d = {k: sum(reduce(lambda x, y: x+y, [d.values() for d in v])) for k, v in docScores.iteritems()}
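On Python 3, `reduce` lives in `functools` and dicts have no `iteritems()`, so the one-liner above needs adjusting there. A sketch of a `reduce`-based version that folds the numbers directly instead of concatenating lists, producing one total per document:

```python
from functools import reduce
import operator

docScores = {
    0: [{'word': 2.3}, {'the': 8.7}, {'if': 4.1}, {'Car': 1.7}],
    1: [{'friend': 1.2}, {'a': 5.2}, {'you': 3.8}, {'person': 0.8}],
    29: [{'yard': 1.5}, {'gardening': 2.8}, {'paint': 3.7}, {'brush': 1.6}],
}

# Flatten each document's scores, then fold them together with addition.
# The 0.0 initializer keeps reduce from failing on an empty document.
sums = {
    k: reduce(operator.add, (n for d in v for n in d.values()), 0.0)
    for k, v in docScores.items()
}
print(sums)
```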
Upvotes: 1