Reputation: 493
I have a dictionary in Python:
d = {tags[0]: value, tags[1]: value, tags[2]: value, tags[3]: value, tags[4]: value}
Imagine that this dict is 10 times bigger, with 50 keys and 50 values. The tags may contain duplicates, but even then every value matters. How can I simply trim it to get a new dict without duplicate keys, but with the sum of their values instead? For example:
d = {'cat': 5, 'dog': 9, 'cat': 4, 'parrot': 6, 'cat': 6}
result
d = {'cat': 15, 'dog': 9, 'parrot': 6}
Upvotes: 3
Views: 9772
Reputation: 1110
This is the perfect situation for using a Counter data structure. Let's take a look at what it does on a few familiar data structures:
>>> from collections import Counter
>>> list_a = ["A", "A", "B", "C", "C", "A", "D"]
>>> list_b = ["B", "A", "B", "C", "C", "C", "D"]
>>> c1 = Counter(list_a)
>>> c2 = Counter(list_b)
>>> c1
Counter({'A': 3, 'C': 2, 'B': 1, 'D': 1})
>>> c2
Counter({'C': 3, 'B': 2, 'A': 1, 'D': 1})
>>> c1 - c2
Counter({'A': 2})
>>> c1 + c2
Counter({'C': 5, 'A': 4, 'B': 3, 'D': 2})
>>> c_diff = c1 - c2
>>> c_diff.update([77, 77, -99, 0, 0, 0])
>>> c_diff
Counter({0: 3, 'A': 2, 77: 2, -99: 1})
As you can see, this behaves like a set that keeps the count of each element's occurrences as its value.
However, a dictionary is itself a set-like structure whose values don't have to be numbers, so things get more interesting:
>>> dic1 = {"A":"a", "B":"b"}
>>> cd = Counter(dic1)
>>> cd
Counter({'B': 'b', 'A': 'a'})
>>> cd.update(B='bB123')
>>> cd
Counter({'B': 'bbB123', 'A': 'a'})
>>> dic2 = {"A":[1,2], "B": ("a", 5)}
>>> cd2 = Counter(dic2)
>>> cd2
Counter({'B': ('a', 5), 'A': [1, 2]})
>>> cd2.update(A=[42], B=(2,2))
>>> cd2
Counter({'B': ('a', 5, 2, 2), 'A': [1, 2, 42, 42, 42, 42]})
>>> cd2 = Counter(dic2)
>>> cd2
Counter({'B': ('a', 5), 'A': [1, 2]})
>>> cd2.update(A=[42], B=("new elem",))
>>> cd2
Counter({'B': ('a', 5, 'new elem'), 'A': [1, 2, 42]})
As we can see, the value we are adding or changing in update has to be of the same type as the existing one, or it throws a TypeError.
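For the situation in the question, we can just go with the flow. Note that the dict literal itself keeps only the last value given for a repeated key, so 'cat' is already 6 before Counter sees it: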
>>> d = {'cat': 5, 'dog': 9, 'cat': 4, 'parrot': 6, 'cat': 6}
>>> cd3 = Counter(d)
>>> cd3
Counter({'dog': 9, 'parrot': 6, 'cat': 6})
>>> cd3.update(parrot=123)
>>> cd3
Counter({'parrot': 129, 'dog': 9, 'cat': 6})
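To actually get the sums the question asks for, the duplicates have to reach the Counter as separate items. A minimal sketch, assuming the data is available as a list of (tag, value) pairs (the name pairs is made up for illustration):

```python
from collections import Counter

# Hypothetical input: (tag, value) pairs, because a dict cannot hold duplicate keys.
pairs = [('cat', 5), ('dog', 9), ('cat', 4), ('parrot', 6), ('cat', 6)]

# The literal from the question keeps only the last 'cat'.
d = {'cat': 5, 'dog': 9, 'cat': 4, 'parrot': 6, 'cat': 6}
print(d['cat'])  # 6

totals = Counter()
for tag, value in pairs:
    # update() with a mapping adds to the stored count instead of replacing it.
    totals.update({tag: value})

print(totals['cat'])  # 15
```

Calling dict(totals) afterwards gives a plain dictionary if the Counter type is not wanted.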
Upvotes: 2
Reputation: 33
If I understand your question correctly and you want to get rid of the duplicate-key data, use the dictionary's update method while building the dictionary. It overwrites the existing value when a key is repeated.
tps = [('cat',5),('dog',9),('cat',4),('parrot',6),('cat',6)]
result = {}
for k, v in tps:
    # A later pair replaces the value stored for the same key.
    result.update({k: v})
for k in result:
    print("%s: %s" % (k, result[k]))
Output will look like: dog: 9 parrot: 6 cat: 6
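For reference, the dict constructor applies the same last-value-wins rule when it is given a list of pairs, so the loop above can also be written as a single call. A small sketch reusing the tps list from this answer; note that this overwrites values and does not sum them:

```python
tps = [('cat', 5), ('dog', 9), ('cat', 4), ('parrot', 6), ('cat', 6)]

# Loop with update(): each later pair overwrites the earlier value for the same key.
result = {}
for k, v in tps:
    result.update({k: v})

# dict() over the pairs follows the same rule, so both have identical contents.
shortcut = dict(tps)

print(result == shortcut)  # True
print(shortcut['cat'])     # 6
```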
Upvotes: 0
Reputation: 6146
Instead of building a dict directly (a dict can't hold multiple entries with the same key), I assume you can have the data as a list of tuple pairs. Then it is as easy as:
tps = [('cat',5),('dog',9),('cat',4),('parrot',6),('cat',6)]
result = {}
for k,v in tps:
    try:
        result[k] += v
    except KeyError:
        result[k] = v
>>> result
{'dog': 9, 'parrot': 6, 'cat': 15}
I changed mine to use more explicit try-except handling. Alfe's is very concise, though.
Upvotes: 2
Reputation: 726
I'm not sure what you're trying to achieve, but the Counter class might be helpful for what you're trying to do: http://docs.python.org/dev/library/collections.html#collections.Counter
Upvotes: 1
Reputation: 86188
from collections import defaultdict
tps = [('cat',5),('dog',9),('cat',4),('parrot',6),('cat',6)]
dicto = defaultdict(int)
for k,v in tps:
    dicto[k] += v
Result:
>>> dicto
defaultdict(<type 'int'>, {'dog': 9, 'parrot': 6, 'cat': 15})
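One thing to keep in mind with defaultdict: merely reading a missing key inserts it with the default value. A short sketch of that behaviour, and of converting to a plain dict once the totals are computed (the 'hamster' key is only an illustration):

```python
from collections import defaultdict

tps = [('cat', 5), ('dog', 9), ('cat', 4), ('parrot', 6), ('cat', 6)]

dicto = defaultdict(int)
for k, v in tps:
    dicto[k] += v

# Copy the totals into a regular dict so later lookups cannot add keys to it.
totals = dict(dicto)

# Reading a missing key on the defaultdict creates it with int() == 0.
missing = dicto['hamster']
print(missing)              # 0
print('hamster' in dicto)   # True
print('hamster' in totals)  # False
```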
Upvotes: 6
Reputation: 15256
Perhaps what you really want is a list of key-value tuples:
[('dog',1), ('cat',2), ('cat',3)]
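If the tags and their values start out as two parallel sequences, as the tags[i] indexing in the question suggests, zip can build such a list without losing duplicates. A sketch with made-up data (the values list and all sample contents are assumptions):

```python
# Hypothetical parallel sequences; only the name `tags` appears in the question.
tags = ['dog', 'cat', 'cat']
values = [1, 2, 3]

# zip pairs each tag with its value; list() keeps every pair, duplicates included.
pairs = list(zip(tags, values))
print(pairs)  # [('dog', 1), ('cat', 2), ('cat', 3)]
```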
Upvotes: 1
Reputation: 59436
I'd like to improve Paul Seeb's answer:
tps = [('cat',5),('dog',9),('cat',4),('parrot',6),('cat',6)]
result = {}
for k, v in tps:
    result[k] = result.get(k, 0) + v
Upvotes: 7
Reputation: 31
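This option works, although it uses a list; at the very least it may provide some insight:
def unique_values(query):  # wrapper name assumed; the original showed only the body
    data = []
    for i, j in query.items():
        data.append(int(j))
    try:
        data.sort()
    except TypeError:
        data = []  # the original used `del data`, which would break the loop below
    data_array = []
    for x in data:
        if x not in data_array:
            data_array.append(x)
    return data_array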
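If only the distinct values in sorted order are needed, a set does the deduplication without repeated membership tests on a list. A minimal sketch, assuming query is a dict whose values can be converted with int (the sample data is invented):

```python
# Hypothetical stand-in for the `query` mapping used above.
query = {'a': '5', 'b': 9, 'c': 5, 'd': '4'}

# set() drops repeated values; sorted() returns them as an ordered list.
data_array = sorted(set(int(v) for v in query.values()))
print(data_array)  # [4, 5, 9]
```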
Upvotes: 0