Reputation: 309
It seems like a simple task:
I am trying to merge 2 dictionaries without overwriting the values, APPENDING them instead.
a = {1: [(1,1)],2: [(2,2),(3,3)],3: [(4,4)]}
b = {3: [(5,5)], 4: [(6,6)]}
number of tuples a = 4, number of tuples b = 2
I have ruled out the following options because they overwrite values:
all = dict(a.items() + b.items())
all = dict(a, **b)
all = a.update([b])
The following solution works just fine, BUT it also appends values to my original dictionary a:
all = {}
for k in a.keys():
    if k in all:
        all[k].append(a[k])
    else:
        all[k] = a[k]
for k in b.keys():
    if k in all:
        all[k].append(b[k])
    else:
        all[k] = b[k]
Output =
a = {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4), [(5, 5)]]}
b = {3: [(5, 5)], 4: [(6, 6)]}
all = {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4), [(5, 5)]], 4: [(6, 6)]}
number of tuples a = 5 !!!!!, number of tuples b = 2 (correct), number of tuples all = 6 (correct)
It appended [(5,5)] from b to a. I have no idea why this happens, because all my code does is write everything into the combined dictionary "all".
Can anyone tell me where it is changing dict a?
Any help is greatly welcome.
Upvotes: 2
Views: 7546
Reputation: 13088
If you want a third dictionary that is the combined one, I would use collections.defaultdict:
from collections import defaultdict
from itertools import chain
all = defaultdict(list)
for k, v in chain(a.iteritems(), b.iteritems()):
    all[k].extend(v)
outputs
defaultdict(<type 'list'>, {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4), (5, 5)], 4: [(6, 6)]})
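The snippet above is Python 2 (it uses iteritems). A rough Python 3 sketch of the same idea follows, assuming every value is a list. The helper name merge_lists and the variable name merged are made up for illustration and are not part of the original answer.

```python
from collections import defaultdict
from itertools import chain


def merge_lists(*dicts):
    """Merge dicts of lists into a new dict without mutating the inputs.

    merge_lists is a hypothetical helper name used only for illustration.
    """
    merged = defaultdict(list)
    # chain.from_iterable walks the (key, list) pairs of every input dict in order
    for key, values in chain.from_iterable(d.items() for d in dicts):
        # extend() copies the elements into the fresh list created by defaultdict
        merged[key].extend(values)
    # Convert back to a plain dict so missing keys raise KeyError again
    return dict(merged)


a = {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4)]}
b = {3: [(5, 5)], 4: [(6, 6)]}
merged = merge_lists(a, b)
print(merged)
```

Because each list in the result is created by defaultdict rather than taken from a or b, the input dictionaries should be left untouched.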
Upvotes: 5
Reputation: 1123860
Use .extend instead of .append for merging lists together.
>>> example = [1, 2, 3]
>>> example.append([4, 5])
>>> example
[1, 2, 3, [4, 5]]
>>> example.extend([6, 7])
>>> example
[1, 2, 3, [4, 5], 6, 7]
Moreover, you can loop over the keys and values of both a and b together using itertools.chain:
from itertools import chain
all = {}
for k, v in chain(a.iteritems(), b.iteritems()):
    all.setdefault(k, []).extend(v)
.setdefault() looks up a key, and sets it to a default if it is not yet there. Alternatively you could use collections.defaultdict to do the same implicitly.
outputs:
>>> a
{1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4)]}
>>> b
{3: [(5,5)], 4: [(6,6)]}
>>> all
{1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4), (5, 5)], 4: [(6, 6)]}
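To see what .setdefault() does in isolation, here is a small sketch written for Python 3. The names merged, first and second are illustrative only.

```python
merged = {}

# Key 3 is missing: setdefault stores the new empty list and returns it
first = merged.setdefault(3, [])
first.extend([(4, 4)])

# Key 3 now exists: the default is ignored and the stored list is returned
second = merged.setdefault(3, [])
second.extend([(5, 5)])

print(merged)           # {3: [(4, 4), (5, 5)]}
print(first is second)  # True
```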
Note that because we now create a clean new list for each key first, then extend, your original lists in a are unaffected. In your code you do not create a copy of the list; instead you copied the reference to the list. In the end both the all and the a dict values point to the same lists, and using append on those lists results in the changes being visible in both places.
It's easy to demonstrate that with simple variables instead of a dict:
>>> foo = [1, 2, 3]
>>> bar = foo
>>> bar
[1, 2, 3]
>>> bar.append(4)
>>> foo, bar
([1, 2, 3, 4], [1, 2, 3, 4])
>>> id(foo), id(bar)
(4477098392, 4477098392)
Both foo and bar refer to the same list; the list was not copied. To create a copy instead, use the list() constructor or use the [:] slice operator:
>>> bar = foo[:]
>>> bar.append(5)
>>> foo, bar
([1, 2, 3, 4], [1, 2, 3, 4, 5])
>>> id(foo), id(bar)
(4477098392, 4477098536)
Now bar is a new copy of the list and changes are no longer visible in foo. The memory addresses (the result of the id() call) differ for the two lists.
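Applied to the dictionary from the question, the same distinction might look like the sketch below (Python 3). The names shared and copied are illustrative. Note that list() makes a shallow copy, which is enough here because the elements are immutable tuples.

```python
a = {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4)]}

# Plain assignment stores a reference: both dicts hold the same list object
shared = {}
shared[3] = a[3]
print(shared[3] is a[3])  # True

# list() (or a[3][:]) builds a new list holding the same tuples
copied = {}
copied[3] = list(a[3])
copied[3].append((5, 5))

print(a[3])       # [(4, 4)]
print(copied[3])  # [(4, 4), (5, 5)]
```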
Upvotes: 6
Reputation: 20339
As an explanation of why your a changes, consider your loop:
for k in a.keys():
    if k in all:
        all[k].append(a[k])
    else:
        all[k] = a[k]
So, if k is not yet in all, you enter the else part and now all[k] points to the a[k] list. It's not a copy, it's a reference to a[k]: they're the same object. Later, when the loop over b reaches a key that is already in all, you append to all[k]; but as all[k] points to a[k], you end up also appending to a[k].
You want to avoid all[k] = a[k]. You could try this instead:
for k in a.keys():
    if k not in all:
        all[k] = []
    all[k].extend(a[k])
(Note the extend instead of the append, as pointed out by @Martijn Pieters.) Here, you never have all[k] pointing to a[k], so you're safe. @Martijn Pieters' answer is far more concise and elegant, though, so you should go with it.
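For completeness, here is a sketch of that fix applied to both dictionaries, using the data from the question (Python 3). The result is called merged here instead of all, an illustrative choice that avoids shadowing the built-in all() function.

```python
a = {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4)]}
b = {3: [(5, 5)], 4: [(6, 6)]}

merged = {}
for source in (a, b):
    for k in source:
        if k not in merged:
            merged[k] = []  # fresh list, never a reference to source[k]
        merged[k].extend(source[k])

print(merged)
print(a)  # still the original contents
```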
Upvotes: 1