GGVan

Reputation: 43

Merging two dictionaries of lists with the same keys in Python

My problem:

I'm trying to merge two dictionaries of lists into a new dictionary, alternating the elements of the 2 original lists for each key to create the new list for that key.

So for example, if I have two dictionaries:

strings = {'S1' : ["string0", "string1", "string2"], 'S2' : ["string0", "string1"]}

Ns = {'S1' : ["N0", "N1"], 'S2' : ["N0"]}

I want to merge these two dictionaries so that the final dictionary will look like:

strings_and_Ns = {'S1': ["string0", "N0", "string1", "N1", "string2"], 'S2': ["string0", "N0", "string1"]}

or better yet, have the strings from the list joined together for every key, like:

strings_and_Ns = {'S1': ["string0N0string1N1string2"], 'S2': ["string0N0string1"]}

(I'm trying to connect together DNA sequence fragments.)

What I've tried so far:

Using zip

for S in Ns:
    newsequence = [zip(strings[S], Ns[S])]
    newsequence_joined = ''.join(str(newsequence))
    strings_and_Ns[S] = newsequence_joined

This does not join the sequences into a single string, and the order of the strings is still incorrect.

Using a defaultdict

from collections import defaultdict
strings_and_Ns = defaultdict(list)

for S in (strings, Ns):
    for key, value in S.iteritems():
        strings_and_Ns[key].append(value)

The order of the strings for this is also incorrect...

Somehow moving along the lists for each key...

for S in strings: 
    list = strings[S]
    L = len(list)
    for i in range(L):
        strings_and_Ns[S] = strings_and_Ns[S] + strings[S][i] + strings[S][i]

Upvotes: 4

Views: 2266

Answers (5)

jfs

Reputation: 414149

To alternate items from the iterables x and y, inserting default for missing values:

from itertools import izip_longest

def alternate(x, y, default):
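    # fillvalue must be passed by keyword; a positional argument would be
    # treated as a third iterable
    return (item for pair in izip_longest(x, y, fillvalue=default) for item in pair)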

Example

a = {'S1' : ["string0", "string1", "string2"], 'S2' : ["string0", "string1"]}
b = {'S1' : ["N0", "N1"], 'S2' : ["N0"]}
assert a.keys() == b.keys()
merged = {k: ''.join(alternate(a[k], b[k], '')) for k in a}
print(merged)

Output

{'S2': 'string0N0string1', 'S1': 'string0N0string1N1string2'}
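On Python 3, izip_longest is named zip_longest, and dict.keys() returns set-like views rather than lists. A minimal sketch of the same approach under that assumption, using the example data from the question:

```python
from itertools import zip_longest  # Python 3 name for izip_longest


def alternate(x, y, default):
    """Yield items from x and y in turn, padding the shorter one with default."""
    return (item for pair in zip_longest(x, y, fillvalue=default) for item in pair)


a = {'S1': ["string0", "string1", "string2"], 'S2': ["string0", "string1"]}
b = {'S1': ["N0", "N1"], 'S2': ["N0"]}
assert a.keys() == b.keys()  # key views compare like sets on Python 3
merged = {k: ''.join(alternate(a[k], b[k], '')) for k in a}
print(merged)
```

The set-like comparison of keys does not depend on the order in which the two dictionaries were built.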

Upvotes: 2

Peter Gibson

Reputation: 19544

Similar to the other solutions posted, but I would move part of it into a function:

import itertools   

def alternate(*iters, **kwargs):
    return itertools.chain(*itertools.izip_longest(*iters, **kwargs))

result = {k: ''.join(alternate(strings[k], Ns[k] + [''])) for k in Ns}
print(result)

Gives:

{'S2': 'string0N0string1', 'S1': 'string0N0string1N1string2'}

The alternate function is from https://stackoverflow.com/a/2017923/66349. It takes iterables as arguments and chains together items from each one successively (using izip_longest as Padraic Cunningham did).

You can either specify fillvalue='' to handle the different length lists, or just manually pad out the shorter list as I have done above (which assumes Ns will always be one shorter than strings).
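The fillvalue variant mentioned above could look like the following. This is a sketch written for Python 3 (where izip_longest is called zip_longest), with the question's data repeated so that it runs on its own:

```python
import itertools


def alternate(*iters, **kwargs):
    # Interleave the iterables; keyword arguments such as fillvalue are
    # forwarded to zip_longest unchanged.
    return itertools.chain(*itertools.zip_longest(*iters, **kwargs))


strings = {'S1': ["string0", "string1", "string2"], 'S2': ["string0", "string1"]}
Ns = {'S1': ["N0", "N1"], 'S2': ["N0"]}

# fillvalue='' pads whichever list is shorter, so no manual "+ ['']" is needed.
result = {k: ''.join(alternate(strings[k], Ns[k], fillvalue='')) for k in Ns}
print(result)
```

With fillvalue='' the code no longer assumes that Ns is exactly one element shorter than strings.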

If you have an older Python version that doesn't support dict comprehensions (before 2.7), you could use this instead:

result = dict((k, ''.join(alternate(strings[k], Ns[k] + ['']))) for k in Ns)

Upvotes: 1

Padraic Cunningham

Reputation: 180391

itertools.izip_longest takes care of the uneven-length lists; then use str.join to join everything into a single string.

strings = {'S1' : ["string0", "string1", "string2"], 'S2' : ["string0", "string1"]}

Ns = {'S1' : ["N0", "N1"], 'S2' : ["N0"]}

from itertools import izip_longest as iz

strings_and_Ns = {k:["".join([a+b for a, b in iz(strings[k],v,fillvalue="")])] for k,v in Ns.items()}

print(strings_and_Ns)
{'S2': ['string0N0string1'], 'S1': ['string0N0string1N1string2']}

Which is the same as:

strings_and_Ns  = {}
for k, v in Ns.items():
     strings_and_Ns[k] = ["".join([a + b for a, b in iz(strings[k], v, fillvalue="")])]

Using izip_longest means the code will work no matter which dict's values contain more elements.
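To illustrate that claim, here is a small sketch with the lengths reversed. The data is hypothetical (more Ns than strings), and the code uses the Python 3 name zip_longest:

```python
from itertools import zip_longest

# Hypothetical input: the Ns list is longer than the strings list.
strings = {'S1': ["string0"]}
Ns = {'S1': ["N0", "N1", "N2"]}

strings_and_Ns = {
    k: ["".join(a + b for a, b in zip_longest(strings[k], v, fillvalue=""))]
    for k, v in Ns.items()
}
print(strings_and_Ns)
```

The missing strings are filled with "", so the surplus Ns entries are simply appended in order.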

Upvotes: 1

John Zwinck

Reputation: 249123

strings_and_Ns = {}
for k,v in strings.items():
    pairs = zip(v, Ns[k] + ['']) # add empty to avoid need for zip_longest()
    flat = (item for sub in pairs for item in sub)
    strings_and_Ns[k] = ''.join(flat)

flat is built according to the accepted answer here: Making a flat list out of list of lists in Python
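Because zip stops at the shorter input, the padding trick above keeps every fragment only when each Ns list is the same length as, or exactly one element shorter than, the matching strings list. A minimal sketch of a guarded version for Python 3; the helper name join_fragments and the length check are illustrative additions, not part of the answer:

```python
def join_fragments(strings, Ns):
    """Interleave strings[k] with Ns[k] and join the result, for every key."""
    merged = {}
    for k, v in strings.items():
        gaps = Ns[k]
        # zip() silently truncates to the shorter input, so verify the
        # length assumption before relying on the one-element padding.
        if len(v) - len(gaps) not in (0, 1):
            raise ValueError("Ns[%r] must be as long as strings[%r] or one shorter" % (k, k))
        pairs = zip(v, gaps + [''])  # pad so the last string is kept
        merged[k] = ''.join(item for pair in pairs for item in pair)
    return merged


strings = {'S1': ["string0", "string1", "string2"], 'S2': ["string0", "string1"]}
Ns = {'S1': ["N0", "N1"], 'S2': ["N0"]}
print(join_fragments(strings, Ns))
```

Without the check, a strings list that is two or more elements longer than its Ns list would lose its trailing fragments without any error.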

Upvotes: 3

wenzul

Reputation: 4048

You could do it with itertools or with the list slicing described here. The itertools version is quite compact.

import itertools

strings_and_Ns = {}
for skey, sval in strings.iteritems():
    # one iterator per list; cycle() switches between them on every step
    iters = [iter(sval), iter(Ns[skey])]
    strings_and_Ns[skey] = ["".join(it.next() for it in itertools.cycle(iters))]

You have to take care with the relative lengths of your lists: as soon as one iterator raises StopIteration, the merging ends for that key.
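On Python 3.7 and later, a StopIteration that escapes inside a generator expression is converted to a RuntimeError (PEP 479), so the one-liner above relies on Python 2 behaviour. A sketch of the same round-robin idea with an explicit loop, assuming Python 3; the function name round_robin_join is illustrative:

```python
import itertools


def round_robin_join(first, second):
    """Alternate items from first and second; stop when either one runs out."""
    parts = []
    for it in itertools.cycle([iter(first), iter(second)]):
        try:
            parts.append(next(it))
        except StopIteration:
            break  # the iterator whose turn it is has no items left
    return "".join(parts)


strings = {'S1': ["string0", "string1", "string2"], 'S2': ["string0", "string1"]}
Ns = {'S1': ["N0", "N1"], 'S2': ["N0"]}
strings_and_Ns = {k: [round_robin_join(v, Ns[k])] for k, v in strings.items()}
print(strings_and_Ns)
```

As with the original, merging stops at the first exhausted iterator, so any items remaining in the other list are dropped.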

Upvotes: 2
