Swapping the inner and outer keys of a nested dictionary

Question

I have the following dictionary

{'a':{'se':3, 'op':2}, 'b':{'se':4, 'op':3}}

I need to convert it as follows:

{'se':{'a':3, 'b': 4}, 'op':{'a':2,'b':3}}

This is the following code I could come up with:

from collections import defaultdict

a = {'a':{'se':3, 'op':2}, 'b':{'se':4, 'op':3}}
b = defaultdict(dict)
for key1, value1 in a.items():
    for key2, value2 in value1.items():
        b[key2].update({key1: value2})

The following gets the job done but I am fond of one-liners. Is there a one-liner to the above or even a better way (better performance such as to eliminate two loops)?

abarnert · Accepted Answer

Turning this into a one-liner is doable in multiple ways, but all of them are going to be ugly. For example:

# Gather k2-k1-v2
g = ((k2,k1,v2) for k1,v1 in a.items() for k2,v2 in v1.items())
# Sort by k2
sg = sorted(g)
# Group by k2
gsg = itertools.groupby(sg, key=operator.itemgetter(0))
# Turn it into a dict of dicts
b = {k: {ksk[1]: ksk[2] for ksk in group} for (k, group) in gsg}

Putting it all together:

b = {k: {ksk[1]: ksk[2] for ksk in group} 
     for (k, group) in itertools.groupby(
         sorted(((k2,k1,v2) for k1,v1 in a.items() for k2,v2 in v1.items())),
         key=operator.itemgetter(0))}

Well, it's one expression, and you can put it all on one line if you don't know how many columns it is. But it's certainly not as readable as your original version.

And as for performance? It takes about twice as long from a quick test. Coldspeed's version is somewhere in between. Changing one list to an iterator makes it slightly slower on small dicts like your original example, but much faster on big ones, but either way, it didn't beat the original in any tests, and got slower than the itertools version with very big values. But of course if performance actually matters, you should measure them on your real data.

And if you think about it, there can't be any way to eliminate the nested loop (except by replacing one of the loops with something equivalent, like recursion, or unrolling it based on the fact that your example happens to have exactly two items in each inner dict—which presumably isn't true for your real problem). After all, you have to visit each key of each inner dictionary, which you can't do without a loop over the outer dictionary. It is possible to turn those loops into comprehension loops instead of statements, or C loops inside map or list (or maybe inside some Pandas function?). Both my version and Coldspeed's put the nested loop in a comprehension, and at least one of the additional linear loops (which don't add to the algorithmic complexity, but probably do add significantly to the real-life time for small collections like your example) are buried inside a builtin function. But speeding up the looping while doing more overall work isn't always a worthwhile tradeoff.

Swapping the inner and outer keys of a nested dictionary

Answers (2)

Related Questions