Merging part of tuples in Python

Question

I have several hundred tuples in the following format: (id1, id2, id3, [xydata]). For example:

('a', 'b', 'c', [(1, 2),(2, 3),(3, 4)])
('a', 'b', 'c', [(1, 1),(2, 4),(3, 6)])
('a', 'b', 'd', [(1, 3),(2, 6),(3, 7)])
('a', 'b', 'd', [(1, 7),(2, 8),(3, 9)])

Now I want to merge the tuples that start with the same three values, averaging the Y values for each X as shown below. I am guaranteed that the same X values appear in every xydata list:

('a', 'b', 'c', [(1, mean(2, 1)),(2, mean(3, 4)),(3, mean(4, 6))])
('a', 'b', 'd', [(1, mean(3, 7)),(2, mean(6, 8)),(3, mean(7, 9))])

My current solution takes several steps to reorder and break out the data, storing the tuples in a multilayer dictionary before combining them and rebuilding the original data structure. Is there a neat, Pythonic way to do this instead?

ndpu · Accepted Answer

You can merge them with a nested defaultdict, keyed first by the three ids and then by the X value:

>>> l = [('a', 'b', 'c', [(1, 2),(2, 3),(3, 4)]),
...      ('a', 'b', 'c', [(1, 1),(2, 4),(3, 6)]),
...      ('a', 'b', 'd', [(1, 3),(2, 6),(3, 7)]),
...      ('a', 'b', 'd', [(1, 7),(2, 8),(3, 9)])]

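>>> from collections import defaultdict
>>> # (k1, k2, k3) -> {x: [y values collected from every matching tuple]}
>>> d = defaultdict(lambda: defaultdict(list))
>>> for k1, k2, k3, lst in l:
...     for x, y in lst:
...         d[(k1, k2, k3)][x].append(y)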

Result:

>>> d
defaultdict(<function <lambda> at 0x8e33e9c>,
{('a', 'b', 'c'): defaultdict(<type 'list'>, {1: [2, 1], 2: [3, 4], 3: [4, 6]}),
 ('a', 'b', 'd'): defaultdict(<type 'list'>, {1: [3, 7], 2: [6, 8], 3: [7, 9]})})

If you need it as a list:

>>> [(k, list(v.items())) for k, v in d.items()]
[(('a', 'b', 'c'), [(1, [2, 1]), (2, [3, 4]), (3, [4, 6])]),
 (('a', 'b', 'd'), [(1, [3, 7]), (2, [6, 8]), (3, [7, 9])])]

With the mean calculated:

>>> [(k, [(n, sum(t)/float(len(t))) for n,t in v.items()]) for k,v in d.items()]
[(('a', 'b', 'c'), [(1, 1.5), (2, 3.5), (3, 5.0)]),
 (('a', 'b', 'd'), [(1, 5.0), (2, 7.0), (3, 8.0)])]
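
If you are on Python 3 and want the result back in the original (id1, id2, id3, [xydata]) layout, the same grouping idea can be wrapped in a function. This is only a sketch: merge_rows is a name I made up for illustration, and statistics.mean stands in for the sum/len expression above. It assumes Python 3.7+ so that dicts keep insertion order.

```python
from collections import defaultdict
from statistics import mean


def merge_rows(rows):
    """Average the y values of rows that share the same first three ids.

    merge_rows is a hypothetical helper name, not part of any library.
    """
    # (id1, id2, id3) -> {x: [y values from every row with that key]}
    grouped = defaultdict(lambda: defaultdict(list))
    for id1, id2, id3, xydata in rows:
        for x, y in xydata:
            grouped[(id1, id2, id3)][x].append(y)

    # Rebuild the original 4-tuple layout: the key tuple plus one list of
    # (x, mean_y) pairs, in first-seen order.
    return [
        key + ([(x, mean(ys)) for x, ys in by_x.items()],)
        for key, by_x in grouped.items()
    ]


rows = [
    ('a', 'b', 'c', [(1, 2), (2, 3), (3, 4)]),
    ('a', 'b', 'c', [(1, 1), (2, 4), (3, 6)]),
    ('a', 'b', 'd', [(1, 3), (2, 6), (3, 7)]),
    ('a', 'b', 'd', [(1, 7), (2, 8), (3, 9)]),
]
merged = merge_rows(rows)
```

Here merged holds one 4-tuple per distinct (id1, id2, id3) key, with the same means as in the output above. Note that statistics.mean may return an int when the mean of integer inputs is a whole number, so wrap it in float() if you need floats throughout.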
