Mrv

Reputation: 41

Combining lists of tuples based on a common tuple element

Consider two lists of tuples:

data1 = [([X1], 'a'), ([X2], 'b'), ([X3], 'c')]
data2 = [([Y1], 'a'), ([Y2], 'b'), ([Y3], 'c')]

where len(data1) == len(data2).

Each tuple contains two elements:

  1. A list of some strings (e.g. [X1])
  2. An element common to data1 and data2: the strings 'a', 'b', and so on.

I would like to combine them into the following:

[('a', [X1], [Y1]), ('b', [X2], [Y2]),...]

Does anyone know how I can do this?

Upvotes: 2

Views: 946

Answers (3)

Dimitris Fasarakis Hilliard

Reputation: 160557

@Kasramvd's solution works if the elements appear in the same order in both data lists. If they do not, it pairs up tuples by position even when their common elements differ.

A solution that handles differing order uses a defaultdict:

from collections import defaultdict

d = defaultdict(list)  # values are initialized to empty list

data1 = [("s1", 'a'), ("s2", 'c'), ("s3", 'b')]
data2 = [("s1", 'c'), ("s2", 'b'), ("s3", 'a')]

for value, common in data1 + data2:
    d[common].append(value)

To get a list of (key, values) pairs, wrap d.items() in a list() call:

res = list(d.items())
print(res)
# Prints (Python 3.7+, where dicts keep insertion order):
# [('a', ['s1', 's3']), ('c', ['s2', 's1']), ('b', ['s3', 's2'])]
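
To get the flat shape asked for in the question, (key, value-from-data1, value-from-data2), each grouped entry can be unpacked into a tuple. This is only a minimal sketch, assuming Python 3.7+ (so dicts keep insertion order); the name combined is illustrative and not part of the answer above:

```python
from collections import defaultdict

data1 = [("s1", 'a'), ("s2", 'c'), ("s3", 'b')]
data2 = [("s1", 'c'), ("s2", 'b'), ("s3", 'a')]

d = defaultdict(list)
for value, common in data1 + data2:
    d[common].append(value)

# Flatten each (key, [v1, v2]) pair into a (key, v1, v2) tuple.
combined = [(key, *values) for key, values in d.items()]
print(combined)
# [('a', 's1', 's3'), ('c', 's2', 's1'), ('b', 's3', 's2')]
```

Because data1 is iterated before data2, the value from data1 always comes first within each tuple. A key that occurs in only one list yields a shorter tuple.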

Upvotes: 5

Matthew

Reputation: 7590

We can do this in a single comprehension expression, using the reduce function:

from functools import reduce
from operator import add
[tuple([x]+reduce(add,([y[0]] for y in data1+data2 if y[1]==x))) for x in set(y[1] for y in data1+data2)]

If the lists are large, note that data1+data2 is rebuilt for every key, which can impose a severe time or memory penalty. In that case it is better to pre-compute it:

combdata = data1+data2
[tuple([x]+reduce(add,([y[0]] for y in combdata if y[1]==x))) for x in set(y[1] for y in combdata)]

This solution does not rely on all "keys" occurring in both lists, or the order being the same.
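
As a quick check of that claim, here is a small sketch run on made-up sample data in which 'a' and 'c' each occur in only one list. The data values and the final sort by key are assumptions added for illustration, since set iteration order is arbitrary:

```python
from functools import reduce
from operator import add

data1 = [(["X1"], 'a'), (["X2"], 'b')]
data2 = [(["Y2"], 'b'), (["Y3"], 'c')]  # 'a' is missing here, 'c' is missing above

combdata = data1 + data2
result = [tuple([x] + reduce(add, ([y[0]] for y in combdata if y[1] == x)))
          for x in set(y[1] for y in combdata)]

# Sort by key only so the printed order is deterministic.
print(sorted(result, key=lambda t: t[0]))
# [('a', ['X1']), ('b', ['X2'], ['Y2']), ('c', ['Y3'])]
```

Keys found in only one list produce two-element tuples, so downstream code should not assume every tuple has three elements.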

If returned order is important, we can even do

sorted([tuple([x]+reduce(add,([y[0]] for y in data1+data2 if y[1]==x))) for x in set(y[1] for y in data1+data2)],key = lambda x,y=[x[1] for x in data1+data2]: y.index(x[0]))

to ensure that the keys appear in the order in which they first occur in the original lists. The sort key looks up each result's common element (x[0]) in the list of common elements, so it still works when the same value occurs under several keys. Again, pre-computing data1+data2 gives:

sorted([tuple([x]+reduce(add,([y[0]] for y in combdata if y[1]==x))) for x in set(y[1] for y in combdata)],key = lambda x,y=[x[1] for x in combdata]: y.index(x[0]))

Upvotes: 1

Kasravnd

Reputation: 107337

You can use the zip function and a list comprehension:

[(s1,l1,l2) for (l1,s1),(l2,s2) in zip(data1,data2)]
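
For reference, a runnable sketch of the same zip idea on string stand-ins for the question's placeholders. The helper name combine_aligned and the key check are additions for illustration, not part of the answer above; the check simply raises if the two lists are not in the same key order:

```python
data1 = [(["X1"], 'a'), (["X2"], 'b'), (["X3"], 'c')]
data2 = [(["Y1"], 'a'), (["Y2"], 'b'), (["Y3"], 'c')]

def combine_aligned(first, second):
    """Pair entries by position; raise if their common elements disagree."""
    result = []
    for (l1, s1), (l2, s2) in zip(first, second):
        if s1 != s2:
            raise ValueError(f"key mismatch: {s1!r} != {s2!r}")
        result.append((s1, l1, l2))
    return result

print(combine_aligned(data1, data2))
# [('a', ['X1'], ['Y1']), ('b', ['X2'], ['Y2']), ('c', ['X3'], ['Y3'])]
```

Note that zip stops at the shorter input, so lists of different lengths are silently truncated.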

Upvotes: 8
