Eric Zajac

Reputation: 65

Group Tuples Into List

In Python, what is the best approach to group tuples that share a common first element?

(2, 3, 'z')
(1, 1, 'abc')
(2, 1, 'stu')
(1, 2, 'def')
(2, 2, 'vxy')

Result would be:

[(1, 1, 'abc'), (1, 2, 'def')]
[(2, 1, 'stu'), (2, 2, 'vxy'), (2, 3, 'z')]

The goal is to concatenate the 3rd element into a single string object.

Here is the concatenation part, but I am not sure about the grouping.

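def sort_tuples(list_input):
    # Sort the tuples, then concatenate the third element of each
    new = sorted(list_input)
    result = ''  # renamed from str to avoid shadowing the built-in
    for item in new:
        result = result + item[2]
    return result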

Upvotes: 1

Views: 781

Answers (1)

Martijn Pieters

Reputation: 1121486

Use a dictionary to group; pick your grouping element and append what you want to concatenate to a list per key:

groups = {}
for first, second, third in list_input:
    groups.setdefault(first, []).append(third)

Then you can just concatenate each list:

for key, group in groups.items():
    print(key, ''.join(group))

Since you only wanted to concatenate the third element of each tuple, I didn't bother with including the second element in the dictionary, but you are free to store the whole tuple in the group lists too.

Demo:

>>> list_input = [
...     (2, 3, 'z'),
...     (1, 1, 'abc'),
...     (2, 1, 'stu'),
...     (1, 2, 'def'),
...     (2, 2, 'vxy'),
... ]
>>> groups = {}
>>> for first, second, third in list_input:
...     groups.setdefault(first, []).append(third)
... 
>>> for key, group in groups.items():
...     print(key, ''.join(group))
... 
1 abcdef
2 zstuvxy
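
As mentioned above, the group lists can hold whole tuples instead of only the third element. A minimal sketch of that variant, assuming the same list_input as in the demo; it produces per-key lists of tuples, in the order the tuples were encountered (not sorted):

```python
# Sketch: store the whole tuple per group rather than just the third element.
list_input = [
    (2, 3, 'z'),
    (1, 1, 'abc'),
    (2, 1, 'stu'),
    (1, 2, 'def'),
    (2, 2, 'vxy'),
]

groups = {}
for item in list_input:
    # item[0] is the grouping element; the full tuple is appended to its list
    groups.setdefault(item[0], []).append(item)

for key, group in groups.items():
    print(key, group)
```

On Python 3.7 and later, dictionaries preserve insertion order, so the keys come out in the order they were first seen in the input.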

If the second element is meant to be used as a sorting key, then you'll have to include it when grouping; you can then sort each group and extract the third element:

groups = {}
for first, second, third in list_input:
    groups.setdefault(first, []).append((second, third))

for key, group in groups.items():
    print(key, ''.join([third for second, third in sorted(group)]))

Demo:

>>> groups = {}
>>> for first, second, third in list_input:
...     groups.setdefault(first, []).append((second, third))
... 
>>> for key, group in groups.items():
...     print(key, ''.join([third for second, third in sorted(group)]))
... 
1 abcdef
2 stuvxyz

Since this involves sorting, you may as well sort the whole input list once, and use itertools.groupby() to group your input after sorting:

from itertools import groupby

for key, group in groupby(sorted(list_input), key=lambda t: t[0]):
    print(key, ''.join([third for first, second, third in group]))

Once more, a demo of this approach:

>>> from itertools import groupby
>>> for key, group in groupby(sorted(list_input), key=lambda t: t[0]):
...     print(key, ''.join([third for first, second, third in group]))
... 
1 abcdef
2 stuvxyz
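
Note that groupby() only merges consecutive items that share a key, which is why the input is sorted first. A short sketch of the difference, again assuming the list_input from the demos (the variable names unsorted_runs and sorted_runs are just for illustration):

```python
from itertools import groupby

list_input = [
    (2, 3, 'z'),
    (1, 1, 'abc'),
    (2, 1, 'stu'),
    (1, 2, 'def'),
    (2, 2, 'vxy'),
]

def first(t):
    # Grouping key: the first element of each tuple
    return t[0]

# Without sorting, every change of key starts a new group,
# so the same key shows up several times.
unsorted_runs = [
    (key, ''.join(t[2] for t in group))
    for key, group in groupby(list_input, key=first)
]

# With sorting, equal keys are adjacent and form one group each.
sorted_runs = [
    (key, ''.join(t[2] for t in group))
    for key, group in groupby(sorted(list_input), key=first)
]

print(unsorted_runs)
print(sorted_runs)
```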

The dictionary grouping approach is an O(N) algorithm; as soon as you add sorting, it becomes an O(N log N) algorithm.
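
To put the two side by side, here is a rough sketch that wraps the unsorted dictionary approach and the sorted groupby() approach in functions. The function names concat_unsorted and concat_sorted are made up for this example:

```python
from itertools import groupby

def concat_unsorted(list_input):
    # O(N): one pass, strings are joined in input order within each group
    groups = {}
    for first, second, third in list_input:
        groups.setdefault(first, []).append(third)
    return {key: ''.join(group) for key, group in groups.items()}

def concat_sorted(list_input):
    # O(N log N): sorting dominates; groups are ordered by the second element
    return {
        key: ''.join(third for first, second, third in group)
        for key, group in groupby(sorted(list_input), key=lambda t: t[0])
    }

list_input = [
    (2, 3, 'z'),
    (1, 1, 'abc'),
    (2, 1, 'stu'),
    (1, 2, 'def'),
    (2, 2, 'vxy'),
]

print(concat_unsorted(list_input))
print(concat_sorted(list_input))
```

If the order within each group doesn't matter, the unsorted version avoids the sorting cost entirely.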

Upvotes: 1
