Reputation: 69
I have a nested list that looks something like this:
first_list = [[a, 1], [b, 3], [a, 6], [a, 2], [b, 4], [b, 5], ...]
I want to group these by their first element, and create a new nested list that looks like:
new_list = [ [1, 6, 2, ...], [3, 4, 5, ...], ...]
where all of the elements that started with a go in the first sublist and so on. The number of different values a, b, etc. are not known before runtime, or I could do something like:
a_list = []
b_list = []
for tag, x in first_list:
if tag == a:
a_list.append(x)
elif tag == b:
b_list.append(x)
new_list = [a_list, b_list]
I am struggling to adapt this for an arbitrary number of tags, however.
I probably omitted an important part of the question, but I should say that I already have a list of "tags", i.e:
tags = [a, b, c, d, ...]
They are not actually characters, hence the lack of inverted commas, but they should be hashable in any case.
Upvotes: 1
Views: 1820
Reputation: 164693
With Python, and programming in general, you should avoid creating a variable number of variables.
You can use a defaultdict
of list
objects. This extends naturally to an arbitrary number of groups without having to name variables explicitly.
first_list = [['a', 1], ['b', 3], ['a', 6], ['a', 2], ['b', 4], ['b', 5]]
from collections import defaultdict
dd = defaultdict(list)
for cat, num in first_list:
dd[cat].append(num)
defaultdict(list, {'a': [1, 6, 2],
'b': [3, 4, 5]})
The defaultdict
solution has O(n) complexity, but an aptly named itertools.groupby
solution is possible which requires sorting and O(n log n) complexity:
from itertools import groupby
from operator import itemgetter
sorter = sorted(first_list, key=itemgetter(0))
grouper = groupby(sorter, key=itemgetter(0))
res = {i: list(map(itemgetter(1), j)) for i, j in grouper}
{'a': [1, 6, 2], 'b': [3, 4, 5]}
This is as trivial as calling list
on dict.values
:
res_list = list(res.values())
Upvotes: 4
Reputation: 967
Okay, there's a built-in methods for this in python , but in abstract algorithmic way , we could say:
first_list = [["a", 1], ["b", 3], ["a", 6], ["a", 2], ["b", 4], ["b", 5],["c",5]]
indx_list = [x[0] for x in first_list]
new_list = [[] for x in range(len(first_list))]
for x in first_list:
new_list[indx_list.index(x[0])] += [x[-1]]
print(new_list)
Upvotes: 0
Reputation: 7980
This is a great opportunity to use the itertools
library and a list-comprehension:
first_list = [['a', 1], ['b', 3], ['a', 6], ['a', 2], ['b', 4], ['b', 5], ...]
keyfunc = lambda x: x[0]
new_list = [[v1[1] for v1 in v] for k, v in itertools.groupby(sorted(first_list, key = keyfunc), key = keyfunc)]
What I'm doing here is grouping the list by the first value in the sublist and pulling the second value off. Note that the list needs to be sorted beforehand so this will run in O(n log n)
time.
Upvotes: 0
Reputation: 467
A reduce would work for any numbers of tags.
first_list = [['a', 1], ['b', 3], ['a', 6], ['a', 2], ['b', 4], ['b', 5]]
def lambda_group(acc, val):
tag, x = val
if key not in acc:
acc[key] = []
acc[key].append(value)
return acc
grouped_vals = reduce(lambda_group, first_list, {})
regrouped = list(grouped_vals.values())
Produces [[1, 6, 2], [3, 4, 5]]
Upvotes: 0
Reputation: 9018
First of all, your a and b should probably be strings.
You can do this using list comprehensions:
first_list = [["a", 1], ["b", 3], ["a", 6], ["a", 2], ["b", 4], ["b", 5]]
a_list = [x for x in first_list if x[0] == "a"]
b_list = [x for x in first_list if x[0] == "b"]
new_list = [a_list, b_list]
Upvotes: 1