Reputation: 391
I have a list of lists in Python and I want to group similar lists together. That is, if the first three elements of several lists are the same, those lists should go in one group. For example:
[["a", "b", "c", 1, 2],
["d", "f", "g", 8, 9],
["a", "b", "c", 3, 4],
["d","f", "g", 3, 4],
["a", "b", "c", 5, 6]]
I want the result to look like this:
[[["a", "b", "c", 1, 2],
["a", "b", "c", 5, 6],
["a", "b", "c", 3, 4]],
[["d","f", "g", 3, 4],
["d", "f", "g", 8, 9]]]
I could do this by iterating over the list, manually comparing the elements of each pair of consecutive lists, and deciding whether to group them based on how many elements match. But I was wondering if there is another, more Pythonic way to do this.
Upvotes: 2
Views: 3373
Reputation: 107287
You can use itertools.groupby:
>>> A=[["a", "b", "c", 1, 2],
... ["d", "f", "g", 8, 9],
... ["a", "b", "c", 3, 4],
... ["d","f", "g", 3, 4],
... ["a", "b", "c", 5, 6]]
>>> from itertools import groupby
>>> from operator import itemgetter
>>> [list(g) for _,g in groupby(sorted(A),itemgetter(0,1,2))]
[[['a', 'b', 'c', 1, 2], ['a', 'b', 'c', 3, 4], ['a', 'b', 'c', 5, 6]], [['d', 'f', 'g', 3, 4], ['d', 'f', 'g', 8, 9]]]
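One caveat with the snippet above: groupby only merges adjacent items, so the input has to be sorted by the grouping key first. Sorting the whole sublists also compares the trailing elements, which can fail if those mix types. A minimal sketch, assuming you would rather sort on the first three elements only (the names rows, key and grouped are just illustrative):

```python
from itertools import groupby
from operator import itemgetter

rows = [
    ["a", "b", "c", 1, 2],
    ["d", "f", "g", 8, 9],
    ["a", "b", "c", 3, 4],
    ["d", "f", "g", 3, 4],
    ["a", "b", "c", 5, 6],
]

# Use the same key for sorting and grouping: the first three elements.
key = itemgetter(0, 1, 2)

# sorted() is stable, so rows inside each group keep their original order.
grouped = [list(g) for _, g in groupby(sorted(rows, key=key), key=key)]
print(grouped)
```

Because the sort is stable, the ("d", "f", "g") group keeps [8, 9] before [3, 4], unlike a plain sorted(A).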
Upvotes: 6
Reputation: 180391
You don't need to sort, you can group in a dict using a tuple of the first three elements from each list as the key:
from collections import OrderedDict
l=[
["a", "b", "c", 1, 2],
["d", "f", "g", 8, 9],
["a", "b", "c", 3, 4],
["d","f", "g", 3, 4],
["a", "b", "c", 5, 6]
]
od = OrderedDict()
for sub in l:
    k = tuple(sub[:3])  # first three elements as a hashable key
    od.setdefault(k,[]).append(sub)
from pprint import pprint as pp
pp(list(od.values()))  # list() so Python 3 prints a plain list
[[['a', 'b', 'c', 1, 2], ['a', 'b', 'c', 3, 4], ['a', 'b', 'c', 5, 6]],
[['d', 'f', 'g', 8, 9], ['d', 'f', 'g', 3, 4]]]
This is O(n), as opposed to O(n log n) for an approach that sorts first.
If you don't care about order, use a defaultdict:
from collections import defaultdict
od = defaultdict(list)
for sub in l:
    a, b, c, *_ = sub # python3
    k = a,b,c
    od[k].append(sub)
from pprint import pprint as pp
pp(list(od.values()))
[[['a', 'b', 'c', 1, 2], ['a', 'b', 'c', 3, 4], ['a', 'b', 'c', 5, 6]],
[['d', 'f', 'g', 8, 9], ['d', 'f', 'g', 3, 4]]]
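The dict-based grouping can also be wrapped in a small reusable function. This is only a sketch: group_by_prefix and its n parameter are hypothetical names, and it assumes Python 3.7 or later, where a plain dict preserves insertion order.

```python
def group_by_prefix(rows, n=3):
    """Group rows whose first n elements are equal, in first-seen order."""
    groups = {}
    for row in rows:
        # A tuple of the first n elements is hashable, so it can be a dict key.
        groups.setdefault(tuple(row[:n]), []).append(row)
    return list(groups.values())


rows = [
    ["a", "b", "c", 1, 2],
    ["d", "f", "g", 8, 9],
    ["a", "b", "c", 3, 4],
    ["d", "f", "g", 3, 4],
    ["a", "b", "c", 5, 6],
]
print(group_by_prefix(rows))
```

Passing a different n groups on a shorter or longer prefix, for example group_by_prefix(rows, n=1) groups on the first element only.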
Upvotes: 4