Reputation: 22520
I am testing itertools.groupby() and trying to get the groups as lists, but I can't figure out how to make it work.
I'm using the example from How do I use Python's itertools.groupby()?:
from itertools import groupby
things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
("vehicle", "speed boat"), ("vehicle", "school bus")]
I tried (python 3.5):
g = groupby(things, lambda x: x[0])
ll = list(g)
list(tuple(ll[0])[1])
I thought I should get the first group ("animal") as a list ['bear', 'duck'], but I just get an empty list in the REPL.
What am I doing wrong?
How should I extract all three groups as lists?
Upvotes: 3
Views: 1215
Reputation: 155507
If you just want the groups, without the keys, you need to realize the group generators as you go, per the docs:
Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list.
This means that when you try to listify the groupby generator first using ll = list(g), before converting the individual group generators, all but the last group generator will be invalid/empty.
(Note that list is just one option; a tuple or any other container works too.)
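A minimal sketch of that failure mode, reusing the question's things list. The names pairs, stale and fresh are only illustrative and do not appear in the question:

```python
from itertools import groupby
from operator import itemgetter

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# Wrong order: exhausting the groupby object first advances past every group,
# so the first group's iterator has nothing left to yield.
pairs = list(groupby(things, key=itemgetter(0)))
stale = list(pairs[0][1])

# Right order: materialize each group while it is still the current one.
fresh = [(key, list(group)) for key, group in groupby(things, key=itemgetter(0))]

print(stale)     # []
print(fresh[0])  # ('animal', [('animal', 'bear'), ('animal', 'duck')])
```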
So to do it properly, you'd make sure to listify each group generator before moving on to the next:
from operator import itemgetter # Nicer than ad-hoc lambdas
# Make the key, group generator
gen = groupby(things, key=itemgetter(0))
# Strip the keys; you only care about the group generators
# In Python 2, you'd use future_builtins.map, because a non-generator map would break
groups = map(itemgetter(1), gen)
# Convert them to list one by one before the next group is pulled
groups = map(list, groups)
# And listify the result (to actually run out the generator and get all your
# results, assuming you need them as a list)
groups = list(groups)
As a one-liner:
groups = list(map(list, map(itemgetter(1), groupby(things, key=itemgetter(0)))))
Or, because this many maps gets rather ugly/non-Pythonic, and list comprehensions let us do nifty stuff like unpacking to get named values, we can simplify to:
groups = [list(g) for k, g in groupby(things, key=itemgetter(0))]
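Each group above still holds the full (key, value) tuples. If the goal is the ['bear', 'duck'] shape the question expected, one possible variation is to unpack each tuple inside the group as well. This is a sketch that assumes every item is a 2-tuple; names is an illustrative variable:

```python
from itertools import groupby
from operator import itemgetter

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# Keep only the second element of each pair, consuming every group
# while it is still the current one.
names = [[name for _, name in group]
         for _, group in groupby(things, key=itemgetter(0))]

print(names)  # [['bear', 'duck'], ['cactus'], ['speed boat', 'school bus']]
```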
Upvotes: 2
Reputation: 10951
Quoting from the Python docs on groupby:
itertools.groupby(iterable, key=None)
Make an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If not specified or is None, key defaults to an identity function and returns the element unchanged. Generally, the iterable needs to already be sorted on the same key function.
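The last sentence of that quote matters in practice: groupby only merges consecutive items with equal keys. A small sketch, using a hypothetical shuffled list rather than the question's data, shows the difference sorting makes:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input where the "animal" items are not adjacent.
shuffled = [("animal", "bear"), ("plant", "cactus"), ("animal", "duck")]
key = itemgetter(0)

# Without sorting, "animal" shows up as two separate groups.
unsorted_keys = [k for k, _ in groupby(shuffled, key=key)]

# Sorting on the same key function first brings equal keys together.
sorted_keys = [k for k, _ in groupby(sorted(shuffled, key=key), key=key)]

print(unsorted_keys)  # ['animal', 'plant', 'animal']
print(sorted_keys)    # ['animal', 'plant']
```

The question's things list is already ordered by its first element, so the examples here work without sorting.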
>>> from itertools import groupby
>>>
>>> things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
...           ("vehicle", "speed boat"), ("vehicle", "school bus")]
>>>
>>> for _, g in groupby(things, lambda x: x[0]):
...     print(list(g))
...
[('animal', 'bear'), ('animal', 'duck')]
[('plant', 'cactus')]
[('vehicle', 'speed boat'), ('vehicle', 'school bus')]
>>>
>>> from operator import itemgetter
>>> l = [list(g) for _, g in groupby(things, itemgetter(0))]
>>> l
[[('animal', 'bear'), ('animal', 'duck')], [('plant', 'cactus')], [('vehicle', 'speed boat'), ('vehicle', 'school bus')]]
>>> from collections import defaultdict
>>>
>>> d = defaultdict(list)
>>>
>>> for k, v in groupby(things, itemgetter(0)):
...     for sub in v:
...         for item in sub:
...             if item != k:
...                 d[k].append(item)
...
>>> d
defaultdict(<class 'list'>, {'animal': ['bear', 'duck'], 'plant': ['cactus'], 'vehicle': ['speed boat', 'school bus']})
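A similar mapping can probably be built more directly with a dict comprehension. This is a sketch that assumes each item is a 2-tuple and that equal keys are adjacent in the input; by_kind is an illustrative name:

```python
from itertools import groupby
from operator import itemgetter

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# Map each key to the list of second elements in its group.
by_kind = {kind: [name for _, name in group]
           for kind, group in groupby(things, key=itemgetter(0))}

print(by_kind)
# {'animal': ['bear', 'duck'], 'plant': ['cactus'], 'vehicle': ['speed boat', 'school bus']}
```

Unlike the defaultdict loop, this version overwrites an earlier entry if the same key appears again in a later, non-adjacent run, so it relies on the input being sorted or already grouped.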
Upvotes: 0
Reputation: 11961
You could use a list comprehension as follows:
from itertools import groupby
things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
("vehicle", "speed boat"), ("vehicle", "school bus")]
g = groupby(things, lambda x: x[0])
answer = [list(group[1]) for group in g]
print(answer)
Output
[[('animal', 'bear'), ('animal', 'duck')],
[('plant', 'cactus')],
[('vehicle', 'speed boat'), ('vehicle', 'school bus')]]
Upvotes: 1