Ryan Stanfield

Reputation: 53

Grouping string into substrings

I need some explanation as to how this code works. I do not understand the need for 'str' and 'grp' within the for loop. What are they keeping track of?

from itertools import groupby
print(["".join(grp) for str, grp in groupby('aaacaccccccbbbb')])

Upvotes: 0

Views: 361

Answers (2)

Patrick Artner

Reputation: 51633

Do not use built-ins as variable names (str, int, set, dict, tuple, list, max, min, ...): doing so shadows the built-in for the rest of that scope.

If in doubt, disassemble the list comprehension into its parts and print each of them (see: How to debug small programs):

from itertools import groupby
grouping = groupby('aaacaccccccbbbb')

for stri, grp in grouping: 
    print(stri)              # key of the grouping
    print(list(grp))         # group (wrapped in list() to show its items instead of the group iterator)
    print("")

Output:

a
['a', 'a', 'a']

c
['c']

a
['a']

c
['c', 'c', 'c', 'c', 'c', 'c']

b
['b', 'b', 'b', 'b']

If you still have questions about it, read the API documentation or search SO: How do I use Python's itertools.groupby()?

Upvotes: 0

cs95

Reputation: 402263

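groupby groups consecutive elements of an iterable by some key. If no key function is specified, each element is its own key, so consecutive elements are grouped together when they are equal. To summarise: by default, groupby groups identical consecutive elements together. Note that it only looks at neighbours, which is why 'a' and 'c' each show up more than once for this string.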
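To illustrate the role of the key, here is a minimal sketch. The second input string ('aaBBcD') and all variable names are made up for illustration and are not from the question:

```python
from itertools import groupby

# No key function: runs of identical consecutive characters are grouped.
default_runs = ["".join(group) for _, group in groupby("aaacaccccccbbbb")]
print(default_runs)

# With a key function: consecutive characters that produce the same key
# value (here: whether the character is upper case) share one group.
sample = "aaBBcD"  # hypothetical input, chosen only for this example
case_runs = [
    (is_upper, "".join(group))
    for is_upper, group in groupby(sample, key=str.isupper)
]
print(case_runs)
```

In the second call the key is a boolean rather than the character itself, which shows that the first item of each pair is whatever the key function returns.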

If you exhaust the groupby object, you can see that it yields tuples:

list(groupby('aaacaccccccbbbb'))

[('a', <itertools._grouper at 0x12f132a58>),
 ('c', <itertools._grouper at 0x12f132d30>),
 ('a', <itertools._grouper at 0x12f132cf8>),
 ('c', <itertools._grouper at 0x12f1b9da0>),
 ('b', <itertools._grouper at 0x12f1a68d0>)]

Each tuple is a pair of <group_key, group_values_iterator>, which corresponds to str and grp in the list comprehension: str is the key (here, the character that the run consists of) and grp is an iterator over the elements in that group. The list comprehension exhausts each grp iterator and joins its characters into one string; the key is unpacked but never used.
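As a sketch of what the two loop variables hold, the comprehension from the question can be written with descriptive names (these names are my own, not from the question), and the key becomes useful once you want more than the joined substring, for example the length of each run:

```python
from itertools import groupby

text = "aaacaccccccbbbb"

# Same result as the comprehension in the question; the key is unpacked
# but unused, so it gets a throwaway name.
substrings = ["".join(group) for _key, group in groupby(text)]
print(substrings)

# Using both halves of each pair: the key names the run, and the group
# iterator is consumed to count how many elements the run contains.
run_lengths = [(key, len(list(group))) for key, group in groupby(text)]
print(run_lengths)
```

Each group is consumed inside the loop body, before groupby moves on to the next key, which is the pattern the itertools documentation recommends.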

Upvotes: 1
