Splitting a string into consecutive counts?

Question

For example, if the given string is this:

"aaabbbbccdaeeee"

I want to say something like:

3 a, 4 b, 2 c, 1 d, 1 a, 4 e

This is easy enough to do in Python with a brute-force loop, but I am wondering if there is a more Pythonic, cleaner one-liner approach.

My brute force:

while source!="":
    leading = source[0]
    c=0
    while source!="" and source[0]==leading:
        c+=1
        source=source[1:]
    print(c, leading)

dawg · Accepted Answer

Use a Counter for a count of each distinct letter in the string regardless of position:

>>> s="aaabbbbccdaeeee"
>>> from collections import Counter
>>> Counter(s)
Counter({'a': 4, 'b': 4, 'e': 4, 'c': 2, 'd': 1})

You can use groupby if the position in the string has meaning:

from itertools import groupby
li=[]
for k, l in groupby(s):
    li.append((k, len(list(l))))

print(li)

Prints:

[('a', 3), ('b', 4), ('c', 2), ('d', 1), ('a', 1), ('e', 4)]

Which can be reduced to a list comprehension:

[(k,len(list(l))) for k, l in groupby(s)]
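
To get the exact "3 a, 4 b, ..." output from the question, the pairs can be joined into a string. This is only a minimal sketch: run_lengths is a hypothetical helper name (not part of itertools), and it counts each run with sum() instead of building a temporary list.

```python
from itertools import groupby

def run_lengths(s):
    """Return (char, count) pairs for each run of consecutive equal characters."""
    # groupby yields (key, group iterator) per run; sum(1 for ...) counts the
    # run's items without materializing them in a list
    return [(k, sum(1 for _ in g)) for k, g in groupby(s)]

s = "aaabbbbccdaeeee"
formatted = ", ".join(f"{n} {k}" for k, n in run_lengths(s))
print(formatted)  # 3 a, 4 b, 2 c, 1 d, 1 a, 4 e
```

For short strings the difference between len(list(g)) and sum(1 for _ in g) is negligible, so either form should be fine.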

You can even use a regex:

>>> import re
>>> [(m.group(0)[0], len(m.group(0))) for m in re.finditer(r'((\w)\2*)', s)]
[('a', 3), ('b', 4), ('c', 2), ('d', 1), ('a', 1), ('e', 4)]
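
One caveat with that pattern: \w only matches letters, digits and underscore, so runs of spaces or punctuation are silently skipped. If those should be counted too, something along these lines might work. It is a sketch, and run_lengths_re is a hypothetical name chosen for illustration:

```python
import re

def run_lengths_re(s):
    """Return (char, count) pairs for runs of any character, not just \\w ones."""
    # (.)\1* matches one character followed by repeats of that same character;
    # re.DOTALL lets "." match newlines as well
    return [(m.group(1), len(m.group(0)))
            for m in re.finditer(r'(.)\1*', s, re.DOTALL)]

print(run_lengths_re("aa  !!b"))  # [('a', 2), (' ', 2), ('!', 2), ('b', 1)]
```

With the original \w-based pattern the same input would give only [('a', 2), ('b', 1)].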
