Colin

Reputation: 10820

Split string into strings of repeating elements

I want to split a string like:

'aaabbccccabbb'

into

['aaa', 'bb', 'cccc', 'a', 'bbb']

What's an elegant way to do this in Python? If it makes it easier, it can be assumed that the string will only contain a's, b's and c's.

Upvotes: 8

Views: 359

Answers (4)

jamylak

Reputation: 133554

>>> import re
>>> s = 'aaabbccccabbb'
>>> [m.group() for m in re.finditer(r'(\w)(\1*)', s)]
['aaa', 'bb', 'cccc', 'a', 'bbb']
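Note that `\w` matches only letters, digits and underscores, so runs of spaces or punctuation would be skipped by the pattern above. A minimal sketch, assuming such characters should be grouped as well; the helper name `regex_runs` is hypothetical and not part of the original answer:

```python
import re

def regex_runs(s):
    # (.) captures any single character (re.DOTALL lets it match newlines too);
    # \1* then consumes every immediate repeat of that same character.
    return [m.group() for m in re.finditer(r'(.)\1*', s, flags=re.DOTALL)]

print(regex_runs('aaabbccccabbb'))  # ['aaa', 'bb', 'cccc', 'a', 'bbb']
print(regex_runs('aa  !!b'))        # ['aa', '  ', '!!', 'b']
```

For the question's input, which contains only a's, b's and c's, both patterns give the same result.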

Upvotes: 1

Jacob Eggers

Reputation: 9322

Here's the best way I could find using regex:

import re
print([a for a, b in re.findall(r"((\w)\2*)", s)])  # s = 'aaabbccccabbb'

Upvotes: 2

jsbueno

Reputation: 110301

You can write a generator, without trying to be clever just to keep it short and unreadable:

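def yield_same(string):
    """Yield each maximal run of identical consecutive characters."""
    it_str = iter(string)
    result = next(it_str, "")  # next() works on Python 2.6+ and 3
    if not result:
        return  # empty input: nothing to yield
    for next_chr in it_str:
        if next_chr != result[0]:
            # the run ended: emit it and start a new one
            yield result
            result = ""
        result += next_chr
    yield result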


>>> list(yield_same("aaaaaabcbcdcdccccccdddddd"))
['aaaaaa', 'b', 'c', 'b', 'c', 'd', 'c', 'd', 'cccccc', 'dddddd']

Edit: OK, so there is itertools.groupby, which probably does something like this.

Upvotes: 3

Niklas B.

Reputation: 95308

That is the use case for itertools.groupby :)

>>> from itertools import groupby
>>> s = 'aaabbccccabbb'
>>> [''.join(y) for _,y in groupby(s)]
['aaa', 'bb', 'cccc', 'a', 'bbb']
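For reuse, the same idea could be wrapped in a small function. This is only a sketch; the name `group_runs` is hypothetical. It relies on `groupby` starting a new group whenever the key changes, so only consecutive equal characters are grouped, which is why the second run of `'a'` stays separate from the first:

```python
from itertools import groupby

def group_runs(s):
    # groupby yields (key, group_iterator) for each run of consecutive
    # equal items; joining the group rebuilds that run as a string.
    return [''.join(group) for _, group in groupby(s)]

print(group_runs('aaabbccccabbb'))  # ['aaa', 'bb', 'cccc', 'a', 'bbb']
print(group_runs(''))               # []
```

Unlike the `\w`-based regex answers, this works for any characters, and an empty string simply yields an empty list.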

Upvotes: 26
