Mateen Bagheri

Reputation: 99

How to split a string into runs of identical characters in Python

I want to split a string into a list wherever the character changes. For example, given the string ccaaawq, I want my program to return ['cc', 'aaa', 'w', 'q']. Since there is no single delimiter between the pieces, I'm wondering what the best approach is. Thanks in advance for your answers.

Upvotes: 1

Views: 45

Answers (2)

Tim Biegeleisen

Reputation: 521239

Here is an approach using re.findall:

import re

inp = "ccaaawq"
# Each match is a tuple (whole run, single character); keep the whole run.
output = [x[0] for x in re.findall(r'((.)\2*)', inp)]
print(output)  # ['cc', 'aaa', 'w', 'q']

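The pattern matches any one character, (.), followed by that same character zero or more times, \2*. Because the pattern contains two capture groups, re.findall returns a list of tuples, one per match. The first element of each tuple is the outer group, which holds the whole run, and that is what x[0] extracts. Note that . does not match a newline unless re.DOTALL is passed.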
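To make the tuple structure concrete, here is a small sketch that prints what re.findall returns for this pattern. It also shows a variant with re.finditer, which is my own substitution rather than part of the answer above: it needs only one capture group because group(0) is already the whole match.

```python
import re

inp = "ccaaawq"

# With two capture groups, re.findall returns one tuple per match:
# (whole run, the single repeated character).
pairs = re.findall(r'((.)\2*)', inp)
print(pairs)  # [('cc', 'c'), ('aaa', 'a'), ('w', 'w'), ('q', 'q')]

# Variant: re.finditer yields match objects, so group(0) is the whole run
# and the outer capture group is no longer needed.
runs = [m.group(0) for m in re.finditer(r'(.)\1*', inp)]
print(runs)  # ['cc', 'aaa', 'w', 'q']
```

Either form gives the same list; the finditer version just avoids indexing into tuples.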

Upvotes: 1

Andrej Kesely

Reputation: 195438

You can use itertools.groupby:

from itertools import groupby

s = "ccaaawq"

out = ["".join(g) for _, g in groupby(s)]
print(out)

Prints:

['cc', 'aaa', 'w', 'q']
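As a rough illustration of what groupby is doing here: it yields a (key, group) pair for each run of adjacent equal items, where the group is an iterator over that run. The sketch below uses the key as well, and checks that only adjacent characters are merged. The second input string "aabaa" is my own example, not from the question.

```python
from itertools import groupby

s = "ccaaawq"

# Each item from groupby is (key, iterator over one run of equal items).
counts = [(ch, len(list(g))) for ch, g in groupby(s)]
print(counts)  # [('c', 2), ('a', 3), ('w', 1), ('q', 1)]

# Only adjacent equal characters are grouped, so a character that
# reappears later starts a new run.
split_again = ["".join(g) for _, g in groupby("aabaa")]
print(split_again)  # ['aa', 'b', 'aa']
```

Unlike the regex approach, this works on any iterable, not just strings.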

Upvotes: 1
