coderkid

Reputation: 658

Counting runs in a string

I have a string that looks like:

string = 'TTHHTHHTHHHHTTHHHTTT'

How can I count the number of runs of each character in the string, so that I get:

5 runs of T and 4 runs of H

Upvotes: 11

Views: 1471

Answers (2)

Ashwini Chaudhary

Reputation: 251051

You can use a combination of itertools.groupby and collections.Counter:

>>> from itertools import groupby
>>> from collections import Counter
>>> strs = 'TTHHTHHTHHHHTTHHHTTT'
>>> Counter(k for k, g in groupby(strs))
Counter({'T': 5, 'H': 4})

itertools.groupby groups consecutive items that share the same key (by default, the key is the item itself), so each run in the string becomes one group:

>>> from pprint import pprint
>>> pprint([(k, list(g)) for k, g in groupby(strs)])
[('T', ['T', 'T']),
 ('H', ['H', 'H']),
 ('T', ['T']),
 ('H', ['H', 'H']),
 ('T', ['T']),
 ('H', ['H', 'H', 'H', 'H']),
 ('T', ['T', 'T']),
 ('H', ['H', 'H', 'H']),
 ('T', ['T', 'T', 'T'])]

Here the first item of each pair is the key (k) on which the items were grouped, and list(g) is the group for that key. Since we're only interested in the keys, we can pass each k to collections.Counter to get the desired answer.
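For comparison, here is a minimal sketch of the same counting done with a plain loop instead of groupby. It relies only on the observation that a run starts wherever a character differs from the one before it. The function name count_runs is made up for this illustration.

```python
from collections import Counter

def count_runs(s):
    """Count runs per character (count_runs is a hypothetical helper name)."""
    runs = Counter()
    prev = None  # no previous character yet, so the first one always starts a run
    for ch in s:
        if ch != prev:
            # The character changed, so a new run of ch begins here
            runs[ch] += 1
        prev = ch
    return runs

print(count_runs('TTHHTHHTHHHHTTHHHTTT'))
# Counter({'T': 5, 'H': 4})
```

The groupby version is shorter; the loop just makes the "new run on every change" rule explicit.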

Upvotes: 21

iruvar

Reputation: 23374

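For variety, an re-based approach:

import re
letters = ['H', 'T']
# Each match is one run; with a single capture group, findall returns that run's letter.
# Characters not in letters (the 'Z' here) never match.
matches = re.findall(r'({})\1*'.format('|'.join(letters)), 'TTHHTHHZTHHHHTTHHHTTT')
print(matches)
# ['T', 'H', 'T', 'H', 'T', 'H', 'T', 'H', 'T']
print([(letter, matches.count(letter)) for letter in letters])
# [('H', 4), ('T', 5)]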
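If the symbols might include regex metacharacters, the pattern above could misbehave. The following is a rough sketch of one way to guard against that, by escaping each symbol with re.escape and tallying with collections.Counter. The function name count_runs_re and its parameters are assumptions for illustration, not part of the original answer.

```python
import re
from collections import Counter

def count_runs_re(s, letters):
    """Count runs of each symbol in letters (hypothetical helper name)."""
    # Escape each symbol so characters such as '.' or '|' are matched literally
    alternation = '|'.join(re.escape(c) for c in letters)
    pattern = re.compile(r'({})\1*'.format(alternation))
    # With one capture group, findall returns that group: one letter per run
    return Counter(pattern.findall(s))

print(count_runs_re('TTHHTHHZTHHHHTTHHHTTT', 'HT'))
# Counter({'T': 5, 'H': 4})
```

This sketch assumes letters is non-empty; an empty alternation would match the empty string at every position.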

Upvotes: 2
