How to get a list of character positions in Python?

Question

I'm writing a function to sanitize Unicode input in a web application, and I'm trying to reproduce the PHP function at the end of this page: http://www.iamcal.com/understanding-bidirectional-text/

I'm looking for an equivalent of PHP's preg_match_all in Python. re.findall returns the matches without their positions, and re.search only returns the first match. Is there a function that returns every match along with its position in the text?

With a string abcdefa and the pattern a|c, I want to get something like [('a',0),('c',2),('a',6)]

Thanks :)

samplebias · Accepted Answer

Try:

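import re
text = 'abcdefa'
pattern = re.compile('a|c')
# finditer yields one match object per match; start() is its index in text
[(m.group(), m.start()) for m in pattern.finditer(text)]
# [('a', 0), ('c', 2), ('a', 6)]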
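
If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the names find_positions and find_spans are made up for illustration and are not part of the re module. The second helper uses m.span() in case you also want the end index of each match.

```python
import re

def find_positions(pattern, text):
    """Return (matched_text, start_index) for every non-overlapping match."""
    return [(m.group(), m.start()) for m in re.finditer(pattern, text)]

def find_spans(pattern, text):
    """Return (matched_text, (start, end)) for every non-overlapping match."""
    return [(m.group(), m.span()) for m in re.finditer(pattern, text)]

print(find_positions('a|c', 'abcdefa'))  # [('a', 0), ('c', 2), ('a', 6)]
print(find_spans('a|c', 'abcdefa'))      # [('a', (0, 1)), ('c', (2, 3)), ('a', (6, 7))]
```

One thing to watch when porting from PHP: preg_match_all with PREG_OFFSET_CAPTURE reports byte offsets, while the indices above count characters of a Python str, so the numbers can differ for non-ASCII text.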
