Match all occurrences of string using re.findall

Question

I have a string

import re

a = "123 some_string ABC 456 some_string DEF 789 some_string GHI"

print(re.findall(r"(\d\d\d).*([A-Z]+)", a))

Output: [('123', 'I')]

Expected output: [('123', 'ABC'), ('456', 'DEF'), ('789', 'GHI')]

Because .* is greedy, the pattern matches 123 and then only the final character I. What is the proper regex to get the expected output?

Jan · Accepted Answer

While anubhava's expression works, consider using the principle of contrast (108 steps compared to 30 steps - a reduction by more than 70%!):

(\d{3})[^A-Z]*([A-Z]+)

See the hijacked demo on regex101.com.
The lazy dot-star is expensive in terms of performance: after every character it consumes, the engine has to check whether the rest of the pattern matches yet.
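To see the difference in results, here is a minimal sketch that runs the question's greedy pattern, a lazy variant, and the negated-class pattern on the sample string. The lazy pattern is only an assumption about what the other answer looks like (it is not quoted here), and the variable names are just for illustration.

```python
import re

a = "123 some_string ABC 456 some_string DEF 789 some_string GHI"

# Greedy dot-star from the question: .* runs to the end of the string,
# then backtracks just far enough for [A-Z]+ to match a single letter.
greedy = re.findall(r"(\d\d\d).*([A-Z]+)", a)

# Lazy dot-star (assumed form of the alternative): .*? grows one
# character at a time until [A-Z]+ can match.
lazy = re.findall(r"(\d{3}).*?([A-Z]+)", a)

# Principle of contrast: [^A-Z]* skips everything that is not an
# uppercase letter in one go, so it stops right before the letters.
contrast = re.findall(r"(\d{3})[^A-Z]*([A-Z]+)", a)

print(greedy)    # [('123', 'I')]
print(lazy)      # [('123', 'ABC'), ('456', 'DEF'), ('789', 'GHI')]
print(contrast)  # [('123', 'ABC'), ('456', 'DEF'), ('789', 'GHI')]
```

Note that the negated class only gives the same result as the lazy version when no uppercase letter appears between the digits and the code you want to capture, as is the case in this sample.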
