yanivps

Reputation: 2163

Regex match sequence more than once

How is it that I can't find an answer to something this simple after an hour of searching the internet?

I have this sentence:

HeLLo woRLd HOw are YoU

I want to capture every group of two consecutive capital letters:

[A-Z]{2}

The regex above works, but it captures only LL (the first two capital letters). I want LL in one group, and RL and HO in further groups.

Upvotes: 2

Views: 5794

Answers (2)

npinti

Reputation: 52185

Most regular expression engines expose some way to make your expression global, meaning that the expression is applied repeatedly across the input instead of stopping at the first match. The global flag is usually denoted by the /g marker at the end of the expression. Without /g, your regular expression returns only the first match (LL); with /g, it returns every match (LL, RL and HO).
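As a rough illustration of that difference in Python: the re module has no /g flag, so this sketch uses re.search as a stand-in for the non-global behaviour and re.findall for the global one. The variable names are only for the example.

```python
import re

text = "HeLLo woRLd HOw are YoU"
pattern = r"[A-Z]{2}"

# Non-global behaviour: re.search stops at the first match.
first = re.search(pattern, text).group()

# Global behaviour: re.findall keeps scanning and returns every match.
all_matches = re.findall(pattern, text)

print(first)        # LL
print(all_matches)  # ['LL', 'RL', 'HO']
```

YoU contributes nothing because its two capitals are not adjacent.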

Different languages expose this functionality differently. In C#, for instance, it is done through Regex.Matches. In Java, you loop with while (matcher.find()), which keeps returning the next substring that matches the pattern.

EDIT: I am not a Python person, but judging from the example available here, you could do something like this:

import re

# finditer yields one match object per non-overlapping match
it = re.finditer(r"[A-Z]{2}", "HeLLo woRLd HOw are YoU")
for match in it:
    print("'{g}' was found between the indices {s}".format(g=match.group(), s=match.span()))

Upvotes: 5

Amit Joki

Reputation: 59282

You cannot have multiple groups in this case, but you can have multiple matches. Add the global flag to your regex and use a method that returns all of the matches.

For JavaScript, it would be /[A-Z]{2}/g. Called with a global regex, String.prototype.match returns an array of the matched substrings, and you can use an index to access them.
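To see why groups are the wrong tool here, a small Python sketch (Python is used only because it is the other language in this thread, and the repeated-group pattern is made up for the demonstration): a capture group that is repeated keeps only its last repetition, whereas separate matches are all kept and can be indexed.

```python
import re

text = "HeLLo woRLd HOw are YoU"

# One capture group repeated three times: only the last repetition is kept.
repeated = re.search(r"(?:.*?([A-Z]{2}))+", text)
print(repeated.groups())  # ('HO',)

# Separate matches instead: each one is reachable by index.
matches = re.findall(r"[A-Z]{2}", text)
print(matches[0], matches[1], matches[2])  # LL RL HO
```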

Upvotes: 0
