Reputation: 11

Regex repeated groups

I have this text:

<span id="3">

HELLO THERE
<span id="5">
Other stuff
<span id="6">
Other Stuff
<span id="7">
Other sutff

I need to grab just the <span...> elements after the HELLO THERE text. So in the above example, all the spans except for the one with id=3.

So I tried (<span.+?>)+ which grabs all the spans. Next, I tried HELLO THERE.+?(<span.+?>)+, but that only grabs the first relevant one. So my question is, what is the right regex to use here?

Upvotes: 1

Answers (2)

Emma

Reputation: 27743

RegEx 1

Several expressions can get the desired <span> opening tags. For example, we can simply use:

\s(<.+)

The \s requires a whitespace character on the left (here, the newline before each tag), and the capturing group collects everything from < to the end of that line.
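As a rough check, here is how that first expression behaves in Python's re module. The choice of Python and the variable names are assumptions; the question does not name an engine. Note that the expression keys on the whitespace before each tag rather than on HELLO THERE, so it skips the id=3 span only because nothing precedes that span in the sample.

```python
import re

# Sample text from the question; the first span sits at the very start.
text = (
    '<span id="3">\n'
    '\n'
    'HELLO THERE\n'
    '<span id="5">\n'
    'Other stuff\n'
    '<span id="6">\n'
    'Other Stuff\n'
    '<span id="7">\n'
    'Other sutff\n'
)

# \s consumes the newline before each tag; (<.+) captures the rest of that line.
tags = re.findall(r'\s(<.+)', text)
print(tags)

# With a leading newline in front of the input, the id=3 span is captured too.
tags_shifted = re.findall(r'\s(<.+)', '\n' + text)
print(tags_shifted)
```

So this works for the exact sample, but it would need an anchor on HELLO THERE to be safe on other inputs.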

RegEx 2

A more expensive alternative, with higher complexity, would be:

([\s\S].*?)(<.+>)

RegEx 3

Then, we can reduce the complexity and improve the performance with this expression:

([\s\S].*?)(<.+>)*

Upvotes: 2

Joanna Derks

Reputation: 4063

This regex matches all the span tags after HELLO THERE, with the capturing group repeated once per tag:

HELLO THERE(?:(?:.*?)(<span[^>]+>))+

HELLO THERE - match the starting marker
Inside the outer non-capturing group:
(?:.*?) - lazily match any text (possibly none) until you find
(<span[^>]+>) - the span tag - this one will be captured
+ - repeat the previous 2 steps until no more span tags can be found

You also need to set your matching options so that the dot matches newlines (often called single-line or DOTALL mode).
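One caveat worth checking in your engine: in many regex flavors a repeated capturing group keeps only its last iteration (.NET is an exception and exposes every iteration through Group.Captures). Below is a minimal Python sketch, assuming the sample text from the question. The variable names and the two-step workaround are illustrative, not part of the pattern above.

```python
import re

# Sample text from the question.
text = (
    '<span id="3">\n'
    '\n'
    'HELLO THERE\n'
    '<span id="5">\n'
    'Other stuff\n'
    '<span id="6">\n'
    'Other Stuff\n'
    '<span id="7">\n'
    'Other sutff\n'
)

# re.DOTALL lets the dot cross line breaks, as the pattern requires.
pattern = re.compile(r'HELLO THERE(?:(?:.*?)(<span[^>]+>))+', re.DOTALL)
m = pattern.search(text)

# The overall match runs through all three tags,
# but in Python group(1) holds only the last iteration.
last = m.group(1)
print(last)

# Workaround sketch: locate the marker, then collect every tag after it.
marker = text.find('HELLO THERE')
spans = re.findall(r'<span[^>]+>', text[marker:]) if marker != -1 else []
print(spans)
```

Splitting the job into "find the marker" and "find all tags after it" sidesteps the repeated-group limitation and keeps each pattern simple.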

Upvotes: 0
