Regex matched text between tags is too greedy

Question

I am trying to extract text from a string, and have trouble with laziness/greediness.

In the example, the piece of text I want to match is <b>I only want this piece</b>, so my regex is a non-greedy "anything" between <b> and </b>, as long as it contains 'piece'.

The problem with my regex is that the matched text also includes <b>first</b>.

var text = "<b>first</b> <b>I only want this piece</b>";
var regX = /<b>.*?piece.*?<\/b>/;
var matches = text.match(regX);

Matched text

"<b>first</b> <b>I only want this piece</b>"

Desired match

"<b>I only want this piece</b>"

Avinash Raj · Accepted Answer

Use a negated char class instead of the first .*?.

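var regX = /<b>[^<>]*?piece.*?<\/b>/;

Why?

Because the first <b>.*?piece matches the first <b> and then keeps going until it finds the text piece, without caring what lies in between, other tags included. Laziness only makes .*? take as few characters as possible from where the match started; it does not move the starting point. [^<>]*? instead lazily matches zero or more characters that are neither < nor >, so the match cannot cross a tag boundary and has to begin at the <b> whose content contains piece.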
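To see the difference side by side, here is a minimal sketch that runs both patterns on the same input. It assumes the sample string wraps first and I only want this piece in separate <b> elements; the variable names lazyDot, negated, lazyMatch and negatedMatch are made up for illustration.

```javascript
// Assumed input: two separate <b> elements, only the second contains "piece".
const text = "<b>first</b> <b>I only want this piece</b>";

// Lazy dot: the match still starts at the first <b>, and .*? is free to
// walk across </b> and <b> until it reaches "piece".
const lazyDot = /<b>.*?piece.*?<\/b>/;

// Negated class: [^<>]*? cannot consume < or >, so an attempt starting at
// the first <b> fails and the engine retries at the second <b>.
const negated = /<b>[^<>]*?piece.*?<\/b>/;

const lazyMatch = text.match(lazyDot)[0];
const negatedMatch = text.match(negated)[0];

console.log(lazyMatch);    // <b>first</b> <b>I only want this piece</b>
console.log(negatedMatch); // <b>I only want this piece</b>
```

Note that match returns null when nothing matches, so real code should check the result before indexing into it.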
