Emilio

Reputation: 1354

RegEx: How to make a greedy quantifier really, really greedy (and never give back)?

For example I have this RegEx:

([0-9]{1,4})([0-9])


Which gives me these matching groups when testing with the string "3041": group1 is "304" and group2 is "1".

As you can see, group2 is filled before group1 even though the quantifier is greedy.

How can I instead make sure to fill group1 before group2?

EDIT1: I want to have the same regEx, but have "3041" in group1 and group2 empty.

EDIT2: I want to have "3041" in group1 and group2 empty. And, yes, I want the regEx to not match!

Upvotes: 1

Views: 160

Answers (2)

Tom Lord

Reputation: 28305

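For an input "1234", the pattern ([0-9]{1,4})([0-9]) is already being as greedy as possible.

The first capture group cannot contain all four characters; otherwise the last part of the pattern would have nothing left to match. The engine therefore backtracks and the first group gives one digit back.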
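To make the backtracking concrete, here is a minimal JavaScript sketch using the question's pattern and test string. The variable name is only illustrative.

```javascript
// Greedy {1,4} first grabs all four digits of "3041", then gives one back
// so that the trailing ([0-9]) still has a digit to match.
const backtrackDemo = "3041".match(/([0-9]{1,4})([0-9])/);

console.log(backtrackDemo[1]); // prints 304
console.log(backtrackDemo[2]); // prints 1
```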

Perhaps what you're looking for is:

([0-9]{1,4})([0-9]?)

By making the second group optionally empty, the first group can contain all four characters.
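As a quick check of the optional second group, the sketch below runs the pattern against a four-digit and a five-digit string. The variable names are made up for illustration.

```javascript
// The second group is allowed to be empty, so the first group
// can keep all four digits of "3041".
const optionalSecond = /([0-9]{1,4})([0-9]?)/;

const fourDigits = "3041".match(optionalSecond);
console.log(fourDigits[1]); // prints 3041
console.log(fourDigits[2] === ""); // prints true (group 2 is empty)

// With a fifth digit available, group 2 picks it up.
const fiveDigits = "30412".match(optionalSecond);
console.log(fiveDigits[1]); // prints 3041
console.log(fiveDigits[2]); // prints 2
```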

Edit:

"I want the regEx to not match! I want only 5 digits strings to match the whole RegEx."

In this case, the first group should not really be "1 to 4 digits", since you only want it to match exactly 4:

([0-9]{4})([0-9])
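A small JavaScript sketch of how the fixed-length version behaves on the question's input and on a five-digit string (variable names are illustrative):

```javascript
// With exactly four digits required in group 1, there is nothing to give back.
const exactlyFour = /([0-9]{4})([0-9])/;

const tooShort = exactlyFour.exec("3041");
console.log(tooShort); // prints null: no fifth digit for group 2

const fiveLong = exactlyFour.exec("30412");
console.log(fiveLong[1]); // prints 3041
console.log(fiveLong[2]); // prints 2
```

Note that this pattern is not anchored, so a longer run of digits would still match on its first five digits.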

In some regex flavours (not all languages support this), it is also possible to make quantifiers possessive, although this is unnecessary in your case, as shown above. For example:

([0-9]{1,4}+)([0-9])

This forces the first group to match as many digits as it can (up to 4) and never give any back, so for "3041" a 3-digit match is not attempted and the overall pattern fails to match.

Edit2:

"Is 'possessiveness' available in JavaScript? If not, any workarounds?"

Unfortunately, possessive quantifiers are not available in JavaScript.
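One way to see this for yourself is to try compiling the possessive pattern at runtime. This is only a sketch: the helper name is made up, and it assumes the engine rejects the unsupported syntax with a SyntaxError, which is how current JavaScript engines behave.

```javascript
// Hypothetical helper: returns true if the engine accepts the pattern source.
function compiles(source) {
  try {
    new RegExp(source);
    return true;
  } catch (err) {
    // An unsupported quantifier such as {1,4}+ is rejected at compile time.
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

const possessiveOk = compiles("([0-9]{1,4}+)([0-9])");
const greedyOk = compiles("([0-9]{1,4})([0-9])");
console.log(possessiveOk, greedyOk);
```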

However, you can emulate the behaviour (in a slightly ugly way) with a lookahead:

(?=([0-9]{1,4}))\1([0-9])

In general, a possessive quantifier a++ can be emulated as: (?=(a+))\1.
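The lookahead trick can be compared side by side with the plain greedy pattern. The sketch below uses the two patterns from this thread; the variable names are illustrative only.

```javascript
const plainGreedy = /([0-9]{1,4})([0-9])/;
// The lookahead captures as many digits as possible (up to 4) without
// consuming them; \1 then consumes exactly that text. The engine cannot
// backtrack into a completed lookahead, so group 1 never gives a digit back.
const emulatedPossessive = /(?=([0-9]{1,4}))\1([0-9])/;

console.log(plainGreedy.test("3041"));        // prints true  (304 + 1)
console.log(emulatedPossessive.test("3041")); // prints false (no digit left for group 2)

const fiveDigitMatch = emulatedPossessive.exec("30412");
console.log(fiveDigitMatch[1]); // prints 3041
console.log(fiveDigitMatch[2]); // prints 2
```

Keep in mind that the backreference number depends on the position of the capturing group inside the lookahead, so it needs adjusting if other groups come first.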

Upvotes: 2

Jan

Reputation: 43169

As it stands you only need anchors:

^([0-9]{4})([0-9])$

This will only match five-digit strings and will fail on any other string.
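A short JavaScript sketch of the anchored pattern against a few sample inputs. The inputs other than "3041" are made up for illustration.

```javascript
// ^ and $ tie the match to the whole string, so only exactly five digits pass.
const anchored = /^([0-9]{4})([0-9])$/;

const samples = ["30412", "3041", "304123", "x30412"];
const verdicts = samples.map((s) => anchored.test(s));

console.log(verdicts); // prints [ true, false, false, false ]
```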

Upvotes: 1
