user13558017

Capture a repeated pattern only if the string/line starts with or contains a specific char before the group

Hello, I am trying to capture a repeated pattern x.x.x.x, where each x is a number (one digit or more), but only if the line or string starts with or contains an "@" before the group:

This is a test @ 1.2.3.4 which 11.2.38.49 should pass 3.2.4.5 and capture the 3 groups

This @ is a test 1.2.3.4 which 1.2.3.4 should pass 3.2.4.5 too

@This is a test 1.2.3.4 which 1.2.3.4 should pass 3.2.4.5 too

This is a test 1.2.3.4 which @ 1.2.3.4 pass 3.2.4.5 but don't capture the first group

This test 1.2.3.4 1.2.3.4 should @ pass but capture nothing

This test 1.2.3.4 1.2.3.4 should test fail

So far I have (\d*\.\d+)+. Is it even possible to write a regex for this?

https://regex101.com/r/G81BBj/1

Upvotes: 0

Views: 68

Answers (2)

The fourth bird

Reputation: 163457

Regex engines differ in the features they support, so which option works depends on the engine.

If you want to match all occurrences of a dot-separated chunk, you could make use of a quantifier in a lookbehind assertion, provided the engine allows that.

Match 1+ digits, then repeat a dot followed by 1+ digits one or more times, so that digits alone are not matched.

(?<=@.*)\d+(?:\.\d+)+
  • (?<=@.*) Positive lookbehind, assert that there is an @ on the left
  • \d+ Match 1+ digits
  • (?: Non capture group
    • \.\d+ Match a dot and 1+ digits
  • )+ Close group and repeat 1+ times to make sure there is at least 1 dot present

.NET regex demo
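
For engines without a quantifier in lookbehind, the same effect can be approximated in two steps. Below is a minimal sketch in Python, whose built-in re module rejects (?<=@.*) because it requires a fixed-width lookbehind. It is not the lookbehind pattern itself but a swapped-in technique: split each line at the first @ and search only the part to the right. The helper name chunks_after_at is made up for illustration.

```python
import re

# Dot-separated chunk: 1+ digits, then ".digits" repeated at least once.
CHUNK = re.compile(r"\d+(?:\.\d+)+")


def chunks_after_at(line):
    """Return every dot-separated chunk that has an "@" somewhere to its left.

    Hypothetical helper: it emulates (?<=@.*) by cutting the line at the
    first "@" instead of using a lookbehind.
    """
    _, at, rest = line.partition("@")  # rest is everything after the first "@"
    if not at:  # no "@" in the line, so nothing qualifies
        return []
    return CHUNK.findall(rest)


# Sample lines taken from the question.
print(chunks_after_at("This is a test @ 1.2.3.4 which 11.2.38.49 should pass 3.2.4.5 and capture the 3 groups"))
# ['1.2.3.4', '11.2.38.49', '3.2.4.5']
print(chunks_after_at("This is a test 1.2.3.4 which @ 1.2.3.4 pass 3.2.4.5 but don't capture the first group"))
# ['1.2.3.4', '3.2.4.5']
print(chunks_after_at("This test 1.2.3.4 1.2.3.4 should @ pass but capture nothing"))
# []
print(chunks_after_at("This test 1.2.3.4 1.2.3.4 should test fail"))
# []
```

The sketch handles one line at a time, which mirrors the fact that the dot in the lookbehind does not cross line breaks.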


Another option, if the engine supports it, is the \G anchor, which asserts the position either at the start of the string or at the end of the previous match.

(?:^[^\r\n@]*@|\G(?!^)).*?(\d+(?:\.\d+)+)
  • (?: Non capture group
    • ^[^\r\n@]*@ Match from the start of the string until the first @
    • | Or
    • \G(?!^) Assert the position at the end of the previous match, not at the start
  • ) Close group
  • .*? Match as few chars as possible
  • ( Capture group 1
    • \d+(?:\.\d+)+ Match a dot-separated chunk
  • ) Close group 1

Regex demo
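
To make the chaining idea concrete, here is a rough sketch in Python. Python's built-in re module has no \G, so this is an emulation rather than the pattern above: Pattern.match(line, pos) only succeeds when the match begins exactly at pos, which plays the role of "continue at the end of the previous match". The names PREFIX, STEP and chunks_chained are made up for illustration.

```python
import re

# First branch of the alternation: from the start of the line up to the first "@".
PREFIX = re.compile(r"[^\r\n@]*@")
# Shared tail: skip as few chars as possible, then capture one dot-separated chunk.
STEP = re.compile(r".*?(\d+(?:\.\d+)+)")


def chunks_chained(line):
    """Hypothetical helper that walks the line the way the \\G pattern would."""
    found = []
    start = PREFIX.match(line)  # match() anchors at position 0, like ^
    if start is None:  # no "@" reachable from the start, so no matches at all
        return found
    pos = start.end()
    while True:
        # Must begin exactly at pos, i.e. where the previous match ended.
        m = STEP.match(line, pos)
        if m is None:
            break
        found.append(m.group(1))
        pos = m.end()
    return found


print(chunks_chained("This @ is a test 1.2.3.4 which 1.2.3.4 should pass 3.2.4.5 too"))
# ['1.2.3.4', '1.2.3.4', '3.2.4.5']
print(chunks_chained("This test 1.2.3.4 1.2.3.4 should @ pass but capture nothing"))
# []
```

Each iteration consumes at least one chunk, so the loop always advances and stops once no further chunk follows on the line.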

If the engine does not support \G and instead treats \G as a literal G char, it first matches from the start up to the first occurrence of an @, and then on to the first dot-separated chunk.

After matching the first dot-separated chunk, it tries at every following position to match the first part of the alternation, which cannot match because of the ^ anchor. It then tries the second part, but that does not match either, because there is no G char in the example data. The end result is only a single match.
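
That single-match behavior can be reproduced with a small sketch. Python's built-in re module rejects \G as a bad escape rather than reading it as a literal, so the assumption here is to write a plain G by hand to imitate such an engine.

```python
import re

# Assumption: "\G" is replaced by a plain "G" to imitate an engine that
# treats the unsupported escape as a literal letter.
literal_g = re.compile(r"(?:^[^\r\n@]*@|G(?!^)).*?(\d+(?:\.\d+)+)")

# First sample line from the question; it contains no uppercase "G".
line = "This is a test @ 1.2.3.4 which 11.2.38.49 should pass 3.2.4.5 and capture the 3 groups"

# findall returns the contents of group 1 for every match.
matches = literal_g.findall(line)
print(matches)
# ['1.2.3.4']
```

Only the first chunk is reported: after it, ^ can no longer match and there is no G to start the second branch.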


If \K is also supported (for example in PCRE), which resets the starting point of the reported match, you could omit the capturing group and get only the match:

(?:^[^\r\n@]*@|\G(?!^)).*?\K\d+(?:\.\d+)+

Regex demo

Upvotes: 1

Wiktor Stribiżew

Reputation: 627082

With any regex engine that supports unknown width patterns in lookbehind (.NET, ECMAScript 2018+, PyPI regex module in Python, JGSoft Software), you may use

(?<=@.*?)\d+(?:\.\d+)*

See the regex demo.

Details

  • (?<=@.*?) - a location that is immediately preceded by an @ char, which may be followed by any 0 or more chars other than line break chars, as few as possible
  • \d+(?:\.\d+)* - one or more digits followed by 0 or more occurrences of a dot and then 1+ digits.
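
One practical difference from a + on the group is that the * quantifier also accepts a plain number with no dot. Here is a quick sketch of that difference in Python. The sample text is made up, and because the built-in re module has no variable-width lookbehind, the lookbehind is approximated by searching only the part after the first @.

```python
import re

# Hypothetical sample text, not taken from the question.
text = "@ build 12 uses 1.2.3.4"

# Approximate the lookbehind: keep only what follows the first "@".
after_at = text.partition("@")[2]

with_star = re.findall(r"\d+(?:\.\d+)*", after_at)  # dot part is optional
with_plus = re.findall(r"\d+(?:\.\d+)+", after_at)  # at least one dot required

print(with_star)
# ['12', '1.2.3.4']
print(with_plus)
# ['1.2.3.4']
```

Which quantifier fits depends on whether bare integers after the @ should count as matches.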

Upvotes: 0
