aybe

Reputation: 16662

Repeated group captures more matches than specified

I am trying to match the following sequence, line by line:

The problem is that the 4th sample is also captured even though it has 5 numbers in it.

Pattern:

^\s*Kd\s+.*(?:[-+]?0*\d*\.?\d*){3,4}$

Samples:

Kd   1.0  0.1   0.0
   Kd   .0  4.   01.
  Kd   .0  4.   01.  01.
 Kd   .0  4.   01. 01. 01.
    Kd   1.0  0.1   0.0  0.0
  Kd   1.0  0.1   0.0

Expected captures:

Question:

What is wrong with my regex that makes it also match lines with more than 4 floats in them?

Upvotes: 2

Views: 30

Answers (1)

Wiktor Stribiżew

Reputation: 627082

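The main problem is the .* part: it greedily matches zero or more characters other than an LF char, so it can swallow every number on the line. Because every element inside the repeated group is optional, the group can then match the empty string three times, and the {3,4} limit never restricts anything. You also need to put \s+ into the repeated group so that whitespace is allowed (and required) between the numeric values.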
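To see the effect concretely, here is a small sketch. It uses Python's re module rather than .NET (an assumption made purely for illustration; the constructs used here behave the same way in both engines). The names group, probe and m are hypothetical, and probe is the pattern from the question with two capturing groups added so the split between .* and the repeated group becomes visible.

```python
import re

# The repeated group from the question: every piece inside it is optional.
group = re.compile(r"(?:[-+]?0*\d*\.?\d*){3,4}")

# The group accepts the empty string, so {3,4} can be satisfied
# without consuming a single character.
print(group.fullmatch("") is not None)

# The question's pattern with capturing groups added (illustration only):
# group 1 is what .* consumed, group 2 is what the repeated group consumed.
probe = re.compile(r"^\s*Kd\s+(.*)((?:[-+]?0*\d*\.?\d*){3,4})$")

# The sample with five numbers that should have been rejected.
m = probe.match(" Kd   .0  4.   01. 01. 01.")
print(repr(m.group(1)))  # everything after "Kd" and the whitespace
print(repr(m.group(2)))  # nothing left for the repeated group
```

All five numbers end up in the .* part, which is why counting them with {3,4} has no effect.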

You can use

^\s*Kd(?:\s+([-+]?(?:\d*\.?\d+|\d+\.\d*))){3,4}$

See the .NET regex demo. Details:

  • ^ - start of string (or start of a line when RegexOptions.Multiline is used)
  • \s* - zero or more whitespaces
  • Kd - a fixed string
  • (?:\s+([-+]?(?:\d*\.?\d+|\d+\.\d*))){3,4} - three to four occurrences of
    • \s+ - one or more whitespaces
    • ([-+]?(?:\d*\.?\d+|\d+\.\d*)) - Group 1:
      • [-+]? - an optional - or +
      • (?:\d*\.?\d+|\d+\.\d*) - either zero or more digits, an optional . and one or more digits (e.g. 10, .5, 1.5), or one or more digits, a . and zero or more digits (e.g. 4.)
  • $ - end of string (or end of a line when RegexOptions.Multiline is used).
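As a rough check of the breakdown above, the following sketch runs the pattern against the samples from the question, one line at a time. It is written in Python rather than .NET (an assumption for illustration; the pattern uses only syntax common to both), and KD_LINE, SAMPLES and parse_kd are hypothetical names. One difference worth noting: Python keeps only the last capture of a repeated group, whereas .NET exposes all of them through Group.Captures, so this sketch validates the line first and then splits it on whitespace.

```python
import re

# Pattern from the answer, unchanged.
KD_LINE = re.compile(r"^\s*Kd(?:\s+([-+]?(?:\d*\.?\d+|\d+\.\d*))){3,4}$")

# Sample lines from the question.
SAMPLES = [
    "Kd   1.0  0.1   0.0",
    "   Kd   .0  4.   01.",
    "  Kd   .0  4.   01.  01.",
    " Kd   .0  4.   01. 01. 01.",
    "    Kd   1.0  0.1   0.0  0.0",
    "  Kd   1.0  0.1   0.0",
]

def parse_kd(line):
    """Return the floats on a valid Kd line, or None if the line is rejected."""
    if KD_LINE.match(line) is None:
        return None
    # The line is known to be "Kd" followed by 3 or 4 numeric tokens,
    # so everything after the first token can be converted directly.
    return [float(token) for token in line.split()[1:]]

for line in SAMPLES:
    print(repr(line), "->", parse_kd(line))
```

Only the line with five numbers is rejected; the other five lines yield three or four floats each.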

Upvotes: 1
