Reputation: 1523
I have a string and I want to optionally match a specific pattern as many times as it occurs.
My string:
0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL
After 45 and before $595 there could be up to 6 more numbers. How can I optionally match the repeating numbers in that space?
Here's what I have so far:
/([\d.]+) ([\d.]+) ([\d.]+)? (\d+) (\d+) (\d+) \$(\d+)/ig
Here are some samples with expected outputs:
0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL
output: array([0] => 0.91,
[1] => 0.45,
[2] => 0.69,
[3] => 58,
[4] => 47,
[5] => 45,
[6] => 23,
[7] => 83,
[8] => 90,
[9] => 595)
0.91 0.45 0.69 58 47 45 $595 NO IDL
output: array([0] => 0.91,
[1] => 0.45,
[2] => 0.69,
[3] => 58,
[4] => 47,
[5] => 45,
[6] => 595)
0.91 0.45 0.69 0.63 58 47 45 $595 NO IDL
output: Does not match the pattern, because only the first 3 items may contain decimals.
My pattern seems to split the last number into multiple numbers, and I can't figure out what's going on.
I am using PHP's preg_match for this, so I would like no empty elements in the resulting array if possible. Thanks.
Upvotes: 1
Views: 882
Reputation: 163237
You might first match the numbers up to 45, which is the 6th number, and then repeat matching the numbers that follow it.
Explanation
- (?:\d+\.\d+)(?: \d+\.\d+){2} - Match a number with a decimal part at the start, 3 times in total
- (?: \d+){3} - Match a space followed by digits, 3 times. That will match up to 45
- \s* - Match zero or more whitespace characters
- | - Or
- \G(?!^) - Assert the position at the end of the previous match; the negative lookahead asserts that this is not the start of the string
- (\d+)\s - Capture the digits in a capturing group and match a whitespace character

The full pattern:
(?:\d+\.\d+)(?: \d+\.\d+){2}(?: \d+){3}\s*|\G(?!^)(\d+)\s
For example, this pattern can be used to extract the 3 numbers after 45 (23, 83 and 90).
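A minimal PHP sketch of how the pattern above could be used with preg_match_all. The helper name extraNumbers is hypothetical and not part of the answer. Because the first alternative has no capturing group, preg_match_all reports an empty string for group 1 on that first match, so the sketch filters empty strings out.

```php
<?php
// Pattern from the answer: the first alternative consumes the six leading
// numbers; the \G alternative then captures one following integer per match.
$rx = '~(?:\d+\.\d+)(?: \d+\.\d+){2}(?: \d+){3}\s*|\G(?!^)(\d+)\s~';

// Hypothetical helper: returns the integers that follow the 6th number.
function extraNumbers(string $rx, string $s): array
{
    preg_match_all($rx, $s, $m);
    // Group 1 is empty for the first match (it did not take part), so drop
    // empty strings and reindex the result.
    return array_values(array_filter($m[1], function ($v) {
        return $v !== '';
    }));
}

echo implode(', ', extraNumbers($rx, '0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL')), "\n";
// prints: 23, 83, 90
```

Note that this pattern only collects the numbers between 45 and the price. It does not capture 595 and is not anchored to the start of the string, so it does not by itself reject input with a fourth decimal number.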
Upvotes: 0
Reputation: 626748
You may validate the string with a positive lookahead triggered at the start of the string, and then match all numbers from the start up to the currency value once the validation succeeds:
'~(?:\G(?!^)|^(?=\d+\.\d+ \d+\.\d+ \d+(?:\.\d+)?(?: \d+)* \$\d))\s*\$?\K\d+(?:\.\d+)?~'
Details
- (?:\G(?!^)|^(?=\d+\.\d+ \d+\.\d+ \d+(?:\.\d+)?(?: \d+)* \$\d)) - either the end of the previous match (\G(?!^)) or the start of the string (^) that is followed with
  - \d+\.\d+ - a number with a fractional part (1+ digits, a dot, 1+ digits), then a space
  - \d+\.\d+ - another number with a fractional part, then a space
  - \d+ - 1+ digits
  - (?:\.\d+)? - an optional fractional part
  - (?: \d+)* - 0+ sequences of a space followed with 1+ digits
  - a space
  - \$\d - a $ and a digit.
- \s* - 0+ whitespaces
- \$? - an optional $ char
- \K - match reset operator (the text matched so far is dropped from the returned match)
- \d+(?:\.\d+)? - an int/float number (1+ digits followed with an optional sequence of . and 1+ digits).

PHP demo:

$strs = [
    '0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL',
    '0.91 0.45 0.69 58 47 45 $595 NO IDL',
    '0.91 0.45 0.69 0.63 58 47 45 $595 NO IDL',
];
$rx = '~(?:\G(?!^)|^(?=\d+\.\d+ \d+\.\d+ \d+(?:\.\d+)?(?: \d+)* \$\d))\s*\$?\K\d+(?:\.\d+)?~';
foreach ($strs as $s) {
    echo "$s:\n";
    if (preg_match_all($rx, $s, $matches)) {
        print_r($matches[0]);
        echo "---------\n";
    } else {
        echo "NO MATCH!!!\n---------\n";
    }
}
Output:
0.91 0.45 0.69 58 47 45 23 83 90 $595 NO IDL:
Array
(
[0] => 0.91
[1] => 0.45
[2] => 0.69
[3] => 58
[4] => 47
[5] => 45
[6] => 23
[7] => 83
[8] => 90
[9] => 595
)
---------
0.91 0.45 0.69 58 47 45 $595 NO IDL:
Array
(
[0] => 0.91
[1] => 0.45
[2] => 0.69
[3] => 58
[4] => 47
[5] => 45
[6] => 595
)
---------
0.91 0.45 0.69 0.63 58 47 45 $595 NO IDL:
NO MATCH!!!
---------
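For comparison, the same "validate first, then extract" idea can be sketched as two separate calls instead of a single \G-based pattern. This is an alternative arrangement and not the code from the answer: the function name parseRow is hypothetical, and the sketch assumes the price is the last number of interest. It gives the same results for the three sample strings, but may differ on other input (for instance, a price with a fractional part).

```php
<?php
// Hypothetical helper: returns the list of numbers, or null when the string
// does not have the expected shape.
function parseRow(string $s): ?array
{
    // Step 1: validate the prefix with the same shape the lookahead checks:
    // two floats, an int/float, any number of integers, then "$" and digits.
    if (!preg_match('~^\d+\.\d+ \d+\.\d+ \d+(?:\.\d+)?(?: \d+)* \$\d+~', $s, $m)) {
        return null;
    }
    // Step 2: extract every int/float from the validated prefix only, so
    // nothing after the price can leak into the result.
    preg_match_all('~\d+(?:\.\d+)?~', $m[0], $nums);
    return $nums[0];
}

var_dump(parseRow('0.91 0.45 0.69 0.63 58 47 45 $595 NO IDL'));
// prints: NULL
```

The single-pattern version keeps everything in one preg_match_all call; the two-step version trades that for patterns that are easier to read and to change independently.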
Upvotes: 1