CSchulz

Reputation: 11020

Correct grouping with regex

I have a regex that matches a list of commands. I don't know what kind of parameter follows each command: it can be a string, a number, or nothing.
It is also possible that the command itself is one I don't know.

In my first version there were no string parameters, so (abc|def|[a-z]+)([0-9]*) worked fine. Now I want to allow strings as well, but (abc|def|[a-z]+)([0-9]*|[a-z]*) doesn't work.

String 1: abc20def20ghi20
String 2: abcdddef20ghi20
String 3: abcdddef2d0ghi20abcdd

String 1:
Example with regex 1: abc20***def20***ghi20
Example with regex 2: abc20***def20***ghi20

String 2:
Example with regex 1: abc***dddef20***ghi20
Example with regex 2: abc***dddef20***ghi20

I want to get the following results: abc20***def20***ghi20 for string 1 and abcdd***def20***ghi20 for string 2.
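For anyone who wants to reproduce the problem, here is a minimal sketch that runs the second regex against string 2. The variable names are only illustrative. A likely reason for the result is that `[0-9]*` is allowed to match the empty string, so after `abc` the `[a-z]*` alternative is never tried, and the greedy `[a-z]+` then swallows `dddef` in one piece.

```php
<?php
// Regex 2 from the question, applied to string 2.
$regex2 = '/(abc|def|[a-z]+)([0-9]*|[a-z]*)/';

preg_match_all($regex2, 'abcdddef20ghi20', $m);

// Join the full matches with *** as in the examples above.
echo implode('***', $m[0]), "\n"; // abc***dddef20***ghi20
```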

Thanks for your help.

Upvotes: 0

Views: 90

Answers (2)

reko_t

Reputation: 56430

Based on your latest comment, maybe this will do the trick for you:

(abc|def)(\d+|(?:(?!(?1))[a-z])+)?|((?:(?!(?1))[a-z])+)((?2))?
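The pattern is dense, so here is the same expression spread over several lines with the `x` (extended) flag, which makes PCRE ignore whitespace and `#` comments inside the pattern. The comments are my reading of what each part does, and the `~` delimiter is just a choice that avoids clashing with the `#` comments.

```php
<?php
// Same pattern as above, annotated using the x flag.
$r = '~
    (abc|def)                # group 1: a known command
    (                        # group 2: optional parameter, either
        \d+                  #   digits, or
      | (?:(?!(?1))[a-z])+   #   letters, stopping before the next known command
    )?
  |
    ((?:(?!(?1))[a-z])+)     # group 3: unknown command, letters that do not start a known one
    ((?2))?                  # group 4: optional parameter, reusing group 2 through (?2)
~x';

preg_match_all($r, 'abcdddef20ghi20', $m);
print_r($m[0]); // abcdd, def20, ghi20
```

The `(?1)` and `(?2)` calls re-run the sub-patterns of groups 1 and 2, so the list of known commands only has to be written once.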

EDIT: Oops, I meant to edit my previous answer instead of posting a new one.

TEST CASE:

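<?php

// Known commands (abc, def) take an optional parameter: digits, or letters
// that stop before the next known command. Anything else is an unknown
// command; (?1) and (?2) reuse the sub-patterns of groups 1 and 2.
$r = '#(abc|def)(\d+|(?:(?!(?1))[a-z])+)?|((?:(?!(?1))[a-z])+)((?2))?#';
$s1 = 'abc20def20ghi20';
$s2 = 'abcdddef20ghi20';
$s3 = 'abcdddef2d0ghi20abcdd';

// Collect every match in each test string.
preg_match_all($r, $s1, $m1);
preg_match_all($r, $s2, $m2);
preg_match_all($r, $s3, $m3);
// Index 0 holds the full matches (command plus parameter).
var_dump($m1[0], $m2[0], $m3[0]);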

Output:

array(3) {
  [0]=>
  string(5) "abc20"
  [1]=>
  string(5) "def20"
  [2]=>
  string(5) "ghi20"
}
array(3) {
  [0]=>
  string(5) "abcdd"
  [1]=>
  string(5) "def20"
  [2]=>
  string(5) "ghi20"
}
array(5) {
  [0]=>
  string(5) "abcdd"
  [1]=>
  string(4) "def2"
  [2]=>
  string(2) "d0"
  [3]=>
  string(5) "ghi20"
  [4]=>
  string(5) "abcdd"
}

As you can see, it splits all three strings into command/parameter parts, and the first two come out exactly as you asked.
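If you need the command and its parameter separately rather than the whole match, something like the following might work. It is only a sketch: `splitCommands` is a hypothetical helper name, and it relies on known commands landing in groups 1 and 2 while unknown ones land in groups 3 and 4.

```php
<?php
$r = '#(abc|def)(\d+|(?:(?!(?1))[a-z])+)?|((?:(?!(?1))[a-z])+)((?2))?#';

// Hypothetical helper: returns a list of [command, parameter] pairs.
function splitCommands(string $pattern, string $input): array
{
    preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);
    $pairs = [];
    foreach ($matches as $m) {
        // Group 1 is non-empty only when a known command matched.
        $known   = ($m[1] ?? '') !== '';
        $command = $known ? $m[1] : ($m[3] ?? '');
        $param   = $known ? ($m[2] ?? '') : ($m[4] ?? '');
        $pairs[] = [$command, $param];
    }
    return $pairs;
}

foreach (splitCommands($r, 'abcdddef20ghi20') as [$command, $param]) {
    echo $command, ' => ', $param, "\n";
}
// abc => dd
// def => 20
// ghi => 20
```

The `??` fallbacks are there because `preg_match_all` drops trailing groups that did not take part in a match.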

Upvotes: 1

reko_t

Reputation: 56430

Do you always want to capture strings whose length is 5? If so, you can do this:

([a-z]{3})([0-9a-z]{2})

If not, can you clarify exactly what the criterion is for cutting the string between "abcdd" and "def20"?
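A quick way to try the fixed-length pattern, assuming every command really is three letters followed by a two-character parameter (the variable names are only illustrative):

```php
<?php
// Three letters for the command, two letters or digits for the parameter.
$fixed = '/([a-z]{3})([0-9a-z]{2})/';

preg_match_all($fixed, 'abcdddef20ghi20', $m);
print_r($m[1]); // abc, def, ghi
print_r($m[2]); // dd, 20, 20

// With a longer parameter the fixed length no longer fits:
// "def2d" is taken as one chunk and the following "0" is skipped.
preg_match_all($fixed, 'abcdddef2d0ghi20abcdd', $m3);
print_r($m3[0]); // abcdd, def2d, ghi20, abcdd
```

The second call shows why this only works when the length assumption holds.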

Upvotes: 0
