Reputation: 195
The regex validation can receive strings like the following:
t/E/s/t
t/E/s/t/
t/E/s/t/////...
t/E/s/t/////?page=10
t/E/s/t/////?page=10/
t/E/s/t/////?page=10////...
I need to split each string into these parts:
1. t/E/s/t
2. ?page=10////...
I wrote this regex: ^(.*[^\/])\/+(\?.*)$
The problem is that it does not match if the string does not contain the "?page=10///..." part. To validate a string without the "?page..." part, I need a second regex: ^(.*[^\/])\/+$
I want to have only one validation rule.
Any ideas how to combine them?
Upvotes: 0
Views: 81
Reputation: 3454
It would be nice if something like /(.*[^\/])\/*(\?.*)?/ worked. But the problem is that the regex engine will find the best possible (here: longest) match for (.*[^\/])\/* first, so the greedy .* swallows the ?page=10 part, even if this means matching (\?.*)? against the empty string.*
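To see this greedy behaviour concretely, here is a minimal PHP sketch. PHP is assumed only because it is used elsewhere in this thread, and the helper name naive_split is made up for illustration. It runs the optional-group pattern and returns what each group captured:

```php
<?php
// Hypothetical helper: applies the "would be nice" pattern and returns both groups.
function naive_split(string $s): array {
    preg_match('/(.*[^\/])\/*(\?.*)?/', $s, $m);
    // PHP omits trailing groups that did not participate in the match, hence the ?? ''.
    return [$m[1] ?? '', $m[2] ?? ''];
}

// Group 1 greedily runs up to the last non-slash character of the string,
// so the optional group 2 is left with nothing to capture.
var_dump(naive_split('t/E/s/t/////?page=10////'));
```

With this input the first element should come back as t/E/s/t/////?page=10 and the second as an empty string, which is exactly the unwanted split described above.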
You could do the following:
/(.*[^\/])\/*(\?.*)|(.*[^\/])/
This is slightly unsatisfactory in that you get 3 capture groups even though you only wanted 2. So you could do this instead, if (the version of) the language you're using allows the (?|...) construct:
/(?|(.*[^\/])\/*(\?.*)|(.*[^\/]))/
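As a rough check of the (?|...) version, here is a sketch in PHP, whose PCRE engine supports this construct. The function name split_path_and_query is an assumption for illustration, not something from the question:

```php
<?php
// Hypothetical helper name; the pattern is the (?|...) version shown above.
function split_path_and_query(string $s): array {
    $pattern = '/(?|(.*[^\/])\/*(\?.*)|(.*[^\/]))/';
    if (!preg_match($pattern, $s, $m)) {
        return ['', '']; // nothing but slashes, or an empty string
    }
    // Group 2 is absent when the second alternative matched, hence the ?? ''.
    return [$m[1], $m[2] ?? ''];
}

$samples = ['t/E/s/t', 't/E/s/t/', 't/E/s/t/////?page=10', 't/E/s/t/////?page=10////'];
foreach ($samples as $s) {
    [$path, $query] = split_path_and_query($s);
    echo $path, ' | ', $query, PHP_EOL;
}
```

Both alternatives number their first group as 1, so the calling code only has to look at groups 1 and 2 regardless of which branch matched.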
*More generally, suppose the regex engine is faced with a regex /AB/. The match it returns will contain the best possible match for /A/ (by which I mean the best match that can actually be extended to a match for /AB/). To put it another way, it doesn't backtrack into A until it's finished searching for matches for B.
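A toy illustration of this footnote, sketched in PHP. The patterns and the input aab are invented for the example and are not from the question. Here A is (a*) and B is (ab):

```php
<?php
// Optional B: A keeps its greedy match and B simply matches nothing.
preg_match('/(a*)(ab)?/', 'aab', $optional);

// Mandatory B: the engine has to backtrack into A so that B can match.
preg_match('/(a*)(ab)/', 'aab', $mandatory);

var_dump($optional[1]);   // what A captured when B was optional
var_dump($mandatory[1]);  // what A captured when B was required
```

When B is optional, A should keep both a characters. When B is required, A should give one of them back so that B can match ab.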
Upvotes: 1
Reputation: 41848
Is this what you are looking for?
<?php
$strings = array(
"t/E/s/t",
"t/E/s/t/",
"t/E/s/t/////...",
"t/E/s/t/////?page=10",
"t/E/s/t/////?page=10/",
"t/E/s/t/////?page=10////...");
// Split on the run of slashes that directly follows the literal prefix t/E/s/t.
$regex = '~(?<=t/E/s/t)/+~';
foreach($strings as $str) {
print_r(preg_split($regex,$str));
echo "<br />";
}
Output:
Array ( [0] => t/E/s/t )
Array ( [0] => t/E/s/t [1] => )
Array ( [0] => t/E/s/t [1] => ... )
Array ( [0] => t/E/s/t [1] => ?page=10 )
Array ( [0] => t/E/s/t [1] => ?page=10/ )
Array ( [0] => t/E/s/t [1] => ?page=10////... )
Upvotes: 0
Reputation: 20486
As a quick side note, I used ~ instead of / for delimiters so the / characters don't need to be escaped. Also, I used a character class for the question mark ([?]) instead of having to escape it (\?)...this is just personal preference for readability.
First we capture the literal string t/E/s/t. Then we match 0+ /s (if there needs to be a / in between t/E/s/t and ?, then change the * to + for 1+). Finally we capture the question mark followed by the rest of the line ([?].*). This is made optional with the trailing ?, so that if your string does not have the ?page=10 it will still be matched with an empty second capture.
~(t/E/s/t)/*([?].*)?~
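A minimal sketch of how this pattern might be used from PHP. The wrapper name split_fixed_prefix is hypothetical, and returning null for non-matching input is an assumption of this example:

```php
<?php
// Hypothetical wrapper around the pattern above; the prefix t/E/s/t is hard-coded in it.
function split_fixed_prefix(string $s): ?array {
    if (!preg_match('~(t/E/s/t)/*([?].*)?~', $s, $m)) {
        return null; // the string does not contain the literal prefix
    }
    // PHP leaves out the second group when it did not participate, hence the ?? ''.
    return [$m[1], $m[2] ?? ''];
}

var_dump(split_fixed_prefix('t/E/s/t'));
var_dump(split_fixed_prefix('t/E/s/t/////?page=10/'));
```

Note that in PHP the "empty second capture" shows up as a missing array element rather than an empty string, which is why the sketch falls back to '' explicitly.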
Upvotes: 0