Reputation: 1337
As much as I've tried I can't seem to find the correct regex to locate what I'm after here.
I only want to select the first instance of the url that matches the domain www.myweb.com from the following...
Some text https://www.myweb.com/page/cat/323123442321-rghe432 and then another https://www.adifferentsite.com/fsdhjss/erwr
I need to completely ignore the second url www.adifferentsite.com and only work with the first one that matches www.myweb.com, ignoring any other possible instances of www.myweb.com
Once the first matching domain is discovered I need to store the rest of the url that comes after it...
page/cat/323123442321-rghe432
...into a new variable $newvar, so...
$newvar = 'page/cat/323123442321-rghe432';
I'm trying:
return preg_replace_callback( '/http://www.myweb.com/\/[0-9a-zA-Z]+/', array( __CLASS__, 'my_callback' ), $newvar );
I've read tons of documents on how to detect URLs, but can't find anything about detecting one specific URL.
I really can't grasp how to formulate a regex, so I know this pattern is incorrect. Any help would be greatly appreciated.
EDIT: Edited the question to be a bit more specific and hopefully a bit easier to resolve.
Upvotes: 2
Views: 1650
Reputation: 627410
You can use preg_replace_callback and bind an array by reference into the anonymous function (or your own custom callback function), so that the callback fills it with the part of each matching URL that follows the domain.
Here is a demo:
$rests = array();
$re = '~\b(https?://)www\.myweb\.com/(\S+)~';
$str = "Some text https://www.myweb.com/page/cat/323123442321-rghe432 and then another https://www.adifferentsite.com/fsdhjss/erwr";
echo $result = preg_replace_callback($re, function ($m) use (&$rests) {
    array_push($rests, $m[2]);
    return $m[1] . "embed.myweb.com/" . $m[2];
}, $str) . PHP_EOL;
print_r($rests);
Results:
Some text https://embed.myweb.com/page/cat/323123442321-rghe432 and then another https://www.adifferentsite.com/fsdhjss/erwr
Array
(
    [0] => page/cat/323123442321-rghe432
)
A couple of words:
'~\b(https?://)www\.myweb\.com/(\S+)~' uses ~ as the regex delimiter, so you do not have to escape /. The pattern contains two capturing groups: (https?://), preceded by a \b word boundary so that http or https is matched as a whole word followed by ://, and (\S+), which matches 1 or more non-whitespace characters. Capturing groups are marked with (...) in the pattern and can be accessed in the callback via $m[n] (the matches array), where n is the number of the capturing group.
UPDATE
If you only need to replace the first occurrence of the URL, pass the limit argument to preg_replace_callback:
$rest = "";
$re = '~\b(https?://)www\.myweb\.com/(\S+\b)~';
$str = "Some text https://www.myweb.com/page/cat/323123442321-rghe432, another http://www.myweb.com/page/cat/323123442321-rghe432 and then another https://www.adifferentsite.com/fsdhjss/erwr";
echo $result = preg_replace_callback($re, function ($m) use (&$rest) {
    $rest = $m[2];
    return $m[1] . "embed.myweb.com/" . $m[2];
}, $str, 1) . PHP_EOL;
// The 4th argument (1) is the limit: only the first match is replaced.
echo $rest;
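If no replacement is needed and the goal is only to store the rest of the first matching URL in $newvar, preg_match may be enough, since it stops at the first match. The following is a minimal sketch under that assumption; the second www.myweb.com URL in the sample string and the choice to leave $newvar as null when nothing matches are illustrative, not taken from the question.

```php
<?php
// Sample text from the question, extended with a second www.myweb.com URL
// (the second URL is made up for illustration).
$str = "Some text https://www.myweb.com/page/cat/323123442321-rghe432, another "
     . "http://www.myweb.com/other/page and then another "
     . "https://www.adifferentsite.com/fsdhjss/erwr";

// preg_quote() escapes the dots in the domain; '~' is passed as the delimiter
// so it would be escaped too if it ever appeared in the domain.
$domain = preg_quote('www.myweb.com', '~');
$re = '~\bhttps?://' . $domain . '/(\S+\b)~';

// preg_match() stops at the first match, so later URLs are never examined.
$newvar = null; // assumed convention: stays null when the domain is absent
if (preg_match($re, $str, $m) === 1) {
    $newvar = $m[1]; // group 1: everything after "www.myweb.com/"
}

echo $newvar, PHP_EOL;
```

The trailing \b inside the group keeps punctuation such as the comma after the first URL out of the captured value.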
Upvotes: 2