Reputation: 2217
I am trying to change all the links in an HTML document with PHP's preg_replace. All the URIs have the following form:
http://example.com/page/58977?forum=60534#comment-60534
I want to change it to:
http://example.com/60534
which means removing everything after "page" and before "comment-", including these two strings.
I tried the following:
$result = preg_replace("/^.page.*.comment-.$/", "", $html);
but my regex does not seem to be correct, because it returns the HTML unchanged. Could you please help me with this?
Upvotes: 0
Views: 380
Reputation: 17910
An alternative that avoids regular expressions, using parse_url() and parse_str():
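<?php
$url = 'http://example.com/page/58977?forum=60534#comment-60534';
// Split the URL into its components (scheme, host, path, query, fragment).
$array = parse_url($url);
// Decode the query string ("forum=60534") into the $query array.
parse_str($array['query'], $query);
// Prepend the scheme only if the URL actually has one.
$http = !empty($array['scheme']) ? $array['scheme'].'://' : '';
echo $http.$array['host'].'/'.$query['forum'];
?>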
Demo: http://codepad.org/xB3kO588
Upvotes: 0
Reputation: 6094
You probably just need parse_url(): http://php.net/manual/en/function.parse-url.php This function parses a URL and returns an associative array containing whichever components of the URL are present.
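To illustrate what that associative array looks like for the URL in the question, here is a minimal sketch. It assumes the fragment always has the form "comment-<id>"; the variable names ($parts, $prefix, $newUrl) are only illustrative.

```php
<?php
$url = 'http://example.com/page/58977?forum=60534#comment-60534';

// parse_url() returns the components that are present in the URL:
// scheme => http, host => example.com, path => /page/58977,
// query => forum=60534, fragment => comment-60534
$parts = parse_url($url);

// Assumption: the fragment starts with "comment-" followed by the id.
$prefix = 'comment-';
$id = '';
if (isset($parts['fragment']) && strpos($parts['fragment'], $prefix) === 0) {
    $id = substr($parts['fragment'], strlen($prefix));
}

// Rebuild the short URL from scheme, host and the extracted id.
$newUrl = $parts['scheme'] . '://' . $parts['host'] . '/' . $id;
echo $newUrl, "\n";
```

This handles a single URL; to rewrite every link in an HTML document you would still need to locate the URLs first.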
Upvotes: 2
Reputation: 838336
The ^ is an anchor that only matches the start of the string, and $ only matches at the end. In order to match you should not anchor the regular expression:
$result = preg_replace("/page.*?comment-/", "", $html);
Note that this could match things that are not URLs. You may want to be more specific about what will be replaced; for example, you might want to replace only links that start with either http: or https: and that don't contain whitespace.
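One way to be more specific is sketched below. It assumes every link has exactly the shape shown in the question (/page/<digits>?forum=<digits>#comment-<digits>); the sample $html string and its second link are made up for illustration.

```php
<?php
// Sample input: one link in the question's format and one unrelated link.
$html = '<a href="http://example.com/page/58977?forum=60534#comment-60534">a</a> '
      . '<a href="https://example.com/about">b</a>';

// Group 1: scheme and host (http or https, no whitespace, quotes or angle brackets).
// Group 2: the numeric id that follows "comment-".
// "~" is used as the delimiter so the slashes need no escaping.
$pattern = '~(https?://[^/\s"\'<>]+)/page/\d+\?forum=\d+#comment-(\d+)~';

// Keep only the scheme/host and the comment id.
$result = preg_replace($pattern, '$1/$2', $html);

echo $result, "\n";
```

Because the pattern requires the full URL shape, the unrelated link is left alone, and text that merely contains the words "page" and "comment-" is not touched.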
Upvotes: 6