x_maras

Reputation: 2217

preg_replace: how to change part of a URI

I am trying to change all the links in an HTML document with PHP's preg_replace. All the URIs have the following form:

http://example.com/page/58977?forum=60534#comment-60534

I want to change it to:

http://example.com/60534

which means removing everything after "page" and before "comment-", including these two strings.

I tried the following, but it returns no changes:

$result = preg_replace("/^.page.*.comment-.$/", "", $html);

It seems that my regex syntax is not correct, since the HTML comes back unchanged. Could you please help me with this?

Upvotes: 0

Views: 380

Answers (3)

Muthu Kumaran

Reputation: 17910

An alternative way that does not use a regular expression.

It uses parse_url() to split the URL and parse_str() to read the query string:

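<?php
    $url = 'http://example.com/page/58977?forum=60534#comment-60534';
    // Split the URL into its components (scheme, host, path, query, fragment)
    $array = parse_url($url);
    // Turn the query string "forum=60534" into array('forum' => '60534')
    parse_str(isset($array['query']) ? $array['query'] : '', $query);
    // Prepend the scheme only when the URL actually has one
    $http = isset($array['scheme']) ? $array['scheme'].'://' : '';
    echo $http.$array['host'].'/'.$query['forum'];  // http://example.com/60534
?>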

Demo: http://codepad.org/xB3kO588
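The snippet above handles a single URL, while the question asks about every link in a block of HTML. One possible way to bridge that gap, as a rough sketch only, is to let preg_replace_callback() find each URL and hand it to parse_url(). The function name rewriteLinks and the URL-matching pattern are assumptions made for illustration, and the code assumes the `forum` query parameter always carries the wanted number, as in the sample URL.

```php
<?php
// Hypothetical helper (not from the original answer): rewrite every
// http/https URL found in $html to scheme://host/<forum id>.
function rewriteLinks(string $html): string
{
    $result = preg_replace_callback(
        // Assumption: a URL ends at whitespace, a quote, or an angle bracket.
        '~https?://[^\s"\'<>]+~',
        function (array $m): string {
            $parts = parse_url($m[0]);
            // Leave the URL untouched if it has no host or query string.
            if ($parts === false || !isset($parts['scheme'], $parts['host'], $parts['query'])) {
                return $m[0];
            }
            parse_str($parts['query'], $query);
            // Leave the URL untouched if there is no plain "forum" parameter.
            if (!isset($query['forum']) || !is_string($query['forum'])) {
                return $m[0];
            }
            return $parts['scheme'] . '://' . $parts['host'] . '/' . $query['forum'];
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$html = '<a href="http://example.com/page/58977?forum=60534#comment-60534">x</a>';
echo rewriteLinks($html), "\n";
```

URLs without a `forum` parameter pass through unchanged, so unrelated links in the same HTML are not affected.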

Upvotes: 0

Dan K.K.

Reputation: 6094

You probably simply need this: http://php.net/manual/en/function.parse-url.php This function parses a URL and returns an associative array containing any of the various components of the URL that are present.
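To make that concrete, here is a minimal sketch of what parse_url() reports for the URL in the question and how the short form could be rebuilt from those parts. The helper name shortCommentUrl is hypothetical, and the code assumes the fragment always has the form `comment-<digits>`, as in the sample URL.

```php
<?php
// Hypothetical helper: rebuild the short URL from the parts parse_url() returns.
function shortCommentUrl(string $url): ?string
{
    $parts = parse_url($url);
    // parse_url() returns false for seriously malformed URLs and
    // simply omits components that are not present.
    if ($parts === false || !isset($parts['scheme'], $parts['host'], $parts['fragment'])) {
        return null;
    }
    // Assumption: the fragment looks like "comment-<digits>".
    if (!preg_match('/^comment-(\d+)$/', $parts['fragment'], $m)) {
        return null;
    }
    return $parts['scheme'] . '://' . $parts['host'] . '/' . $m[1];
}

$url = 'http://example.com/page/58977?forum=60534#comment-60534';

// Components present for this URL: scheme, host, path, query, fragment.
print_r(parse_url($url));

echo shortCommentUrl($url), "\n";
```

Returning null for URLs that do not fit the expected shape lets the caller decide whether to keep the original link.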

Upvotes: 2

Mark Byers

Reputation: 838336

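The ^ is an anchor that matches only at the start of the string, and $ matches only at the end, so your pattern has to match the entire HTML from beginning to end. In addition, each . matches exactly one character, so ^.page requires "page" to start at the second character of the string. To match a URL anywhere in the text, do not anchor the regular expression: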

$result = preg_replace("/page.*?comment-/", "", $html);   

Note that this can also match text that is not a URL. You may want to be more specific about what gets replaced; for example, you might replace only links that start with http: or https: and that contain no whitespace.
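A stricter pattern along those lines might look like the sketch below. The function name shortenCommentLinks is hypothetical, and the pattern assumes every link has the `/page/...#comment-<digits>` shape from the question; it is one possible tightening, not the only one.

```php
<?php
// Hypothetical helper: only rewrite http/https URLs that contain no
// whitespace and follow the /page/...#comment-<digits> shape.
function shortenCommentLinks(string $html): string
{
    // $1 = scheme and host with trailing slash, $2 = the comment number.
    // \S*? keeps the match inside a single whitespace-free token.
    $pattern = '~(https?://[^\s/]+/)page/\S*?#comment-(\d+)~';

    // preg_replace() returns null on a regex error; fall back to the input.
    return preg_replace($pattern, '$1$2', $html) ?? $html;
}

$html = '<a href="http://example.com/page/58977?forum=60534#comment-60534">link</a> page one, comment-2';
echo shortenCommentLinks($html), "\n";
```

With this input the unanchored pattern /page.*?comment-/ would also strip "page one, comment-" from the trailing text, whereas the stricter pattern leaves it alone and changes only the link.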

Upvotes: 6
