Ky -

Reputation: 32173

How can I detect if a given URL is the current one?

I need to detect if a provided URL matches the one currently navigated to. Mind you the following are all valid, yet semantically equivalent URLs:

The final function must return true if the given URL points back to the current page, or false if it does not. I do not have a list of expected URLs; this will be used for a client who just wants links to be disabled when they link to the current page. Note that I wish to ignore parameters, as these do not indicate the current page on this site. I got as far as using the following regex:


where https?, www, \.example\.com, \/path\/to\/page, and index.php are dynamically detected with $_SERVER["PHP_SELF"] and made into regex form, but that doesn't match the relative URLs like ../../to/page.

EDIT: I got a bit farther with the regex: now I'd just need PHP to dynamically create the regex for any given page.

Upvotes: 0

Views: 654

Answers (4)


Reputation: 46896

First off, there is no way to predict the total list of valid URLs that will result in display of the current page, since you can't predict (or control) external links that might link back to the page. What if someone uses TinyURL or a similar URL-shortening service? A regex will not cut the mustard.

If what you need is to ensure that a link does not result in the same page, then you need to TEST it. Here's a basic concept:

  1. Every page has a unique ID. Call it a serial number. It should be persistent. The serial number should be embedded somewhere predictable (though perhaps invisibly) within the page.

  2. As the page is created, your PHP will need to walk through all the links for each page, visit each one, and determine whether the link resolves to a page with a serial number that matches the calling page's serial number.

  3. If the serial number does not match, display the link as a link. Otherwise, display something else.
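
The comparison in steps 2 and 3 could be sketched roughly as follows. This is only an illustration: the `page-serial` meta tag and the function names `extractSerial` and `isSamePage` are hypothetical, and fetching the linked page's HTML (the expensive part) is left out.

```php
<?php
// Hypothetical convention: every page embeds
// <meta name="page-serial" content="..."> somewhere in its HTML.
function extractSerial(string $html): ?string
{
    if (preg_match('~<meta\s+name="page-serial"\s+content="([^"]+)"~i', $html, $m)) {
        return $m[1];
    }
    return null; // no serial number found in this page
}

// True only when both pages carry the same, non-missing serial number.
function isSamePage(string $currentHtml, string $linkedHtml): bool
{
    $current = extractSerial($currentHtml);
    return $current !== null && $current === extractSerial($linkedHtml);
}
```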

Obviously, this will be an arduous, resource-intensive process for page production. You really don't want to solve your problem this way.

With your "ultimate goal" comment in mind, I suspect your best approach is to be approximate. Here are some strategies...

First option is also the simplest. If you're building a content management system that USUALLY creates links in one format, just support that format. Wikipedia's approach works because a [[link]] is something THEY generate, so THEY know how it's formatted.

Second is more the direction you've gone with your question. The elements of a URL are "protocol", "host", "path" and "query string". You can break them out into a regex, and possibly get it right. You've already stated that you intend to ignore the query string. So ... start with '((https?:)?//(www\.)?example\.com)?' . $_SERVER['SCRIPT_NAME'] and add endings to suit. Other answers are already helping you with this.
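
As a rough sketch of that second option, the pattern can be assembled with `preg_quote()` so that dots in the host and path are escaped automatically. The function name `buildPageRegex`, its parameters, and the optional trailing query string are assumptions for illustration; in practice the arguments would come from `$_SERVER['HTTP_HOST']` and `$_SERVER['SCRIPT_NAME']`.

```php
<?php
// Hypothetical helper: builds a pattern matching the absolute,
// protocol-relative and root-relative forms of a single page.
function buildPageRegex(string $host, string $scriptName): string
{
    // Strip a leading "www." so it can be made optional in the pattern.
    $bareHost = preg_replace('/^www\./i', '', $host);
    $prefix = '((https?:)?//(www\.)?' . preg_quote($bareHost, '~') . ')?';
    // Allow an optional query string at the end, since parameters are ignored.
    return '~^' . $prefix . preg_quote($scriptName, '~') . '(\?.*)?$~i';
}

$regex = buildPageRegex('www.example.com', '/path/to/page/index.php');
var_dump(preg_match($regex, '//example.com/path/to/page/index.php?x=1') === 1);
```

Relative forms such as ../../to/page are not covered by this pattern; it only handles the prefix variations.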

Third option is quite a bit more complex, but gives you more fine-grained control over your test. As with the last option, you have the various URL elements. You can test for the validity of each without using a regex. For example:

$a = array();                                 // init array for valid URLs

// Step through each variation of our path...
foreach([$_SERVER['SCRIPT_NAME'], $_SERVER['REQUEST_URI']] as $path) {

  // Step through each variation of our host...
  foreach ([$_SERVER['HTTP_HOST'], explode(".", $_SERVER['HTTP_HOST'])[0]] as $server) {
    // Step through each variation of our protocol...
    foreach (['https://','http://','//'] as $protocol) {
      // Set the URL as a key.
      $a[ $protocol . $server . $path ] = 1;
    }
  }

  // Also for each path, step through directories and parents...
  $apath=explode('/', $path);                 // turn the path into an array
  unset($apath[0]);                           // strip the empty element before the leading slash
  for( $i = 1; $i <= count($apath); $i++ ) {
    if (strlen($apath[$i])) {
      $a[ str_repeat("../", 1+count($apath)-$i) . implode("/", $apath) ] = 1;
                                              // add relative paths
    }
  }

  $a[ "./" . implode("/", $apath) ] = 1;      // add current directory
}

Then simply test whether the link (minus its query string) is an index within the array. Or adjust to suit; I'm sure you get the idea.
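
A minimal sketch of that final lookup, assuming an array keyed by URL like `$a` above. The function name `isCurrentPageLink` and the sample keys are made up for illustration.

```php
<?php
// Hypothetical helper: $variants is an array keyed by URL, like $a above.
function isCurrentPageLink(string $link, array $variants): bool
{
    // Keep only the part before the first "?" (the query string is ignored).
    $withoutQuery = explode('?', $link, 2)[0];
    return isset($variants[$withoutQuery]);
}

// Sample keys standing in for the generated array.
$variants = [
    '/path/to/page/index.php' => 1,
    '//example.com/path/to/page/index.php' => 1,
];
var_dump(isCurrentPageLink('/path/to/page/index.php?tab=2', $variants));
```

Using the URLs as keys keeps the test a single `isset()` call, however many variations were generated.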

I like this third solution the best.

Upvotes: 2

Ky -

Reputation: 32173

As I have been paid to work on this for the last couple days, I wasn't just sitting around waiting for an answer. I've come up with one that works in my test platform; what does everyone else think? It feels a little bloated, but also feels bulletproof.

The debug echoes are left in, in case you want to print out the intermediate values.

global $debug;$debug = false; // toggle debug echoes and var_dumps

/**
 * Returns a boolean indicating whether the given URL is the current one.
 * @param $otherURL the other URL, as a string. Can be any URL, relative or canonical. Invalid URLs will not match.
 * @return true iff the given URL points to the same place as the current one
 */
function isCurrentURL($otherURL)
{global $debug;

    // BEGIN Parse other URL
    $otherProtocol = parse_url($otherURL);
    $otherHost = isset($otherProtocol["host"]) ? $otherProtocol["host"] : null; // if $otherProtocol["host"] is set and is not null, use it. Else, use null.
    $otherDomain = explode(".", (string) $otherHost);
    $otherSubdomain = array_shift($otherDomain); // subdom only
    $otherDomain = implode(".", $otherDomain); // domain only
    $otherFilepath = isset($otherProtocol["path"]) ? $otherProtocol["path"] : null;
    $otherProtocol = isset($otherProtocol["scheme"]) ? $otherProtocol["scheme"] : null;
    // END Parse other URL

    // BEGIN Get current URL
    #if($debug){echo '$_SERVER == '; var_dump($_SERVER);}
    $thisProtocol = $_SERVER["HTTP_X_FORWARDED_PROTO"]; // http or https
    $thisHost = $_SERVER["HTTP_HOST"]; // subdom or subdom.domain.tld
    $thisDomain = explode(".", $thisHost);
    $thisSubdomain = array_shift($thisDomain); // subdom only
    $thisDomain = implode(".", $thisDomain); // domain only
    if ($thisDomain == "")
        $thisDomain = $otherDomain;
    $thisFilepath = $_SERVER["PHP_SELF"]; // /path/to/file.php
    $thisURL = "$thisProtocol://$thisHost$thisFilepath";
    // END Get current URL

    // This check has to come after $thisURL is built; before that point the variable is undefined.
    if ($thisURL == $otherURL) // unlikely, but possible. Might as well check.
        return true;

    if($debug)echo"Current URL is $thisURL ($thisProtocol, $thisSubdomain, $thisDomain, $thisFilepath).\r\n";
    if($debug)echo"Other URL is $otherURL ($otherProtocol, $otherHost, $otherFilepath).\r\n";

    $thisDomainRegexed = isset($thisDomain) && $thisDomain != null && $thisDomain != "" ? "(\." . str_replace(".","\.",$thisDomain) . ")?" : ""; // prepare domain for insertion into regex
    //                                                                                                      v this makes the last slash before index.php optional
    $regex = "/^(($thisProtocol:)?\/\/$thisSubdomain$thisDomainRegexed)?" . preg_replace('/index\\\..+$/i','?(index\..+)?', str_replace(array(".", "/"), array("\.", "\/"), $thisFilepath)) . '$/i';

    if($debug)echo "\r\nregex is $regex\r\nComparing regex against $otherURL";
    if (preg_match($regex, $otherURL))
    {
        if($debug)echo"\r\n\tIt's a match! Returning true...\r\n}\r\n-->";
        return true;
    }
    else
    {
        if($debug)echo"\r\n\tOther URL is NOT a fully-qualified URL in this subdomain. Checking if it is relative...";
        if($otherURL == $thisFilepath) // somewhat likely
        {
            if($debug)echo"\r\n\t\tOther URL and this filepath are an exact match! Returning true...\r\n}\r\n-->";
            return true;
        }
        else
        {
            if($debug)echo"\r\n\t\tFilepath is not an exact match. Testing against regex...";
            $regex = regexFilepath($thisFilepath);
            if($debug)echo"\r\n\t\tNew Regex is $regex";
            if($debug)echo"\r\n\t\tComparing regex against $otherFilepath...";
            if (preg_match($regex, $otherFilepath))
            {
                if($debug)echo"\r\n\t\t\tIt's a match! Returning true...\r\n}\r\n-->";
                return true;
            }
        }
    }
    if($debug)echo"\r\nI tried my hardest, but couldn't match $otherURL to $thisURL. Returning false...\r\n}\r\n-->";
    return false;
}

/**
 * Uses the given filepath to create a regex that will match it in any of its relative representations.
 * @param $path the filepath to be converted
 * @return a regex that matches all relative forms of the given filepath
 */
function regexFilepath($path)
{global $debug;

    $filepathArray = explode("/", $path);
    if (count($filepathArray) == 0)
        throw new Exception("given parameter not a filepath: $path");
    if ($filepathArray[0] == "") // this can happen if the path starts with a "/"
        array_shift($filepathArray); // strip the first element off the array
    $isIndex = preg_match("/^index\..+$/i", end($filepathArray));
    $filename = array_pop($filepathArray);

    $ret = '';
    foreach($filepathArray as $i)
        $ret = "(\.\.\/$ret$i\/)?"; // make a pseudo-recursive relative filepath
    if($debug)echo "\r\n$ret";
    $ret = preg_replace('/\)\?$/', '?)', $ret); // remove the last '?' and add one before the last '\/'
    if($debug)echo "\r\n$ret";
    $ret = '/^' . ($ret == '' ? '\.\/' : "((\.\/)|$ret)") . ($isIndex ? '(index\..+)?' : str_replace('.', '\.', $filename)) . '$/i'; // if this filepath leads to an index.php (etc.), then that filename is implied and irrelevant.

    return $ret; // the caller passes this straight to preg_match()
}


This seems to match everything I need it to match, and not what I don't need it to.

Upvotes: 0


Reputation: 786289

You can use this approach:

function checkURL($me, $s) {
   $dir = dirname($me) . '/';
   // you may need to refine this
   $s = preg_filter(array('~^//~', '~/$~', '~\?.*$~', '~\.\./~'),
                    array('', '', '', $dir), $s);
   // parse resulting URL
   $url = parse_url($s);
   // match parsed URL's path with self
   return ($url['path'] === $me);
}

// your page's URL with stripped out .php    
$me = str_replace('.php', '', $_SERVER['PHP_SELF']);

// assume this is the URL you are matching against
$s = '../page/';

// compare $me with $s
$ret = checkURL($me, $s);
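
The `../` substitution above rewrites each parent reference textually. As a hedged alternative (a different technique, not the code above), the dot segments can be resolved against the current page's directory before comparing paths. The names `resolvePath` and `isSamePath` are hypothetical; the host is not compared, and treating a trailing index.php as equivalent to its directory is an assumption.

```php
<?php
// Hypothetical helper: resolves a link's path against the current page's path.
function resolvePath(string $currentPath, string $link): string
{
    // Only the path component matters; query string and fragment are dropped.
    $path = (string) parse_url($link, PHP_URL_PATH);
    if ($path === '') {
        $path = $currentPath;         // e.g. "?x=1" refers to the same document
    } elseif ($path[0] !== '/') {
        // Relative link: prepend the directory of the current page.
        $path = substr($currentPath, 0, strrpos($currentPath, '/') + 1) . $path;
    }
    $out = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($out);          // go up one directory
        } elseif ($segment !== '' && $segment !== '.') {
            $out[] = $segment;        // keep ordinary segments
        }
    }
    return '/' . implode('/', $out);
}

function isSamePath(string $currentPath, string $link): bool
{
    if (parse_url($link) === false) {
        return false;                 // seriously malformed URLs never match
    }
    // Assumption: ".../index.php" and ".../" denote the same page.
    $strip = function (string $p): string {
        return rtrim(preg_replace('~/index\.php$~i', '/', $p), '/');
    };
    return $strip(resolvePath($currentPath, $link)) === $strip($currentPath);
}

var_dump(isSamePath('/path/to/page/index.php', '../../to/page'));
```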



Upvotes: 0


Reputation: 76666

A regex isn't actually necessary to strip off all the query parameters. You could use strtok():

$url = strtok($url, '?');

And, to check the output for your URL array:

$url_list = <<<URL    

$urls = explode("\n", $url_list);
foreach ($urls as $url) {
    $url = strtok($url, '?'); // remove everything after ?
    echo $url."\n";
}

As a function (could be improved):

function checkURLMatch($url, $url_array) {
    $url = strtok($url, '?'); // remove everything after ?
    if( in_array($url, $url_array)) {
        // url exists array
        return True;
    } else {
        // url not in array
        return False;
    }
}


Upvotes: 0
