itsliamoco

Reputation: 1048

Reduce URL strings with no duplicates

I have an array that looks like the following...

$urls = array(
    "http://www.google.com",
    "http://www.google.com/maps",
    "http://www.google.com/mail",
    "https://drive.google.com",
    "https://www.youtube.com",
    "https://www.youtube.com/feed/subscriptions",
    "https://www.facebook.com/me",
    "https://www.facebook.com/me/friends"
);

I find this hard to explain, but I want to reduce this array to only the shortest URL for each host, with no duplicates, so that it looks like this...

$urls = array(
    "http://www.google.com",
    "https://drive.google.com",
    "https://www.youtube.com",
    "https://www.facebook.com/me"
);

Notice that the last URL in the second array still has its path. This is because I still want to show the lowest-level path for each host.

Upvotes: 0

Views: 88

Answers (3)

AD7six

Reputation: 66168

Just sort the array in reverse order, and create an array indexed by host:

$urls = array(
    "http://www.google.com",
    "http://www.google.com/maps",
    "http://www.google.com/mail",
    "https://drive.google.com",
    "https://www.youtube.com",
    "https://www.youtube.com/feed/subscriptions",
    "https://www.facebook.com/me",
    "https://www.facebook.com/me/friends"
);

rsort($urls);

$return = [];
foreach ($urls as $url) {
    $host = parse_url($url, PHP_URL_HOST);
    $return[$host] = $url;
}
$return = array_values($return); // To remove array keys, if desired.

The reverse-ordered urls array would be:

Array
(
    [0] => https://www.youtube.com/feed/subscriptions
    [1] => https://www.youtube.com
    [2] => https://www.facebook.com/me/friends
    [3] => https://www.facebook.com/me
    [4] => https://drive.google.com
    [5] => http://www.google.com/maps
    [6] => http://www.google.com/mail
    [7] => http://www.google.com
)

Since the last entry (per host name) in the sorted array is the one that you want, and each assignment deliberately clobbers any existing value for that host, $return (before the array_values() call) would contain:

Array
(
    [www.youtube.com] => https://www.youtube.com
    [www.facebook.com] => https://www.facebook.com/me
    [drive.google.com] => https://drive.google.com
    [www.google.com] => http://www.google.com
)
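For reuse, the same steps can be wrapped in a function. This is only a sketch: the name lowestUrlPerHost() is made up for illustration, and the is_string() guard for entries without a parsable host is an addition that is not in the snippet above. Note that keying by host keeps exactly one URL per host, so two unrelated paths on the same host (or the same host under both http and https) would collapse into a single entry.

```php
<?php
// Hypothetical helper name; the logic follows the rsort + host-keyed array idea.
function lowestUrlPerHost(array $urls): array
{
    // Descending string order: a URL that is a prefix of another sorts after it.
    rsort($urls, SORT_STRING);

    $byHost = [];
    foreach ($urls as $url) {
        $host = parse_url($url, PHP_URL_HOST);
        if (!is_string($host)) {
            continue; // assumption: silently skip entries without a host
        }
        $byHost[$host] = $url; // later entries overwrite earlier ones for the same host
    }

    return array_values($byHost); // drop the host keys
}

$reduced = lowestUrlPerHost([
    "http://www.google.com",
    "http://www.google.com/maps",
    "https://www.facebook.com/me",
    "https://www.facebook.com/me/friends",
]);
print_r($reduced);
```

The result is ordered by the first appearance of each host in the reverse-sorted list, not by the original input order.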

Upvotes: 1

davcs86

Reputation: 3935

Based on @Tim's answer

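// Reduce each URL to scheme + host (the path is dropped)
foreach ($urls as &$url) {
    $url_parts = parse_url($url);
    $url = $url_parts["scheme"]."://".$url_parts["host"];
}
unset($url); // break the reference so later use of $url cannot modify the array

$urls = array_unique($urls);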
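As a rough illustration of what this approach yields, here is the same idea without the by-reference loop. The function name uniqueOrigins() is hypothetical, and the sketch assumes every entry is an absolute URL with both a scheme and a host. Two things are worth noticing: array_unique() preserves the original keys, so array_values() is used to reindex, and because the path is discarded, "https://www.facebook.com/me" comes back as "https://www.facebook.com", which differs from the output asked for in the question.

```php
<?php
// Hypothetical helper name; mirrors the scheme + host reduction.
// Assumes every entry is an absolute URL (scheme and host both present).
function uniqueOrigins(array $urls): array
{
    $origins = array_map(function (string $url): string {
        $parts = parse_url($url);
        return $parts['scheme'] . '://' . $parts['host'];
    }, $urls);

    // array_unique() keeps the first occurrence and its original key,
    // so reindex to get a plain 0..n-1 list.
    return array_values(array_unique($origins));
}

print_r(uniqueOrigins([
    "http://www.google.com",
    "http://www.google.com/maps",
    "https://www.facebook.com/me",
    "https://www.facebook.com/me/friends",
]));
```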

Upvotes: 3

hamed

Reputation: 8033

Try this:

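// Assumes each base URL appears in $urls before the longer URLs that extend it.
$result = array();
array_push($result, $urls[0]);
for ($i = 1; $i < count($urls); $i++)
{
    $repeat = false;
    foreach ($result as $res)
    {
        // strpos() returns 0 for a match at the start, so compare strictly
        if (strpos($urls[$i], $res) === 0)
        {
            $repeat = true;
            break;
        }
    }
    if (!$repeat)
        array_push($result, $urls[$i]);
}

return $result; // assumes this code sits inside a function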
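The prefix check above relies on the input order: a base URL has to be seen before the longer URLs that start with it. A possible variant, sketched below under a hypothetical name dropNestedUrls(), sorts the list first and only treats a URL as nested when the base is followed by a "/". Both of those choices are assumptions added here, not part of the snippet above.

```php
<?php
// Hypothetical helper; keeps only URLs that are not nested under another kept URL.
function dropNestedUrls(array $urls): array
{
    // Ascending string order: a prefix always sorts before the URLs that extend it.
    sort($urls, SORT_STRING);

    $kept = [];
    foreach ($urls as $url) {
        $nested = false;
        foreach ($kept as $base) {
            // Require a "/" right after the base, so "http://a.com"
            // does not swallow "http://a.com.example.org".
            if ($url === $base || strpos($url, rtrim($base, '/') . '/') === 0) {
                $nested = true;
                break;
            }
        }
        if (!$nested) {
            $kept[] = $url;
        }
    }

    return $kept;
}

print_r(dropNestedUrls([
    "https://www.facebook.com/me/friends",
    "https://www.facebook.com/me",
    "http://www.google.com/maps",
    "http://www.google.com",
]));
```

Because of the sort, the result comes back in alphabetical order rather than in the order of the input array.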

Upvotes: 0
