john

Reputation: 175

How do I trim this

http://www.trafficestimate.com/,http://getclicky.com/,http://technotarget.com/find-out-who-is-visiting-your-site-website-traffic-tools/,http://pmetrics.performancing.com/

The above are sample URLs. I want to extract only the domain names from them, for example: trafficestimate.com,getclicky.com,technotarget.com,performancing.com

How can I do this with PHP? I have many more web addresses like these, not only the ones above.

Upvotes: 1

Views: 162

Answers (7)

Bayasaa

Reputation: 1

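<?php
// $input is the comma-separated string of URLs
$input = explode(',', $input);
$urls = array();
foreach ($input as $item) {
    $url = parse_url($item);
    // Read the host from the parsed result, not from the raw string
    $urls[] = $url['host'];
}
?>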

Upvotes: 0

onteria_

Reputation: 70567

Sure, let's see how this can be done. First, we need to break this string of URLs into individual URLs. We can do this with the explode function:

$urls = "http://www.trafficestimate.com/,http://getclicky.com/,http://technotarget.com/find-out-who-is-visiting-your-site-website-traffic-tools/,http://pmetrics.performancing.com/";

$url_array = explode(",", $urls);

What this does is take the URLs you have, and put them into an array by separating them on the comma. Let's see what a sample result looks like:

Array
(
    [0] => http://www.trafficestimate.com/
    [1] => http://getclicky.com/
    [2] => http://technotarget.com/find-out-who-is-visiting-your-site-website-traffic-tools/
    [3] => http://pmetrics.performancing.com/
)

Nifty eh? Now then, the next step is to loop through all the results, which can be done with a simple foreach loop. But before we do, we need someplace to store the result domains. We declare an empty array:

$domains = array();

Now we can loop over the results:

$domains = array();
foreach($url_array as $url) {
  // actions here
}

So, what do we need to do for each result? We need the domain name. PHP actually has a nice function to parse urls called parse_url. The alternative to this is to use more complicated measures, so this works out nicely! Here is our updated code:

$domains = array();
foreach($url_array as $url) {
  $parsed_url = parse_url($url);
}

Now then, let's see what parse_url gives us:

Array
(
    [scheme] => http
    [host] => pmetrics.performancing.com
    [path] => /
)

Notice that host? It's the domain name we're trying to get a hold of. So we'll add that to our array of domains:

$domains = array();
foreach($url_array as $url) {
  $parsed_url = parse_url($url);
  $domains[] = $parsed_url['host'];
}

Now let's see what the result is:

Array
(
    [0] => www.trafficestimate.com
    [1] => getclicky.com
    [2] => technotarget.com
    [3] => pmetrics.performancing.com
)

That's it! $domains now holds all the domain names. If we want to print them separated by commas like above, we can use the implode function to do so:

echo implode(',', $domains);

Which gives us:

www.trafficestimate.com,getclicky.com,technotarget.com,pmetrics.performancing.com

And that's all there is to it! Here is the full code listing for your reference:

$urls = "http://www.trafficestimate.com/,http://getclicky.com/,http://technotarget.com/find-out-who-is-visiting-your-site-website-traffic-tools/,http://pmetrics.performancing.com/";

$url_array = explode(",", $urls);

$domains = array();
foreach($url_array as $url) {
  $parsed_url = parse_url($url);
  $domains[] = $parsed_url['host'];
}

echo implode(',', $domains);
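With a much longer list, some entries may carry stray whitespace or not be URLs at all, in which case parse_url() may not report a host. A minimal sketch of a reusable wrapper, assuming you would rather skip such entries than fail on them; the function name extract_hosts is made up for illustration:

```php
<?php
// Hypothetical helper: returns the host of every URL in a comma-separated list.
function extract_hosts($list)
{
    $hosts = array();
    foreach (explode(',', $list) as $url) {
        // Drop surrounding whitespace, e.g. in "http://a.com/, http://b.com/".
        $host = parse_url(trim($url), PHP_URL_HOST);
        // parse_url() returns null when there is no host and false on
        // seriously malformed input; skip both cases.
        if (is_string($host) && $host !== '') {
            $hosts[] = $host;
        }
    }
    return $hosts;
}

echo implode(',', extract_hosts('http://www.trafficestimate.com/, http://getclicky.com/,not a url'));
// www.trafficestimate.com,getclicky.com
```

Skipping bad entries silently is a design choice; collecting them in a second array for later inspection would work just as well.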

Upvotes: 7

deceze

Reputation: 522500

$urls = 'http://www.trafficestimate.com/,http://getclicky.com/,http://technotarget.com/find-out-who-is-visiting-your-site-website-traffic-tools/,http://pmetrics.performancing.com/';
$hosts = array_map(function ($url) { return parse_url($url, PHP_URL_HOST); }, explode(',', $urls));

var_dump($hosts);

Note that this returns pmetrics.performancing.com, for example, which is the correct result. There's no rule that says only the TLD and the second-level domain are "the domain"; the complete hostname is the domain.
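If you nevertheless want the shorter form the question asks for (performancing.com rather than pmetrics.performancing.com), one naive approach is to keep only the last two labels of the hostname. This is a sketch under the assumption that every domain ends in a single-label suffix such as .com; it gives the wrong answer for suffixes like .co.uk, where a public suffix list would be needed. The function name last_two_labels is hypothetical:

```php
<?php
// Hypothetical helper: keeps the last two dot-separated labels of a hostname.
// Assumes a single-label suffix (.com, .org); wrong for e.g. example.co.uk.
function last_two_labels($host)
{
    $labels = explode('.', $host);
    return implode('.', array_slice($labels, -2));
}

$urls = 'http://www.trafficestimate.com/,http://pmetrics.performancing.com/';
$domains = array_map(function ($url) {
    return last_two_labels(parse_url($url, PHP_URL_HOST));
}, explode(',', $urls));

echo implode(',', $domains);
// trafficestimate.com,performancing.com
```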

Upvotes: 0

AjayR

Reputation: 4179

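Alternatively, you can use this function to get the domain only.

function GetDomain($url)
{
    // ereg_replace() was removed in PHP 7, so use preg_replace() instead.
    // Strip a leading "www." that follows an optional scheme.
    $nowww = preg_replace('#^(https?://)?www\.#i', '$1', $url);
    $domain = parse_url($nowww);
    if (!empty($domain["host"])) {
        return $domain["host"];
    }
    // Without a scheme, parse_url() puts everything in "path".
    return $domain["path"];
}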
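The fallback to the "path" entry matters because parse_url() only reports a host when the string has a scheme (or starts with //). A small sketch illustrating that behavior, using example.com as a placeholder address:

```php
<?php
// With a scheme, the host is reported separately.
$with_scheme = parse_url('http://example.com/page');
// Without one, the whole string ends up in "path" and "host" is absent.
$without_scheme = parse_url('example.com/page');

var_dump(isset($with_scheme['host']));    // bool(true)
var_dump(isset($without_scheme['host'])); // bool(false)
echo $without_scheme['path'], "\n";       // example.com/page
```

Note that in the scheme-less case the returned value still includes any path after the host name, so it may need further trimming.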

Upvotes: 0

Mark

Reputation: 108567

Use parse_url().

Upvotes: 2

AjayR

Reputation: 4179

<?php
// get host name from URL
preg_match("/^(https?:\/\/)?([^\/]+)/i",
    "http://www.example.com/index.html", $matches);
$host = $matches[2];

// get last two segments of host name
preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
echo "domain name is: {$matches[0]}\n";

/* Output is example.com */

?>

Upvotes: 1

Denis de Bernardy

Reputation: 78543

Like so:

$input = explode(',', $input);

and then for each value (note the # delimiters, since the pattern itself contains slashes):

$input[$k] = preg_replace('#^https?://(?:www\.)?#i', '', $input[$k]);
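The per-value step can be sketched as a complete loop. This version uses # as the pattern delimiter so the slashes in the URL need no escaping, and it additionally cuts everything from the first slash onward; that second replacement is an assumption about what you want, not part of the pattern above. The sample input is made up:

```php
<?php
$input = 'http://www.trafficestimate.com/,https://getclicky.com/some/path';
$input = explode(',', $input);

foreach ($input as $k => $value) {
    // Strip the scheme and an optional leading "www.".
    $value = preg_replace('#^https?://(?:www\.)?#i', '', $value);
    // Assumption: drop the path too, keeping only what precedes the first "/".
    $input[$k] = preg_replace('#/.*$#', '', $value);
}

echo implode(',', $input);
// trafficestimate.com,getclicky.com
```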

Upvotes: 2
