mathew

Reputation: 1200

How do I parse the Visitors by Country info from Alexa?

If you search Alexa for any URL, you get detailed traffic information for that site. I would like to parse the Visitors by Country info from that page.

Example for google.com:

The URL is http://www.alexa.com/siteinfo/google.com.

On the Audience tab you can see:

Visitors by Country for Google.com

United States 35.0%

India 8.8%

China 4.1%

Germany 3.4%

United Kingdom 3.2%

Brazil 3.2%

Iran 2.8%

Japan 2.1%

Russia 2.0%

Italy 1.9%

Indonesia 1.7% //etc.

How can I get only this information from alexa.com? I have tried the preg_match function, but it is very difficult in this case.

Upvotes: 1

Views: 1055

Answers (1)

Narcis Radu

Reputation: 2547

If you don't want to use DOM and getElementById, which is the most elegant solution in this case, you can try a regular expression:

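// Fetch the page HTML (requires allow_url_fopen to be enabled).
$data = file_get_contents('http://www.alexa.com/siteinfo/google.com');

// Match every country link; the U modifier makes the quantifiers ungreedy.
// With PREG_SET_ORDER, each entry of $result holds one match and its groups.
preg_match_all(
   '/<a href="\/topsites\/countries\/(.*)">(.*)<\/a>/mU',
   $data,
   $result,
   PREG_SET_ORDER
);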
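
To read the matches: with PREG_SET_ORDER, each entry of $result is one match, where index 1 holds the first capture group and index 2 the second. The sketch below runs the same pattern against a small hard-coded sample instead of the live page. The sample markup and the $names variable are assumptions for illustration, not Alexa's actual HTML.

```php
<?php
// Hypothetical sample markup; the real Alexa page may differ.
$data = <<<HTML
<div><a href="/topsites/countries/US">United States</a> 35.0%</div>
<div><a href="/topsites/countries/IN">India</a> 8.8%</div>
HTML;

preg_match_all(
    '/<a href="\/topsites\/countries\/(.*)">(.*)<\/a>/mU',
    $data,
    $result,
    PREG_SET_ORDER
);

// Map the URL segment (group 1) to the link text (group 2).
$names = array();
foreach ($result as $match) {
    $names[$match[1]] = $match[2];
}

print_r($names);
```

Note that this pattern captures only the link itself, not the percentage that follows it, so the percentage would still need a separate step.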

The DOM solution looks like:

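// Collect warnings from malformed real-world HTML instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument;
$doc->loadHTMLFile('http://www.alexa.com/siteinfo/google.com');

// Container element that holds the "Visitors by Country" rows.
$data = $doc->getElementById('visitors-by-country');

$countries = array();
if ($data !== null)
{
    foreach ($data->getElementsByTagName('div') as $node)
    {
        foreach ($node->getElementsByTagName('a') as $href)
        {
            // Take the first run of digits, dots and percent signs in the row text.
            if (preg_match('/[0-9.%]+/', $node->nodeValue, $match))
            {
                $countries[trim($href->nodeValue)] = $match[0];
            }
        }
    }
}

var_dump($countries);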
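
To try the DOM approach without hitting the network, here is a minimal sketch that parses a hard-coded snippet with loadHTML. Only the visitors-by-country id comes from the answer above; the rest of the markup, and the assumption that each percentage sits in the same parent element as its link, are hypothetical and may not match the real page.

```php
<?php
// Hypothetical sample markup; only the id is taken from the answer.
$html = '<div id="visitors-by-country">'
      . '<div><a href="/topsites/countries/US">United States</a> 35.0%</div>'
      . '<div><a href="/topsites/countries/IN">India</a> 8.8%</div>'
      . '</div>';

// Collect parse warnings instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument;
$doc->loadHTML($html);

$countries = array();
$container = $doc->getElementById('visitors-by-country');
if ($container !== null) {
    foreach ($container->getElementsByTagName('a') as $link) {
        // Assumed layout: the percentage is in the text of the link's parent.
        $row = $link->parentNode->nodeValue;
        if (preg_match('/[0-9.]+%/', $row, $m)) {
            $countries[trim($link->nodeValue)] = $m[0];
        }
    }
}

var_dump($countries);
```

Walking the links and reading each parent's text avoids relying on $match[0] when a row has no percentage, at the cost of depending on the assumed row layout.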

Upvotes: 3
