Reputation: 5106
I'm trying to parse the http://whatismyip.com page and get my location (state and country). The data seems to be inside <table class="table"> tags, so I'm looking for "table".
But I get an error: Warning: file_get_contents(https://whatismyip.com): failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden in C:\xampp4\htdocs\scraping\libs\simple_html_dom.php on line 1081
I can't figure out what's wrong.
<?php
require_once('libs/simple_html_dom.php');
$html=new simple_html_dom();
$html->load_file('https://whatismyip.com');
$element=$html->find("table");
?>
Upvotes: 4
Views: 11139
Reputation: 111
Try changing the user agent with the following setting:
ini_set("user_agent","Mozilla/5.0 (Windows NT 6.1; rv:8.0) Gecko/20100101 Firefox/8.0");
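It should work then: as the warning shows, load_file() calls file_get_contents() internally, and file_get_contents() sends the user_agent setting with its requests.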
Upvotes: 3
Reputation: 5202
Basically your example is fine; the problem is that the request Simple HTML DOM makes to this HTTPS URL fails, so try fetching the page another way, with cURL:
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE); // skips certificate verification; avoid this outside of quick tests
curl_setopt($curl, CURLOPT_HEADER, false);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_URL, "https://whatismyip.com");
curl_setopt($curl, CURLOPT_REFERER, "https://whatismyip.com");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201');
$str = curl_exec($curl);
curl_close($curl);
and then use your code, but pass the fetched string to load() instead of load_file(), which expects a file name or URL:
$html->load($str);
$element=$html->find("table");
Edit: added a User-Agent to emulate a real browser (thanks to ShiraNai7).
Upvotes: 5
Reputation: 6560
That website is checking the User-Agent header of the request but PHP doesn't send any (by default). You'll have to "impersonate" a browser:
$context = stream_context_create(array(
'http' => array(
'header' => array('User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201'),
),
));
$html = file_get_contents('http://whatismyip.com/', false, $context);
// do what you want with the $html
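For the "do what you want" step, here is a minimal sketch that pulls the rows out of the table using PHP's built-in DOM extension instead of Simple HTML DOM. The helper name extractTableRows() is made up for illustration, the class name "table" is taken from the question, and the sample markup is invented; the real page layout may differ.

```php
<?php
// Hypothetical helper: collects the cell text of every row found inside
// <table class="table"> elements of an HTML string.
function extractTableRows(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so keep parser warnings off the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match tables whose class attribute contains the word "table".
    $query = '//table[contains(concat(" ", normalize-space(@class), " "), " table ")]//tr';

    $rows = [];
    foreach ($xpath->query($query) as $tr) {
        $cells = [];
        foreach ($xpath->query('./th|./td', $tr) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}

// Invented sample markup, only to show the shape of the result.
$sample = '<table class="table"><tr><th>State</th><td>Example State</td></tr>'
        . '<tr><th>Country</th><td>Example Country</td></tr></table>';
print_r(extractTableRows($sample));
```

In the answer's code you would pass the $html string returned by file_get_contents() to this function instead of $sample.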
A better (and faster) option would be to use some library for this. I've used GeoIP2-php before but I'm sure there are more.
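As a rough sketch of the library route, the following uses the Reader class from GeoIP2-php. It assumes the package was installed with Composer (geoip2/geoip2), that vendor/autoload.php has been required, and that you have downloaded a GeoLite2 City database; the function name lookupLocation() and the database path are placeholders, not part of the library.

```php
<?php
// Hypothetical wrapper around GeoIP2-php. Returns null when the library
// or the database file is not available.
function lookupLocation(string $ip, string $dbPath): ?array
{
    if (!class_exists('GeoIp2\Database\Reader') || !is_file($dbPath)) {
        return null;
    }
    $reader = new GeoIp2\Database\Reader($dbPath);
    // city() throws an exception if the address is not in the database.
    $record = $reader->city($ip);
    return [
        'state'   => $record->mostSpecificSubdivision->name,
        'country' => $record->country->name,
    ];
}

// Placeholder path: point this at your own copy of the database.
var_dump(lookupLocation('128.101.101.101', '/path/to/GeoLite2-City.mmdb'));
```

This looks the address up in a local file, so no page has to be fetched or parsed on each request.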
Upvotes: 10