Reputation: 59
I ran into a problem with some websites while using the Simple HTML DOM library. It happens, for example, when I try to load the following URL: http://www.t-mobile.com/shop/phones/cell-phone-detail.aspx?cell-phone=HTC-One-S-Gradient-Blue&tab=reviews#BVRRWidgetID
My PHP code is:
<?php
include "simple_html_dom.php";
$html=new simple_html_dom();
$url="http://www.t-mobile.com/shop/phones/cell-phone-detail.aspx?cell-phone=HTC-One-S-Gradient-Blue&tab=reviews#BVRRWidgetID";
$html->load_file($url);
echo $html;
?>
The PHP script reports no error, but it outputs the following content every time:
Unsupported Browser
It appears that you are viewing this page with an unsupported Web browser. This Web site works best with one of these supported browsers:
Microsoft Internet Explorer 5.5 or higher
Netscape Navigator 7.0 or higher
Mozilla Firefox 1.0 or higher
If you continue to view our site with your current browser, certain pages may not display correctly and certain features may not work properly for you.
What is the problem? Does Simple HTML DOM have a limitation? Is there any other way to solve this problem?
Upvotes: 0
Views: 578
Reputation: 1020
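Set the user agent on a stream context and pass that context to file_get_html():
// Create an empty stream context
$context = stream_context_create();
// Set the User-Agent for HTTP requests made with this context
stream_context_set_option($context, 'http', 'user_agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6');
// The context is the third argument; the second is $use_include_path
$html = file_get_html('http://www.t-mobile.com/shop/phones/cell-phone-detail.aspx?cell-phone=HTC-One-S-Gradient-Blue&tab=reviews#BVRRWidgetID', false, $context);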
Upvotes: 0
Reputation: 461
Just set your User-Agent in the Simple HTML DOM request:
# Create the context options carrying the User-Agent header
$useragent = array("http" => array("header" => "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6\r\n"));
# Create a stream context from the options array
$useragent = stream_context_create($useragent);
# Load the page with Simple HTML DOM, passing the context as the third argument
$html = file_get_html($urlCategory, false, $useragent);
This way, the request appears to come from a newer browser than the one the site currently sees.
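Before making the request, it can help to confirm that the User-Agent really is attached to the context. The following is a minimal sketch: build_user_agent_options() is a hypothetical helper name and not part of Simple HTML DOM. It uses the http wrapper's user_agent option instead of a raw header line.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): builds the nested
// options array that stream_context_create() expects for the http wrapper.
function build_user_agent_options($userAgent)
{
    return array(
        'http' => array(
            'user_agent' => $userAgent,
        ),
    );
}

$ua = 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6';
$context = stream_context_create(build_user_agent_options($ua));

// Read the options back to check that the User-Agent is set on the context.
$options = stream_context_get_options($context);
echo $options['http']['user_agent'], PHP_EOL;
```

If the echoed string matches the browser string, the same $context can be passed as the third argument of file_get_html().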
Upvotes: 1
Reputation: 54
Some websites do not allow their content to be scraped directly.
You can use cURL to fetch the HTML content and then pass it to the load() method of the DOM object.
I hope this works for you.
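The cURL approach could look roughly like the sketch below. The function names fetch_with_curl() and load_dom_with_curl() are hypothetical helpers, the 30-second timeout is an arbitrary choice, and the script assumes the cURL extension is enabled and simple_html_dom.php is in the include path.

```php
<?php
include "simple_html_dom.php";

// Hypothetical helper: downloads a page with cURL while sending a
// browser-like User-Agent. Returns the body as a string, or false on failure.
function fetch_with_curl($url, $userAgent)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);          // give up after 30 seconds
    return curl_exec($ch);
}

// Hypothetical helper: fetches the page and hands the HTML string to
// Simple HTML DOM via load(). Returns null if the download failed.
function load_dom_with_curl($url, $userAgent)
{
    $content = fetch_with_curl($url, $userAgent);
    if ($content === false) {
        return null;
    }
    $html = new simple_html_dom();
    $html->load($content);
    return $html;
}
```

It would be called as $html = load_dom_with_curl($url, $userAgent); with the URL from the question and a browser User-Agent string. A null result means the download itself failed.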
Upvotes: 1