DannyDj

Reputation: 58

Simple html dom - other result than expected

I am trying to retrieve information from a web page using simple_html_dom, like this:

<?PHP
include_once('dom/simple_html_dom.php');
$urlpart="http://w2.brreg.no/motorvogn/";
$url = "http://w2.brreg.no/motorvogn/heftelser_motorvogn.jsp?regnr=BR15597";
$html = file_get_html($url);

foreach($html->find('a') as $element) 
       if(preg_match('*dagb*',$element)) {
       $result=$urlpart.$element->href;

       $resultcontent=file_get_contents($result);
       echo $resultcontent;

       }

?>

The $result variable first gives me this URL: http://w2.brreg.no/motorvogn/dagbokutskrift.jsp?dgbnr=2011365320&embnr=0&regnr=BR15597

When I open the above URL in my browser, I get the content I expect.

When I retrieve the content into $resultcontent, I get a different result: a page that says "Invalid input" in Norwegian.

Any ideas why?

Upvotes: 2

Views: 173

Answers (2)

Jenson M John

Reputation: 5689

The problem is with your URL query parameter.

http://w2.brreg.no/motorvogn/dagbokutskrift.jsp?dgbnr=2011365320&embnr=0&regnr=BR15597

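The href value returned by simple_html_dom is most likely still HTML-encoded (for example '&amp;' where the real URL has '&'). file_get_contents sends the string as-is, so the server does not receive the embnr and regnr parameters it expects. A browser decodes these entities when it displays the string, and can even render a bare '&reg' as the symbol ®, which is why the echoed URL can look different from what is actually requested.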

You can apply the html_entity_decode function on line 11 of your script, the file_get_contents call:

$resultcontent=file_get_contents(html_entity_decode($result));
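To see what the decoding step changes, here is a minimal sketch. It assumes the parser hands back the href with '&amp;' entities still in it; the $rawHref value below is a hypothetical example built from the URL in the question, not output captured from the real page.

```php
<?php
// Hypothetical raw href: an assumption about what the parser returns
// when the page source encodes '&' as '&amp;'.
$rawHref = 'dagbokutskrift.jsp?dgbnr=2011365320&amp;embnr=0&amp;regnr=BR15597';
$urlpart = 'http://w2.brreg.no/motorvogn/';

// The URL as it would be built without decoding, and after decoding.
$encodedUrl = $urlpart . $rawHref;
$decodedUrl = html_entity_decode($encodedUrl);

echo $encodedUrl, PHP_EOL; // still contains '&amp;'
echo $decodedUrl, PHP_EOL; // entities replaced by plain '&'

// Parse the decoded query string to confirm the parameter names are intact.
parse_str((string) parse_url($decodedUrl, PHP_URL_QUERY), $params);
echo $params['regnr'], PHP_EOL;
```

Printing both strings with a command-line PHP run (rather than in a browser) avoids the browser decoding the entities for you, which makes the difference visible.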

Upvotes: 1

Bryan P

Reputation: 4202

foreach($html->find('a') as $element) 
       if(preg_match('*dagb*',$element)) {
       $result=$urlpart.$element->href;
       $resultcontent=file_get_contents(html_entity_decode($result));
       echo $resultcontent;

       }

This should do the trick.

Upvotes: 1
