Ali Ahmadi

Reputation: 89

PHP - file_get_contents doesn't work

How can I fix the code below? It collects all the links on a web page, but it doesn't work on some websites, such as the one used here.

<?php

    $html = file_get_contents('http://blogfa.com/members/updated.aspx');

    $dom = new DOMDocument();
    @$dom->loadHTML($html);

    // grab all the links on the page
    $xpath = new DOMXPath($dom);
    $hrefs = $xpath->evaluate("/html/body//a");

    for ($i = 0; $i < $hrefs->length; $i++) {
        $href = $hrefs->item($i);
        $url = $href->getAttribute('href');
        echo $url . '<br />';
    }

?>

Upvotes: 0

Views: 468

Answers (2)

Kshitij Soni

Reputation: 394

Actually, you are getting the links, but there is also a warning. To solve this, you have to add one line. This is the warning I get:

E_WARNING : type 2 -- DOMDocument::loadHTML(): htmlParseStartTag: misplaced <body> tag in Entity, line: 20 -- at line 6

Solution:

<?php
$html = file_get_contents('http://blogfa.com/members/updated.aspx');
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);

// grab all the links on the page
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");

for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url = $href->getAttribute('href');
    echo $url . '<br />';
}
?>

Calling libxml_use_internal_errors(true); makes libxml collect parse errors internally instead of raising them as PHP warnings, which suppresses the warning.
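If you want to see what the parser complained about rather than just hiding it, the buffered errors can be read back with libxml_get_errors(). The following is only a minimal sketch: the $html string is made-up sample markup with a second <body> tag, standing in for the page fetched with file_get_contents(), and the exact messages reported may differ between libxml versions.

```php
<?php
// Hypothetical sample markup with a second <body> tag (an assumption,
// used here instead of fetching the real page over the network).
$html = '<html><body><a href="/one">One</a><body><a href="/two">Two</a></body></html>';

// Buffer libxml errors instead of emitting PHP warnings; remember the old setting.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Read back whatever the parser reported, then empty the buffer
// and restore the previous error-handling mode.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

foreach ($errors as $error) {
    echo 'line ' . $error->line . ': ' . trim($error->message) . "\n";
}

// The links are still extracted even though the markup is invalid.
$links = [];
foreach ($dom->getElementsByTagName('a') as $a) {
    $links[] = $a->getAttribute('href');
}
print_r($links);
```

Restoring the previous setting keeps the change local, so other code that relies on libxml warnings is not affected.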

Upvotes: 1

Timothy

Reputation: 2003

When I run your code I get the following PHP error:

E_WARNING : type 2 -- DOMDocument::loadHTML(): htmlParseStartTag: misplaced <body> tag in Entity, line: 20 -- at line 6

If you look at the source code of the page at http://blogfa.com/members/updated.aspx, you'll see that the <body> tag is opened twice.

Try removing the second <body> tag. Other than that, your code seems to work.
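If the page is not yours to edit, one option is to check for the duplicate tag in the fetched string and drop it before parsing. This is a rough sketch only: the $html value is invented sample markup, not the real page, and a regular expression like this is fragile on real-world HTML (for example, it would also match a <body> that appears inside a comment or a script).

```php
<?php
// Hypothetical sample markup; in practice $html would come from file_get_contents().
$html = "<html>\n<body>\n<a href=\"/one\">One</a>\n<body class=\"x\">\n<a href=\"/two\">Two</a>\n</body>\n</html>";

// Pattern for an opening <body> tag (case-insensitive, attributes allowed).
$pattern = '/<body\b[^>]*>/i';

// Count the opening <body> tags to confirm the diagnosis.
$bodyCount = preg_match_all($pattern, $html);

// Keep the first opening <body> tag and remove any later ones.
$seen = 0;
$cleaned = preg_replace_callback(
    $pattern,
    function (array $m) use (&$seen) {
        return $seen++ === 0 ? $m[0] : '';
    },
    $html
);

$cleanedCount = preg_match_all($pattern, $cleaned);

echo "opening <body> tags before: $bodyCount\n";
echo "opening <body> tags after: $cleanedCount\n";
```

The cleaned string can then be passed to loadHTML() in place of the original one.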

Upvotes: 0
