Theramax

Reputation: 195

How to load an external HTML file into a JavaScript document object

I am looking for the JavaScript method analogous to PHP's DOMDocument->loadHTMLFile(), so that I can parse an external HTML file's contents and extract its images. Right now I'm doing it via ajax, which is too slow.

Here is the PHP I use to scrape images, and it works. I simply want to do the same thing browser side so that it's faster.

if(isset($_POST['link']) && $_POST['link'] !== ""){
    //extract relevant article info from link
    $sourceArray = array();
    $sizeArray = array();
    $link = $_POST['link'];
    //generate new DOMdoc
    $article = new DOMDocument;
    $article->loadHTMLFile($link);
    //get the largest image
    $images = $article->getElementsByTagName("img");
    foreach($images as $image){
        $source = $image->getAttribute("src");
        if(strpos($source, "http://") !== false){
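            //getimagesize() returns false when the image cannot be read
            $sizeProfile = getimagesize($source);
            if($sizeProfile === false){
                continue;
            }
            $imgArea = $sizeProfile[0] * $sizeProfile[1];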
            if($imgArea > 100){
                array_push($sizeArray, $imgArea);
                array_push($sourceArray, $source);
            }
        }
    }
    array_multisort($sizeArray, SORT_DESC, $sourceArray);
    $sourceHTML = "";
    $i = 0;
    foreach($sourceArray as $source){
        $id = 'image'.$i;
        $sourceHTML .= '<img id="'.$id.'" class="notSelectedPicture" src="'.$source.'" onclick="toggleSelectedPicture(\''.$id.'\');" alt="alt">';
        $i++;
    }
    echo $sourceHTML;
    exit();
}

Upvotes: 2

Views: 1519

Answers (1)

Theramax

Reputation: 195

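The ajax solution works for this purpose. JavaScript running in the browser cannot simply fetch arbitrary external HTML files the way PHP can, because the same-origin policy blocks cross-origin requests unless the remote server explicitly allows them, so the page still has to be retrieved server side. To cut down on loading time, focus on the efficiency of the DOM-parsing code that the ajax request posts to. In the PHP above, for example, getimagesize() has to request every image separately, which is a likely bottleneck.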
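
As a rough sketch of what the browser side could look like, assuming the page HTML has already been obtained as a string (for instance relayed by the same ajax endpoint, or fetched from a same-origin or CORS-enabled URL), the standard DOMParser API can do the parsing that DOMDocument does in the PHP. The helper names extractImageSources and rankByArea are hypothetical and not part of the original code, and the parser is passed in as an argument purely for illustration.

```javascript
// Hypothetical helper: collect absolute image URLs from an HTML string.
// `parser` is expected to behave like a browser DOMParser instance.
function extractImageSources(html, parser) {
  const doc = parser.parseFromString(html, "text/html");
  return Array.from(doc.getElementsByTagName("img"))
    .map((img) => img.getAttribute("src"))
    // Keep only absolute http(s) URLs, similar to the strpos() check in the PHP.
    .filter((src) => typeof src === "string" && /^https?:\/\//.test(src));
}

// Hypothetical helper: drop tiny images and sort the rest largest first,
// mirroring the $imgArea > 100 filter and array_multisort() in the PHP.
// Each entry is assumed to look like { src, width, height }.
function rankByArea(images, minArea = 100) {
  return images
    .map((img) => ({ ...img, area: img.width * img.height }))
    .filter((img) => img.area > minArea)
    .sort((a, b) => b.area - a.area);
}
```

In a browser this might be called as extractImageSources(htmlText, new DOMParser()). The width and height values for rankByArea could then come from each image's naturalWidth and naturalHeight once it has loaded, which would move the per-image size lookup off the server.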

Upvotes: 1
