Reputation: 195
I am looking for a JavaScript method analogous to PHP's DOMDocument->loadHTMLFile(), so that I can parse an external HTML file's contents and extract its images. Right now I'm doing it via AJAX, which is too slow.
Here is the PHP I use to scrape images; it works. I simply want to do the same thing browser-side so that it's faster.
if(isset($_POST['link']) && $_POST['link'] !== ""){
    // extract relevant article info from link
    $sourceArray = array();
    $sizeArray = array();
    $link = $_POST['link'];

    // generate a new DOMDocument
    $article = new DOMDocument;
    $article->loadHTMLFile($link);

    // collect each image's source and area
    $images = $article->getElementsByTagName("img");
    foreach($images as $image){
        $source = $image->getAttribute("src");
        if(strpos($source, "http://") !== false){
            $sizeProfile = getimagesize($source);
            $imgArea = $sizeProfile[0] * $sizeProfile[1];
            if($imgArea > 100){
                array_push($sizeArray, $imgArea);
                array_push($sourceArray, $source);
            }
        }
    }

    // sort the sources by area, largest first
    array_multisort($sizeArray, SORT_DESC, $sourceArray);

    $sourceHTML = "";
    $i = 0;
    foreach($sourceArray as $source){
        $id = 'image'.$i;
        $sourceHTML .= '<img id="'.$id.'" class="notSelectedPicture" src="'.$source.'" onclick="toggleSelectedPicture(\''.$id.'\');" alt="alt">';
        $i++;
    }
    echo $sourceHTML;
    exit();
}
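For comparison, the same flow can be sketched browser-side with fetch and DOMParser. This is a rough sketch, not a drop-in replacement: it only works when the target page is served with permissive CORS headers, since otherwise the browser's same-origin policy blocks the fetch entirely. The names `scrapeImages` and `sortByArea` are hypothetical, not part of any library.

```javascript
// Pure helper: keep records above the PHP code's 100-pixel-area floor
// and return their sources sorted by area, largest first.
function sortByArea(records) {
  return records
    .filter(r => r.width * r.height > 100)
    .sort((a, b) => b.width * b.height - a.width * a.height)
    .map(r => r.src);
}

// Browser-only: fetch the page, parse it with DOMParser, and probe each
// image's natural size by loading it through an Image object.
async function scrapeImages(link) {
  const html = await (await fetch(link)).text();
  const doc = new DOMParser().parseFromString(html, "text/html");
  const records = [];
  for (const img of doc.querySelectorAll("img")) {
    const source = img.getAttribute("src") || "";
    if (!source.includes("http://")) continue; // same check as the PHP
    const probe = new Image();
    try {
      await new Promise((resolve, reject) => {
        probe.onload = resolve;
        probe.onerror = reject;
        probe.src = source;
      });
    } catch {
      continue; // skip images that fail to load
    }
    records.push({
      src: source,
      width: probe.naturalWidth,
      height: probe.naturalHeight,
    });
  }
  return sortByArea(records);
}
```

Note that measuring each image requires downloading it in either language, so this mainly saves the round trip to the PHP script rather than the image transfers themselves.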
Upvotes: 2
Views: 1519
Reputation: 195
The AJAX solution works for this purpose. Because of the browser's same-origin policy, client-side JavaScript cannot fetch and parse arbitrary external HTML pages the way server-side PHP can. To cut down on loading time, focus on the efficiency of the DOM-parsing code that the AJAX request posts to.
Upvotes: 1