Kenneth Poulsen

Reputation: 949

DOMDocument, grab several values on a website

How can I get several values on a website using PHP (the value between div tags, value1, value2, value3 in the example below)?

I have been looking into DOMDocument, but I am getting confused.

Also, will it be possible to get the values without loading the website 3 times?

Example. I need to get 3 values (or more) from a website:

<div class="SomeUniqueClassName">value1</div>
<div class="AnotherUniqueClassName">value2</div>
<div class="UniqueClassName">value3</div>

This is what I have now, but it looks clumsy and I'm not 100% sure what I'm doing:

$doc = new DOMDocument;

$doc->loadHTMLFile($url);

$xpath = new DOMXPath($doc);

$query1 = "//div[@class='SomeUniqueClassName']";
$query2 = "//div[@class='AnotherUniqueClassName']";
$query3 = "//div[@class='UniqueClassName']";

$entry1 = $xpath->query($query1);
$value1 = $entry1->item(0)->textContent;

$entry2 = $xpath->query($query2);
$value2 = $entry2->item(0)->textContent;

$entry3 = $xpath->query($query3);
$value3 = $entry3->item(0)->textContent;

Upvotes: 0

Views: 29

Answers (2)

Professor Abronsius

Reputation: 33823

With XPath you could use the contains() function to match every div whose class attribute contains a shared substring, provided your class names follow the pattern in your example:

$dom = new DOMDocument;
$dom->loadHTMLFile( $url );
$xp = new DOMXPath( $dom );


$query="//div[ contains( @class,'UniqueClass' ) ]";
$col=$xp->query( $query );
if( $col && $col->length > 0 ){
    foreach( $col as $node ){
        // $node is already a single DOMNode, so read its value directly
        echo $node->nodeValue;
    }
}

Or modify the XPath expression to match several exact class names, joined with the union operator (|), like:

$query="//div[@class='UniqueClass1'] | //div[@class='UniqueClass2'] | //div[@class='UniqueClass3']";
$col=$xp->query( $query );
if( $col && $col->length > 0 ){
    foreach( $col as $node ){
        // $node is already a single DOMNode, so read its value directly
        echo $node->nodeValue;
    }
}
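
To keep each value tied to its class name while still loading the page only once, something like the following might work. This is only a sketch: the inline $html string stands in for the real page, and the $classes and $values names are made up for illustration.

```php
<?php
// Assumption: inline sample markup replaces the real page so the sketch runs offline.
$html = '<div class="SomeUniqueClassName">value1</div>'
      . '<div class="AnotherUniqueClassName">value2</div>'
      . '<div class="UniqueClassName">value3</div>';

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);   // the document is parsed exactly once
libxml_clear_errors();

$xp = new DOMXPath($dom);

// Hypothetical list of the class names to look up.
$classes = ['SomeUniqueClassName', 'AnotherUniqueClassName', 'UniqueClassName'];

$values = [];
foreach ($classes as $class) {
    // Each query runs against the already parsed document, not the network.
    $nodes = $xp->query("//div[@class='$class']");
    $values[$class] = ($nodes !== false && $nodes->length > 0)
        ? trim($nodes->item(0)->textContent)
        : null;   // null marks a class that was not found
}

print_r($values);
```

With a real page you would swap loadHTML($html) for loadHTMLFile($url); the queries themselves never trigger another download.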

Upvotes: 0

Anyone_ph

Reputation: 616

You could use cURL for this:

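$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, 'http://theurlhere.com');
// Optional, if the target URL uses SSL: these two lines disable
// certificate verification, which is insecure, so use them for testing only.
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
// Return the response body as a string instead of printing it
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);

$parse = curl_exec($curl);
curl_close($curl);

// Non-greedy (.*?) so each match stops at the first closing </div>
preg_match_all('/<div class="uniqueClassName([0-9])">(.*?)<\/div>/', $parse, $value);

print_r($value);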
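
To see how the capture groups come back from preg_match_all, here is a small sketch run on an inline sample string rather than a fetched page. The sample markup and the $byNumber name are assumptions for illustration, and the pattern uses a non-greedy (.*?) so that each match ends at its own closing tag.

```php
<?php
// Assumption: sample markup whose class names end in a single digit,
// which is what the pattern expects.
$parse = '<div class="uniqueClassName1">value1</div>'
       . '<div class="uniqueClassName2">value2</div>';

preg_match_all('/<div class="uniqueClassName([0-9])">(.*?)<\/div>/', $parse, $value);

// $value[0] holds the full matches, $value[1] the captured digits,
// and $value[2] the text between the tags.
$byNumber = array_combine($value[1], $value[2]);

print_r($byNumber);
```

A regex like this only works while the markup stays exactly in this shape (same attribute order, no extra classes, no nested divs), so a DOM parser is usually the more robust choice for real pages.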

Upvotes: 2
