Stanley

Reputation: 569

PHP DOM scraping: best method for grabbing product prices

I'm using simpleHtmlDom to do some basic screen scraping, but I'm having trouble grabbing product prices. Sometimes it works and sometimes it doesn't. I also sometimes get multiple prices, for example when a page shows something like "normally $100... now $79.99". Any suggestions? Currently, I'm using this:

$prices = array();
$prices[] = $html->find("[class*=price]", 0)->innertext;
$prices[] = $html->find("[class*=msrp]", 0)->innertext;
$prices[] = $html->find("[id*=price]", 0)->innertext;
$prices[] = $html->find("[id*=msrp]", 0)->innertext;
$prices[] = $html->find("[name*=price]", 0)->innertext;
$prices[] = $html->find("[name*=msrp]", 0)->innertext;

One website I have no idea how to grab the price from is Victoria's Secret. The price looks like it's just floating around in random HTML.

Upvotes: 0

Views: 860

Answers (1)

pguardiario

Reputation: 54984

First of all, don't use simplehtmldom. Use the built-in DOM functions or a library that's based on them. If you want to extract all prices from a page, you could try something like this:

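// Single quotes so PHP never treats "$" as the start of a variable.
$html = '<html><body>normally $100... now $79.99</body></html>';
$dom = new DOMDocument();
libxml_use_internal_errors(true); // real-world pages are rarely valid HTML
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Visit every text node that contains a dollar sign.
foreach ($xpath->query('//text()[contains(., "$")]') as $node) {
    // Matches "$100", "$1,299" or "$79.99"; trailing punctuation is not captured.
    preg_match_all('/\$\d+(?:,\d{3})*(?:\.\d{1,2})?/', $node->nodeValue, $m);
    print_r($m[0]);
}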
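
If you also need to decide which of several prices is the current one, here is a minimal sketch that builds on the loop above. The extractPrices() helper is a made-up name, not part of any library, and treating the lowest amount as the sale price is only an assumption that happens to fit the "normally $100... now $79.99" case. It also skips text inside script and style tags, where "$" usually has nothing to do with prices, and uses a stricter regex so trailing punctuation such as "..." is not captured.

```php
<?php
// Hypothetical helper (not from any library): returns every dollar amount
// found in the visible text of an HTML string, as floats, in document order.
function extractPrices(string $html): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Text nodes containing "$", ignoring <script> and <style> contents.
    $query = '//text()[not(ancestor::script) and not(ancestor::style)'
           . ' and contains(., "$")]';

    $prices = [];
    foreach ($xpath->query($query) as $node) {
        // Capture the digits after "$": "100", "1,299", "79.99".
        if (preg_match_all('/\$(\d+(?:,\d{3})*(?:\.\d{1,2})?)/', $node->nodeValue, $m)) {
            foreach ($m[1] as $raw) {
                $prices[] = (float) str_replace(',', '', $raw);
            }
        }
    }
    return $prices;
}

$html = '<html><body><p>normally $100... now $79.99</p></body></html>';
$prices = extractPrices($html);

// Assumption: when a page shows several prices, the lowest is the current one.
$current = $prices ? min($prices) : null;
print_r($prices);
var_dump($current);
```

Pages that list several products, or that show shipping costs next to the price, will need a narrower XPath than //text(), so treat the min() step as a starting point rather than a rule.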

Upvotes: 1
