php curl/xpath data based off <p> text information?

Question

I know how to use XPath to scrape and echo text from another website by selecting on attributes such as a div id or class, using the code below. But I don't know how to do it under more precise conditions, for example when the bit of text I want to scrape has no unique identifier such as a div id. The code below prints the scraped data.

$doc = new DOMDocument;

// We don't want to bother with white spaces
$doc->preserveWhiteSpace = false;

// Most HTML Developers are chimps and produce invalid markup...
$doc->strictErrorChecking = false;
$doc->recover = true;

$doc->loadHTMLFile('http://www.nbcnews.com/business');

$xpath = new DOMXPath($doc);

$query = "//div[@class='market']";

$entries = $xpath->query($query);
foreach ($entries as $entry) {
    echo trim($entry->textContent);  // use `trim` to strip leading and trailing whitespace
}

In the example source below, I want to pull the value "21,271.97". But there's no unique tag for it, no div id. Is it possible to pull this data by identifying a keyword in the <p> that never changes, for example "DJIA all time"?

DJIA All Time, Record-High Close: June 9, 
2017 
(21,271.97)

I'm wondering whether I could replace $query = "//div[@class='market']"; with something along the lines of $query = "//p['DJIA all time']";

Could this be possible?

I also wonder whether using a loop with something like $query = "//p[='DJIA']"; could work, though I don't know exactly how to use that. Thanks!!

Nigel Ren · Accepted Answer

It would be good to have a play with an online XPath tester - I use https://www.freeformatter.com/xpath-tester.html#ad-output

$query = "//p[contains(text(),'DJIA')]";

Although if you use the page you're after, I've found that the value seems to be the first record for...

$query = "//span[contains(@class,'market_price')]";

But the idea is the same in both cases: using contains(source, value) will match a set of nodes. In the first case, text() is the text of the node; the second looks for the specific class definition.
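To make the first query concrete, here is a minimal sketch of how the contains(text(), ...) match could be combined with a regular expression to pull out just the number in parentheses. The inline HTML is a hypothetical stand-in modelled on the snippet in the question (the real page markup may differ), and the function name extractParenthesisedValue is made up for illustration. Note that XPath string comparison is case-sensitive, so the keyword has to be 'DJIA All Time' as it appears in the page, not 'DJIA all time'.

```php
<?php
// Hypothetical markup modelled on the question's snippet; the live page may differ.
$html = <<<HTML
<html><body>
  <div class="market">
    <p>Some other paragraph (1.23)</p>
    <p>DJIA All Time, Record-High Close: June 9,
2017
(21,271.97)</p>
  </div>
</body></html>
HTML;

/**
 * Find the first <p> whose text contains $keyword and return the value
 * inside the first pair of parentheses, or null if nothing matches.
 * Assumes $keyword contains no single quotes (it is placed inside an
 * XPath string literal).
 */
function extractParenthesisedValue(string $html, string $keyword): ?string
{
    // Collect libxml warnings about sloppy markup instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Same shape as the query in the answer: match on the paragraph's text
    $query = sprintf("//p[contains(text(),'%s')]", $keyword);
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    // Take the matched paragraph's full text, then keep only "(digits,commas,dots)"
    $text = $nodes->item(0)->textContent;
    if (preg_match('/\(([\d,.]+)\)/', $text, $matches)) {
        return $matches[1];
    }
    return null;
}

echo extractParenthesisedValue($html, 'DJIA All Time'), PHP_EOL; // prints 21,271.97
```

For the live page, the same function could be fed the result of file_get_contents() or a cURL request instead of the inline string. If the paragraph contains nested elements before the keyword, contains(., ...) may be a safer test than contains(text(), ...), since the latter only looks at the first text node.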
