Reputation: 4678
Given the following HTML code snippet:
<div class="item">
large
<span class="some-class">size</span>
</div>
I'm looking for the best way to extract the string "large" using Symfony's Crawler.
$crawler = new Crawler($html);
I could call $crawler->html() here and then run a regex over the result. Is there a better solution?
If so, how exactly would you do it?
Upvotes: 2
Views: 1142
Reputation: 4678
I've just found a solution that looks the cleanest to me:
$crawler = new Crawler($html);
$result = $crawler->filterXPath('//text()')->text();
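If the document contains more markup than this snippet, the XPath can be scoped to the text nodes that are direct children of the target div. A minimal sketch of that idea, assuming PHP's built-in DOMDocument and DOMXPath classes in place of the Crawler (the same expression should also work with filterXPath, though that is untested here):

```php
<?php
// Sample markup from the question.
$html = '<div class="item">
large
<span class="some-class">size</span>
</div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Select only text nodes that are direct children of the target div;
// text inside the nested <span> is not matched.
$nodes = $xpath->query('//div[@class="item"]/text()');

$result = '';
foreach ($nodes as $textNode) {
    $result .= $textNode->nodeValue;
}

// The raw text nodes still carry the surrounding line breaks.
$result = trim($result);

echo $result, PHP_EOL;
```

The trailing trim() is needed because the text nodes include the line breaks around "large".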
Upvotes: 4
Reputation: 7596
$crawler = new Crawler($html);
$node = $crawler->filterXPath('//div[@class="item"]');
$domElement = $node->getNode(0);
foreach ($node->children() as $child) {
$domElement->removeChild($child);
}
dump($node->text()); die();
After that, you have to trim the whitespace.
Upvotes: 0
Reputation:
This is a bit tricky, because the text you're trying to get is a text node, which the DomCrawler component doesn't (as far as I know) let you extract directly. Thankfully, DomCrawler is just a layer on top of PHP's DOM classes, which means you could probably do something like:
$crawler = new Crawler($html);
$crawler = $crawler->filterXPath('//div[@class="item"]');
$domNode = $crawler->getNode(0);
$text = null;
foreach ($domNode->childNodes as $domChild) {
if ($domChild instanceof \DOMText) {
$text = $domChild->wholeText;
break;
}
}
Because the loop stops at the first text node, this wouldn't help with HTML like:
<div>
text
<span>hello</span>
other text
</div>
In this instance you would only get "text" (plus surrounding whitespace), not "text other text". Take a look at the DOMText documentation for more details.
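If you do need every direct text node, one option is to collect them all instead of stopping at the first. A minimal sketch, assuming PHP's built-in DOM classes only; directText() is a hypothetical helper name, and with DomCrawler you would pass it the result of $crawler->getNode(0):

```php
<?php
// Hypothetical helper: joins the trimmed, non-empty direct text nodes
// of $node with single spaces, skipping element children.
function directText(DOMNode $node): string
{
    $parts = [];
    foreach ($node->childNodes as $child) {
        // Only text nodes are kept; elements such as <span> are ignored.
        if ($child instanceof DOMText) {
            $trimmed = trim($child->nodeValue);
            if ($trimmed !== '') {
                $parts[] = $trimmed;
            }
        }
    }
    return implode(' ', $parts);
}

$html = '<div>
text
<span>hello</span>
other text
</div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$div = $doc->getElementsByTagName('div')->item(0);

echo directText($div), PHP_EOL;
```

For the HTML above this yields "text other text", leaving out the "hello" inside the span.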
Upvotes: 0