Reputation: 2755
I am using DOMDocument and DOMXPath. I am trying to read an attribute whose value looks like JSON, but I don't get the full value. I can read the other attributes fine, just not this one. The HTML looks like this:
<a href="URL" title="{tt4438848=Nicholas Stoller (dir.), Seth Rogen, Rose Byrne, tt2567026=James Bobin (dir.), Mia Wasikowska, Johnny Depp, tt3498820=Anthony Russo (dir.), Chris Evans, Robert Downey Jr., tt2948356=Byron Howard (dir.), Ginnifer Goodwin, Jason Bateman, tt3385516=Bryan Singer (dir.), James McAvoy, Michael Fassbender, tt1985949=Clay Kaytis (dir.), Jason Sudeikis, Josh Gad, tt3068194=Whit Stillman (dir.), Kate Beckinsale, Chloë Sevigny, tt3799694=Shane Black (dir.), Russell Crowe, Ryan Gosling, tt3040964=Jon Favreau (dir.), Neel Sethi, Bill Murray, tt2241351=Jodie Foster (dir.), George Clooney, Julia Roberts}">X-Men: Apocalypse</a>
If I use echo $dom->getAttribute("href"); the output is URL
If I use echo $dom->getAttribute("title"); the output is Bryan Singer (dir.), James McAvoy, Michael Fassbender
I cannot get the exact attribute value.
Edit: link to the code: phpfiddle.org/main/code/dvj5-zf0q
Can anyone help? I am new to PHP DOM. Thanks in advance.
Upvotes: 0
Views: 51
Reputation: 43199
To get the title attribute:
<?php
$html = <<<EOF
<html>
<a href="URL" title="{tt4438848=Nicholas Stoller (dir.), Seth Rogen, Rose Byrne, tt2567026=James Bobin (dir.), Mia Wasikowska, Johnny Depp, tt3498820=Anthony Russo (dir.), Chris Evans, Robert Downey Jr., tt2948356=Byron Howard (dir.), Ginnifer Goodwin, Jason Bateman, tt3385516=Bryan Singer (dir.), James McAvoy, Michael Fassbender, tt1985949=Clay Kaytis (dir.), Jason Sudeikis, Josh Gad, tt3068194=Whit Stillman (dir.), Kate Beckinsale, Chloë Sevigny, tt3799694=Shane Black (dir.), Russell Crowe, Ryan Gosling, tt3040964=Jon Favreau (dir.), Neel Sethi, Bill Murray, tt2241351=Jodie Foster (dir.), George Clooney, Julia Roberts}">X-Men: Apocalypse</a>
</html>
EOF;
$dom = new DOMDocument();
$dom->loadHTML($html);
$links = $dom->getElementsByTagName('a');
foreach ($links as $link) {
    $title = $link->getAttribute('title');
    echo $title;
}
?>
Be aware, though, that the title does not hold a JSON string but some custom format.
See a demo on ideone.com.
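Since the question also mentions DOMXPath, the same attribute can be read with an XPath query instead of getElementsByTagName. This is only a minimal sketch: the helper name titles_via_xpath is hypothetical, and the markup is a shortened version of the one in the question.

```php
<?php
// Hypothetical helper: returns the title attribute of every <a> element.
function titles_via_xpath(string $html): array
{
    // Collect libxml warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $titles = [];
    // "//a/@title" selects the title attribute node of each anchor.
    foreach ($xpath->query('//a/@title') as $attr) {
        $titles[] = $attr->nodeValue;
    }
    return $titles;
}

// Shortened sample markup (assumption: same shape as in the question).
$html = '<html><a href="URL" title="{tt3385516=Bryan Singer (dir.), James McAvoy, Michael Fassbender, tt2241351=Jodie Foster (dir.), George Clooney, Julia Roberts}">X-Men: Apocalypse</a></html>';

print_r(titles_via_xpath($html));
```

Either way, the attribute comes back as one complete string, including the surrounding braces.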
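To split that value into its entries, a regular expression along these lines could be used. It matches a key followed by =, then captures everything up to the next ", tt":
\w+=((?:(?!(?:, tt)).)+)
Applied to your problem, this would be: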
$regex = '~\w+=((?:(?!(?:, tt)).)+)~';
foreach ($links as $link) {
    preg_match_all($regex, $link->getAttribute('title'), $actors);
    print_r($actors);
}
See a demo for this one on ideone.com as well.
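If a map from ID to names is more convenient than the raw match arrays, a variation of the same idea could look like the sketch below. The function name parse_title is hypothetical, and the pattern is a slight variation that captures the key as well. It assumes that keys consist of word characters and that neither keys nor values contain "=".

```php
<?php
// Hypothetical helper: turns "{id1=a, b, id2=c, d}" into
// ['id1' => 'a, b', 'id2' => 'c, d'].
function parse_title(string $title): array
{
    // Strip the surrounding braces so the last value has no trailing "}".
    $body = trim($title, '{}');

    // Group 1: the key. Group 2: everything up to the next ", key=" or the end.
    preg_match_all('~(\w+)=((?:(?!, \w+=).)+)~', $body, $matches, PREG_SET_ORDER);

    $entries = [];
    foreach ($matches as $match) {
        $entries[$match[1]] = $match[2];
    }
    return $entries;
}

// Shortened sample value (assumption: same shape as in the question).
$title = '{tt3385516=Bryan Singer (dir.), James McAvoy, Michael Fassbender, tt2241351=Jodie Foster (dir.), George Clooney, Julia Roberts}';

print_r(parse_title($title));
```

Each value can then be split further with explode(', ', $value) if the individual names are needed.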
Upvotes: 2