Reputation: 1264
I have been messing about with DOM XPath stuff all day - reading around and tearing my hair out! So, as a last resort, I'm asking you guys - the pros!
What I'm trying to do is retrieve, in an array, all the thread titles here.
I am trying to use XPath to do it (unless someone can tell me a better way); currently I am trying to get just one title to check whether my code is working (clearly not!).
I'm using:
$list3 = $xpath3
->evaluate("//a[contains(@style, 'font-weight:bold') and
contains(@href, 'showthread.php?t=3499047')]");
However, nothing is being retrieved.
Upvotes: 2
Views: 275
Reputation: 316959
The reason you are not getting any results is that there are no <a> elements that satisfy both conditions.
These are the links containing "3499047" in @href:
<a href="showthread.php?s=9bc55ab5990282a5353fb20d505d577e&t=3499047" id="thread_title_3499047">Tesco misprices and discussion (Thread 12)</a>
<a href="showthread.php?s=9bc55ab5990282a5353fb20d505d577e&t=3499047">1</a>
<a href="showthread.php?s=9bc55ab5990282a5353fb20d505d577e&t=3499047&page=2">2</a>
<a href="showthread.php?s=9bc55ab5990282a5353fb20d505d577e&t=3499047&page=3">3</a>
<a href="showthread.php?s=9bc55ab5990282a5353fb20d505d577e&t=3499047&page=110">Last Page</a>
<a href="member.php?s=9bc55ab5990282a5353fb20d505d577e&find=lastposter&t=3499047" rel="nofollow">ExiledCockney</a>
<a href="misc.php?do=whoposted&t=3499047" onclick="who(3499047); return false;">2,184</a>
<a rel="shadowbox;width=732;height=527;player=iframe;" href="wow.php?t=3499047" target="_blank" style="display: block; width: 100%; height: 100%; cursor: pointer;">
<div style="width: 100%; height: 100%; background-image: url('http://images2.moneysavingexpert.com/images/forum_style_2/misc//wow_big_faint_grey.gif');">
<div style="padding: 12px 0px 0px 0px;">
<strong>3</strong>
</div>
</div>
</a>
As you can see, none of them contain 'font-weight:bold' in a style attribute.
If the page does show elements with your desired combination when you view it in a browser, they might have been added via JavaScript. DOM will not run any JavaScript, so you have to check the markup as fetched with DOM.
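Since the title link in the list above carries an id of the form thread_title_<number>, one option is to match on that id instead of on style or href. This is only a minimal sketch under assumptions: the sample markup, the session id "abc" and the second thread are made up for illustration, and the real page may differ.

```php
<?php
// Hypothetical sample markup modelled on the links listed above.
// The session id "abc" and the second thread are invented for illustration.
$html = <<<'HTML'
<html><body>
<a href="showthread.php?s=abc&amp;t=3499047" id="thread_title_3499047">Tesco misprices and discussion (Thread 12)</a>
<a href="showthread.php?s=abc&amp;t=3499047">1</a>
<a href="showthread.php?s=abc&amp;t=1111111" id="thread_title_1111111">Another thread</a>
</body></html>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Select only the links whose id starts with "thread_title_".
$titles = [];
foreach ($xpath->query("//a[starts-with(@id, 'thread_title_')]") as $a) {
    $titles[] = trim($a->textContent);
}

print_r($titles);
```

Matching on the id keeps the query independent of the exact href format (session ids, page parameters) and of any styling applied later by scripts.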
Upvotes: 2
Reputation: 191729
I took a look at that HTML and I don't see any links with that href that also have style="font-weight: bold;". I actually don't see any bold links on the page. Anyway, when I remove that condition I get five DOMElements from evaluate().
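Roughly what I mean by removing that condition, as a sketch: the markup below is a made-up stand-in for the page (so the count here is not the five I got on the real one), and only the href test from the question is kept.

```php
<?php
// Hypothetical stand-in for the fetched page; not the real markup.
$html = <<<'HTML'
<html><body>
<a href="showthread.php?t=3499047" id="thread_title_3499047">Tesco misprices and discussion (Thread 12)</a>
<a href="showthread.php?t=3499047&amp;page=2">2</a>
<a href="member.php?find=lastposter&amp;t=3499047">ExiledCockney</a>
</body></html>
HTML;

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Same query as in the question, minus the style condition.
// For a node-set expression, evaluate() returns a DOMNodeList.
$links = $xpath->evaluate("//a[contains(@href, 'showthread.php?t=3499047')]");

echo $links->length, " matching links\n";
foreach ($links as $link) {
    echo $link->getAttribute('href'), ' => ', trim($link->textContent), "\n";
}
```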
Upvotes: 0
Reputation: 360572
Make sure that DOM isn't barfing on the HTML. It's VERY picky about malformed HTML. See what a ->saveHTML() call produces immediately after loading the page. If you get out something different/truncated, your input is malformed and will have to be cleaned up first.
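A quick way to run that check, as a sketch only: the input string below is deliberately broken sample markup (not the real page), and which messages libxml reports, if any, depends on the libxml version installed.

```php
<?php
// Deliberately malformed sample input (hypothetical, not the real page).
$html = '<html><body><div><a href="showthread.php?t=1">Title<div></a></body></html>';

// Collect parse problems instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);

// List whatever the parser complained about.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    echo 'line ', $error->line, ': ', trim($error->message), "\n";
}
libxml_clear_errors();

// Dump what DOM actually built, to compare against the original input.
$out = $dom->saveHTML();
echo $out;
```

If the dumped markup is missing the part of the page you are querying, the XPath expression is not the problem; the input needs cleaning up before it is loaded.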
Upvotes: 0