Reputation: 3
I'm trying to extract the total comment count from a web page using Jsoup. For example, here is a page (CNN): http://edition.cnn.com/2011/POLITICS/07/31/debt.talks/index.html?hpt=T1
I can see that the element's class is cnn_strycmtsndff, but I can't find the right call to extract it.
Can someone help?
Thanks
Upvotes: 0
Views: 553
Reputation: 138
Unfortunately, I don't think Jsoup is going to cut it. If you use the Chrome developer tools, you can clearly pick out the HTML that renders the "(##### Comments)" section, but if you just view the source, none of that markup is there. It looks like they are using JavaScript to embed the information in the page dynamically after it loads.
This is what you see in "View Source":
<div id="disqus_thread"></div><script type="text/javascript" src="http://cnn.disqus.com/embed.js"></script>
Jsoup only parses the HTML the server sends and does not execute JavaScript, so it will never be able to see the elements with the comment information.
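If you want to confirm this yourself, here is a minimal sketch that runs Jsoup over the snippet from "View Source" quoted above. The class name StaticHtmlCheck and the two helper methods are just names I made up for illustration, and the hard-coded string is only a stand-in for the real page, which contains much more markup. It assumes the jsoup jar is on your classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class StaticHtmlCheck {
    // Stand-in for the fetched page: only the fragment quoted from "View Source".
    static final String STATIC_HTML =
            "<div id=\"disqus_thread\"></div>"
            + "<script type=\"text/javascript\" src=\"http://cnn.disqus.com/embed.js\"></script>";

    // Number of elements carrying the given class in the parsed HTML.
    static int countByClass(String html, String className) {
        Document doc = Jsoup.parse(html);
        return doc.getElementsByClass(className).size();
    }

    // True if an element with this id exists but has no child elements,
    // i.e. it is an empty placeholder waiting to be filled in by a script.
    static boolean isEmptyPlaceholder(String html, String id) {
        Document doc = Jsoup.parse(html);
        Element el = doc.getElementById(id);
        return el != null && el.children().isEmpty();
    }

    public static void main(String[] args) {
        // The class seen in the developer tools is absent from the static HTML.
        System.out.println(countByClass(STATIC_HTML, "cnn_strycmtsndff"));
        // The Disqus container is present, but empty.
        System.out.println(isEmptyPlaceholder(STATIC_HTML, "disqus_thread"));
    }
}
```

The first call finds no elements with that class and the second reports an empty container, which matches what "View Source" shows. Swapping the string for Jsoup.connect(url).get() would not change this, because Jsoup still would not run embed.js.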
Upvotes: 1