Reputation: 123
How would I find and extract an HTML tag by its class name when I don't know the entire class attribute, just a single word of it? For example, in the following HTML file, I want to extract the cite tag with class="byline vcard top-line", but I would only know that the class contains vcard. I'm using jsoup.
<div class="credit">
<div class="credit-text">
<cite class="byline vcard top-line">
By Taylor Hill | Takepart.com
<abbr>July 28, 2015 3:27 PM</abbr>
</cite>
<span class="bottom-line">
<a href="http://www.takepart.com/" data-ylk="ltxt:TakePartcom;">
<span class="provider-name">TakePart.com</span></a>
</span>
</div>
</div>
</div>
Upvotes: 2
Views: 996
I just had a quick look at jsoup (first I've heard of it), and it looks like you can find the desired element by any one of its class names through the getElementsByClass(String className) method. You don't need the full class attribute, just the one class you know,
so in your case you would use: getElementsByClass("vcard")
That gives you the matching elements themselves. To grab their contents, it looks like you would then call the html() method for the inner HTML, or text() for just the text.
So your code would look more or less like this:
// "content" is your parsed jsoup Document (or any Element to search within)
Elements matches = content.getElementsByClass("vcard");
for (Element match : matches) {
    String htmlContents = match.html(); // inner HTML of the matched element
}
http://jsoup.org/cookbook/extracting-data/dom-navigation
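For what it's worth, here is a minimal self-contained sketch of the same idea run against a trimmed copy of your snippet. The class name VcardExtractor and the helper bylineText are made up for illustration, and I'm assuming jsoup is on your classpath. It also shows the CSS-selector route, select("cite.vcard"), which additionally restricts the match to cite tags.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class VcardExtractor {

    // Hypothetical helper: returns the text of the first <cite> whose class
    // list contains "vcard", or null if no such element exists.
    public static String bylineText(String html) {
        Document doc = Jsoup.parse(html);
        Element cite = doc.select("cite.vcard").first();
        return cite == null ? null : cite.text();
    }

    public static void main(String[] args) {
        // Trimmed copy of the HTML from the question.
        String html = "<div class=\"credit-text\">"
                + "<cite class=\"byline vcard top-line\">"
                + "By Taylor Hill | Takepart.com "
                + "<abbr>July 28, 2015 3:27 PM</abbr>"
                + "</cite></div>";

        Document doc = Jsoup.parse(html);

        // Matches any element that has "vcard" as one of its classes.
        for (Element el : doc.getElementsByClass("vcard")) {
            System.out.println(el.tagName() + " -> " + el.html());
        }

        // Same lookup through a CSS selector, limited to <cite> tags.
        System.out.println(bylineText(html));
    }
}
```

Both getElementsByClass("vcard") and the .vcard selector match a whole class name. If you only know a fragment of a single class name (say "card"), the attribute selector [class*=card] matches on a substring of the class attribute instead.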
I believe you can achieve the same via jQuery by calling the html() function on the elements with the vcard class, as in:
$(".vcard").html()
That should return the HTML contents of the first matched element only, so you could call html() inside a loop to get each element, or alternatively use the text() function, which returns the combined text contents of all matched elements.
For more info: http://api.jquery.com/html/
Upvotes: 1