Reputation: 307
I'm still new to HTML. For an Android project, I need to extract some data from an HTML string using jsoup. The structure looks something like this. All the span tags have the same class name, and the data I need sits between them.
<span class="head">a</span>
xxxx data xxxx
<span class="head">b</span>
xxxx data xxxx
<span class="head">c</span>
xxxx data xxxx
Is there any way I could extract it?
Upvotes: 1
Views: 990
Reputation: 56
Try this code. It works fine.
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JsoupExample {
    public static void main(String[] args) {
        String html = "<span class=\"head\">a</span>\n" +
                "xxxx data xxxx\n" +
                "<span class=\"head\">b</span>\n" +
                "xxxx data xxxx\n" +
                "<span class=\"head\">c</span>\n" +
                "xxxx data xxxx";
        Document document = Jsoup.parse(html);
        // Prints the text inside each span.head element
        for (Element element : document.select("span.head")) {
            System.out.println(element.text());
        }
    }
}
Upvotes: 0
Reputation: 42194
There are 2 things you have to do:
1. Select all span elements with the class head.
2. Use the nextSibling method to get the text node that follows each span.
Take a look at this sample code:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

public class JsoupExample {
    public static void main(String[] args) {
        String html = "<span class=\"head\">a</span>\n" +
                "xxxx data xxxx\n" +
                "<span class=\"head\">b</span>\n" +
                "xxxx data xxxx\n" +
                "<span class=\"head\">c</span>\n" +
                "xxxx data xxxx";
        Document document = Jsoup.parse(html);
        // Step 1: select every span with class "head"
        for (Element span : document.select("span.head")) {
            // Step 2: the text right after the span is its next sibling node
            TextNode node = (TextNode) span.nextSibling();
            // text() collapses whitespace but does not trim it, so trim before comparing
            String text = node.text().trim();
            assert "xxxx data xxxx".equals(text);
            System.out.println(text);
        }
    }
}
It uses your input and shows both steps.
With document.select("span.head") we select all elements with the class head, then we iterate over those elements with a for-each loop. For each span we get the text node that follows it using: TextNode node = (TextNode) span.nextSibling();
Finally, we use an assertion to check that the text node holds the value we expect, and we print it to standard output.
Modify this code sample for your needs. I hope it helps you.
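If your real HTML is less regular than the sample, the direct cast to TextNode can fail, for example when a span is followed by another element or by nothing at all. Below is a minimal sketch of a more defensive variant, assuming you want the results as a list of trimmed strings. The class name SiblingTextExtractor and the method extractTextAfterHeads are hypothetical names chosen for illustration; they are not part of jsoup.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class SiblingTextExtractor {

    // Hypothetical helper (not a jsoup API): collects the text that
    // directly follows each <span class="head"> element.
    static List<String> extractTextAfterHeads(String html) {
        List<String> result = new ArrayList<>();
        Document document = Jsoup.parse(html);
        for (Element span : document.select("span.head")) {
            Node next = span.nextSibling();
            // instanceof is false for null, so a span with no next sibling is skipped too
            if (next instanceof TextNode) {
                String text = ((TextNode) next).text().trim();
                // Ignore whitespace-only text nodes
                if (!text.isEmpty()) {
                    result.add(text);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<span class=\"head\">a</span>\n" +
                "xxxx data xxxx\n" +
                "<span class=\"head\">b</span>\n" +
                "xxxx data xxxx\n" +
                "<span class=\"head\">c</span>\n" +
                "xxxx data xxxx";
        for (String text : extractTextAfterHeads(html)) {
            System.out.println(text);
        }
    }
}
```

The instanceof check trades a possible ClassCastException for silently skipping spans that have no text after them, so adjust it if a missing value should be treated as an error in your app.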
Upvotes: 3