Reputation: 147
My problem is: how can I search for a word or a phrase in a page parsed with Jsoup?
For example, if the word or phrase is in a <span>, how can I find the element next to this <span>, for example a link?
HTML example code:
...
<div class="div">
<span>my y favourite text </span>
<a href="www.mylink.com">my link </a>
</div>
....
From this example, how can I find that my word is "favourite" and also retrieve the link in the <a href>?
Upvotes: 0
Views: 909
Reputation: 2875
Target: get the text in a span and the href attribute of a sibling a element, if the span contains a specified search word.
One way is to look for an a element that has the href attribute set and a preceding sibling span element. Then select the parent element and, within it, the span element to compare its content. For parsing a DOM tree, jsoup is a good option.
Example Code
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

String source = "<div class=\"div\"><span>my y favourite text </span><a href=\"http://www.mylink.com\">my link </a></div>" +
        "<div class=\"div\"><span>my y favourite 2 text </span><a href=\"/some-link.html\">my link 1</a></div>" +
        "<div class=\"div\"><span>my y text </span><a href=\"http://www.mylink.com\">my link 2</a></div>";
String searchWord = "favourite";
// The second argument of Jsoup.parse(String, String) is the base URI, not a charset.
// It is only needed here to resolve relative links in this local example.
Document doc = Jsoup.parse(source, "http://some-source.com");
// Every <a href> that has a preceding sibling <span>
for (Element el : doc.select("span ~ a[href]")) {
    Element parent = el.parent();
    if (parent.select("span").text().contains(searchWord)) {
        String spanContent = parent.select("span").first().text();
        String link = el.absUrl("href"); // absolute URL of the matched link itself
        System.out.println(spanContent + " -> " + link); // do something useful with the matches
    }
}
Output
my y favourite text -> http://www.mylink.com
my y favourite 2 text -> http://some-source.com/some-link.html
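A possible variation, as a rough sketch: jsoup's `:containsOwn(text)` pseudo-selector can do the text check inside the selector itself, so only the matching span elements are visited. The class name `SpanLinkFinder` and the helper `findMatches` are made up for this illustration. Two assumptions to be aware of: `:containsOwn` matches case-insensitively (unlike `String.contains`), and the search word must not contain parentheses or other selector syntax, since it is concatenated into the query.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class SpanLinkFinder {

    // Hypothetical helper: returns "span text -> absolute link" for every <a href>
    // that is a following sibling of a <span> whose own text contains searchWord.
    static List<String> findMatches(String html, String baseUri, String searchWord) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> matches = new ArrayList<>();
        // Assumption: searchWord contains no selector syntax such as parentheses.
        for (Element span : doc.select("span:containsOwn(" + searchWord + ")")) {
            // Walk the siblings that come after the matching span.
            for (Element sib = span.nextElementSibling(); sib != null; sib = sib.nextElementSibling()) {
                if (sib.tagName().equals("a") && sib.hasAttr("href")) {
                    matches.add(span.text() + " -> " + sib.absUrl("href"));
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String source = "<div class=\"div\"><span>my y favourite text </span><a href=\"http://www.mylink.com\">my link </a></div>"
                + "<div class=\"div\"><span>my y favourite 2 text </span><a href=\"/some-link.html\">my link 1</a></div>"
                + "<div class=\"div\"><span>my y text </span><a href=\"http://www.mylink.com\">my link 2</a></div>";
        for (String match : findMatches(source, "http://some-source.com", "favourite")) {
            System.out.println(match);
        }
    }
}
```

With the same sample markup and base URI as in the example code, this should report the same two matches as listed under Output. If the search must be case-sensitive, the loop-and-`contains` approach is the safer choice.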
Upvotes: 2