Jessy Jameson

Reputation: 207

Extract links containing a given string from a document with jsoup

I use jsoup to extract the links from a website. I want to extract only specific links, namely the ones that contain a keyword such as "download". How can I do this? I have the following code:

Document doc = Jsoup.parse(new URL("http://www.examplesite.com"), 10000); // second argument: timeout in milliseconds
Element link = doc.select("a").first();

Upvotes: 3

Views: 5133

Answers (2)

Ahmed Soliman

Reputation: 1710

You can use jsoup's attribute selectors:

[attr^=value] matches elements whose attribute value starts with value, [attr$=value] matches those whose value ends with it, and [attr*=value] matches those whose value contains it, e.g. [href*=/path/].

To get the links whose href contains a certain word, use this:

org.jsoup.select.Elements links = doc.select("[href*=download]");
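A minimal sketch of how that selector might be used end to end, assuming the page is already available as an HTML string. The class name DownloadLinksByHref, the method downloadHrefs, and the sample markup are made up for illustration; the selector is narrowed to a[href*=download] so that only anchors are returned.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class DownloadLinksByHref {

    // Hypothetical helper: collects the href of every <a> whose href
    // attribute contains the substring "download".
    static List<String> downloadHrefs(String html) {
        Document doc = Jsoup.parse(html);
        List<String> hrefs = new ArrayList<>();
        for (Element link : doc.select("a[href*=download]")) {
            hrefs.add(link.attr("href"));
        }
        return hrefs;
    }

    public static void main(String[] args) {
        // Sample markup, invented for this example.
        String html = "<a href='/files/download/report.pdf'>Report</a>"
                + "<a href='/about'>About</a>"
                + "<a href='/download.html'>Get it</a>";
        System.out.println(downloadHrefs(html));
    }
}
```

For a live page, the same select call should work on the Document obtained from the URL, as in the question.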

Upvotes: 0

Hauke Ingmar Schmidt

Reputation: 11607

See the jsoup Selector documentation for the selector syntax.

You can test for the text within a node with :contains, e.g. Element link = doc.select("a:contains(Download)").first();. If you want, you can use :matches for a regex match instead.

You get the link address via the attr method, e.g. String linkaddress = link.attr("href");.
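Put together, the two steps above could look roughly like the sketch below. It assumes the HTML is already in a string; the class name FirstDownloadLink, the method firstDownloadHref, and the sample markup are hypothetical. Because first() returns null when nothing matches, the sketch checks for that before reading the attribute.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class FirstDownloadLink {

    // Hypothetical helper: returns the href of the first <a> whose text
    // contains "Download", or null when no such link exists.
    static String firstDownloadHref(String html) {
        Document doc = Jsoup.parse(html);
        Element link = doc.select("a:contains(Download)").first();
        return link == null ? null : link.attr("href");
    }

    public static void main(String[] args) {
        // Sample markup, invented for this example.
        String html = "<a href='/home'>Home</a>"
                + "<a href='/get/app.zip'>Download now</a>";
        System.out.println(firstDownloadHref(html));
    }
}
```

Note that this matches on the visible link text, not on the URL, so it finds links labelled "Download" even when the href itself does not contain that word.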

Upvotes: 5
