user801154

How to extract JavaScript link from HTML page in Java?

I have HTML pages as Strings in Java and I need to extract the JavaScript links from them. Is there a good, easy-to-use library for this? I looked at Cobra and Neko, but I don't think (maybe I'm wrong) that they offer what I need, such as getting the content of specific tags.

Upvotes: 0

Views: 899

Answers (1)

nicholas.hauschild

Reputation: 42849

Take a look at jsoup. It is an HTML parser with a selector DSL (domain-specific language), modeled on CSS selectors, for finding elements in the DOM.

For example, to find all <a> tags that have an href attribute, you would do this:

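// Fetch and parse the page over HTTP (get() throws IOException)
Document doc = Jsoup.connect("http://www.google.com/").get();
// Select every <a> element that carries an href attribute
Elements hrefAnchors = doc.select("a[href]");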

If you already have the HTML downloaded as a String, you can use the Jsoup.parse(String) method instead:

String html = "<p>Welcome to <a href='http://www.google.com/'>Google</a>.</p>";
Document doc = Jsoup.parse(html);
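
Applied to the question itself, the same approach might look like the sketch below. It assumes that "JavaScript links" means the src attributes of <script> tags, and that jsoup is on the classpath. The class name ScriptLinkExtractor, the method extractScriptLinks, and the example.com base URL are made up for illustration; they are not part of jsoup.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class, not part of jsoup.
public class ScriptLinkExtractor {

    // Returns the src value of every <script> element that has one.
    // baseUri is used to resolve relative URLs; pass "" to keep them as written.
    static List<String> extractScriptLinks(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> links = new ArrayList<>();
        for (Element script : doc.select("script[src]")) {
            // "abs:src" asks jsoup for the URL resolved against baseUri;
            // it is empty when no absolute URL can be built, so fall back to the raw value.
            String resolved = script.attr("abs:src");
            links.add(resolved.isEmpty() ? script.attr("src") : resolved);
        }
        return links;
    }

    public static void main(String[] args) {
        // Assumed sample input: one external script and one inline script.
        String html = "<html><head>"
                + "<script src='/js/app.js'></script>"
                + "<script>var inline = true;</script>"
                + "</head><body><p>Hello</p></body></html>";
        for (String link : extractScriptLinks(html, "http://example.com/")) {
            System.out.println(link);
        }
    }
}
```

Inline scripts have no src attribute, so the script[src] selector skips them. If "JavaScript links" instead means anchors such as <a href="javascript:...">, a prefix selector like a[href^=javascript:] should be the place to start.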

Upvotes: 1
