doublemc

Reputation: 3311

Can't parse XML (from web) using JSoup

I am trying to work with small XML files sent from the web and parse a few attributes from them. How would I approach this in JSoup? I know it is an HTML parser rather than an XML parser, but it supports XML too, and I don't have to build any handlers, builder factories, and such, as I would with DOM, SAX, etc.

Here is an example XML: LINK. I can't paste it here because it exits the code tag after every line; if someone can fix that, I would be grateful.

And here is my piece of code:

String xml = "http://www.omdbapi.com/?t=Private%20Ryan&y=&plot=short&r=xml";
Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
// I want to select the first occurrence of the genre tag. There is only one, but it
// doesn't work without .first(), and even with it the tag isn't found
Element genreFromXml = doc.select("genre").first();
String genre = genreFromXml.text();
System.out.println(genre);

It results in NPE at:

String genre = genreFromXml.text();

Upvotes: 0

Views: 2386

Answers (1)

Nicolas Filotto

Reputation: 45005

There are 2 issues in your code:

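  1. You pass the URL as a String, but parse(String html, String baseUri, Parser parser) expects the XML content itself, so the URL text is parsed as if it were the document. Use parse(InputStream in, String charsetName, String baseUri, Parser parser) instead to read the XML from an input stream.
  2. There is no genre element in your XML; genre is an attribute of the movie element. Because select("genre") matches nothing, first() returns null, and calling text() on it throws the NPE.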

Here is what your code should look like:

String url = "http://www.omdbapi.com/?t=Private%20Ryan&y=&plot=short&r=xml";
// Parse the doc using an XML parser
Document doc = Jsoup.parse(new URL(url).openStream(), "UTF-8", "", Parser.xmlParser());
// Select the first element "movie"
Element movieFromXml = doc.select("movie").first();
// Get its attribute "genre"
String genre = movieFromXml.attr("genre");
// Print the result
System.out.println(genre);

Output:

Drama, War
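
To reproduce both issues without a network call, here is a minimal sketch that parses an inline string. The class name GenreDemo, the helper extractGenre and the SAMPLE_XML constant are hypothetical names introduced for illustration, and the sample XML is a hand-written stand-in that only assumes the response has a movie element carrying a genre attribute, as described above.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class GenreDemo {
    // Hand-written stand-in for the service response (assumed shape, not real output)
    public static final String SAMPLE_XML =
        "<root response=\"True\">"
      + "<movie title=\"Saving Private Ryan\" genre=\"Drama, War\"/>"
      + "</root>";

    // Returns the genre attribute of the first <movie> element,
    // or null when the parsed content contains no <movie> element
    public static String extractGenre(String xmlContent) {
        Document doc = Jsoup.parse(xmlContent, "", Parser.xmlParser());
        Element movie = doc.select("movie").first();
        return movie == null ? null : movie.attr("genre");
    }

    public static void main(String[] args) {
        // Issue 1: a URL passed as a String is parsed as plain text, so no element is found
        String url = "http://www.omdbapi.com/?t=Private%20Ryan&y=&plot=short&r=xml";
        System.out.println(extractGenre(url));

        // Issue 2: genre is an attribute, so selecting a <genre> element matches nothing
        Document doc = Jsoup.parse(SAMPLE_XML, "", Parser.xmlParser());
        System.out.println(doc.select("genre").isEmpty());

        // Reading the attribute from the <movie> element works
        System.out.println(extractGenre(SAMPLE_XML));
    }
}
```

The null check in extractGenre is a design choice for this sketch: it turns the missing-element case into a value the caller can test instead of an NPE.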

Upvotes: 3
