Reputation: 33213
I have the following XML and I am trying to parse it.
<employee>
<personal>
<id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>
<name>Lareina</name>
<age>50</age>
</personal>
<contact>
<dept>Fusce</dept>
<manager>CB9A0BB76</manager>
</contact>
</employee>
However, I am not able to do so. My code is below. It does work for the flat XML in the commented-out xmlString (uncomment it to try).
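// Imports assume JDOM 2 (org.jdom2); for JDOM 1.x use the org.jdom package instead.
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.List;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;

public class XMLReader {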
public static void main(String[] args) throws JDOMException, IOException {
//String xmlString = "<employee >\n <firstname xml:space=\"preserve\" >John</firstname>\n <lastname>Watson</lastname>\n <age>30</age>\n <email>[email protected]</email>\n</employee>";
String xmlString = "<employee>\n" +
" <personal><id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>\n" +
" <name>Lareina</name>\n" +
" <age>50</age>\n" +
" </personal><contact><dept>Fusce</dept>\n" +
" <manager>B55E6DA8-76BD-A3C8-2DDF-686CB9A0BB76</manager></contact>\n" +
" </employee>";
System.out.println(xmlString);
SAXBuilder builder = new SAXBuilder();
Reader in = new StringReader(xmlString);
Document doc = builder.build(in);
Element root = doc.getRootElement();
List children = root.getChildren();
//System.out.println(children);
String value = "";
for (int i = 0; i < children.size(); i++) {
Element dataNode = (Element) children.get(i);
// Element dataNode = (Element) dataNodes.get(j);
value += ", " +dataNode.getText().trim();
System.out.println(dataNode.getName() + " : " + dataNode.getText());
//context.write(new Text(rowKey.toString()), new Text(node.getName().trim() + " " + node.getText().trim()));
}
//System.out.println(in);
}
}
Upvotes: 1
Views: 101
Reputation: 279880
Your two xml strings are different. The first is
<employee>
<firstname xml:space="preserve">John</firstname>
<lastname>Watson</lastname>
<age>30</age>
<email>[email protected]</email>
</employee>
This has four children, each of which contains text, so it prints
firstname : John
lastname : Watson
age : 30
email : [email protected]
And the second is
<employee>
<personal>
<id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>
<name>Lareina</name>
<age>50</age>
</personal>
<contact>
<dept>Fusce</dept>
<manager>B55E6DA8-76BD-A3C8-2DDF-686CB9A0BB76</manager>
</contact>
</employee>
In this last one, the root has only two children, personal and contact. Neither has text of its own, only nested elements and whitespace, so you get output like
personal :
contact :
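This is the expected output: getText() returns only the text held directly inside an element, not the text of its descendants. To print the nested values, iterate over the children of personal and contact as well.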
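To reach the nested values, one option is to recurse into each element and print only the leaves. The following is a minimal sketch, not the asker's code: it uses the JDK's built-in DOM parser instead of JDOM so that it runs without extra dependencies, and the class name LeafPrinter and its helper methods are made up for illustration. With JDOM, the same recursion could be written with getChildren() and getText().

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class: walks the tree and reports only leaf elements.
public class LeafPrinter {

    // Adds "name : text" to out for every element that has no element children.
    static void collectLeaves(Element element, List<String> out) {
        boolean hasElementChild = false;
        NodeList kids = element.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node kid = kids.item(i);
            if (kid.getNodeType() == Node.ELEMENT_NODE) {
                hasElementChild = true;
                collectLeaves((Element) kid, out); // descend one level
            }
        }
        if (!hasElementChild) {
            out.add(element.getTagName() + " : " + element.getTextContent().trim());
        }
    }

    // Parses the XML string and returns the leaf lines in document order.
    static List<String> leaves(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> out = new ArrayList<>();
        collectLeaves(doc.getDocumentElement(), out);
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employee>\n"
                + " <personal><id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>\n"
                + " <name>Lareina</name>\n"
                + " <age>50</age>\n"
                + " </personal><contact><dept>Fusce</dept>\n"
                + " <manager>B55E6DA8-76BD-A3C8-2DDF-686CB9A0BB76</manager></contact>\n"
                + " </employee>";
        for (String line : leaves(xml)) {
            System.out.println(line);
        }
    }
}
```

Because it recurses until it finds elements without element children, the same method handles both the flat XML and the nested one.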
Upvotes: 2