BountyHunter

Reputation: 1411

Parse XML and get data inside value attribute of tags

I am new to XML parsing. I have read about the DOM and SAX parsers and tried a few sample implementations. However, I am unable to parse the following XML data:

<?xml version="1.0" ?>
<collection>
<action value="submit"/>
<protocol_version value="1"/>
<reponse value="Success"/>
<batch>
    <sample>
        <count value="1"/>
        <count2 value="2"/>
        <count3 value="3"/>
    </sample>
    <sample_2>
        <date value="10/10/2010"/>
        <page value="SampleData"/>
        <track value="123123123"/>
        <same value="1.00"/>
        <data>
            <first_name value="Jeffrey"/>
            <SSID value="1231231231"/>
            <last_name value="Chuckle"/>
            <field1 value="123123123"/>
            <field2 value="Sam E. Bonzella"/>
            <field3 value="SOME VALUE"/>
            <field4 value="SOME VALUE 2"/>
            <field5 value="TEXT"/>
            <field6 value="12312"/>
        </data>
    </sample_2>
</batch>
</collection>

Below is the sample code I tried. It requires repetitive code, and the resulting data is not organised. I also tried a JAXB parser but was unable to fetch the value attribute.

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class test {
    public static void main(String[] args) {
        try {
            // Load and parse the XML file into a DOM tree
            File inputFile = new File("staff.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(inputFile);
            doc.getDocumentElement().normalize();
            System.out.println("Base :" + doc.getDocumentElement().getNodeName());

            // Look up every <action> element and print its "value" attribute
            NodeList nList = doc.getElementsByTagName("action");
            for (int temp = 0; temp < nList.getLength(); temp++) {
                Node nNode = nList.item(temp);
                System.out.println("Element :" + nNode.getNodeName());
                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("Action : " + eElement.getAttribute("value"));
                }
            }

            // The same loop has to be repeated for each further tag name
            nList = doc.getElementsByTagName("transaction_count");
            for (int temp = 0; temp < nList.getLength(); temp++) {
                Node nNode = nList.item(temp);
                System.out.println("Element :" + nNode.getNodeName());
                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("transaction_count : " + eElement.getAttribute("value"));
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Ideally, I want to parse the data into an array or maybe a Map.

Upvotes: 2

Views: 7077

Answers (1)

gr7

Reputation: 82

getElementsByTagName(String name) is not a good fit here, because every tag name would have to be listed explicitly.
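As a side note, the DOM API does define the special name "*", which matches all elements. The sketch below is only an illustration of that alternative, not the approach used in the rest of this answer: the class name WildcardValues, the collect method, and parsing from an in-memory string are assumptions made to keep the example self-contained. Unlike the recursive approach, it also visits elements nested inside an element that already has a value attribute.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical class name, used only for this illustration.
class WildcardValues {

    // Returns tag name -> "value" attribute for every element that has one.
    static Map<String, String> collect(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        Map<String, String> map = new LinkedHashMap<>();
        // "*" matches all elements; they are returned in document order.
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element element = (Element) all.item(i);
            if (element.hasAttribute("value")) {
                map.put(element.getTagName(), element.getAttribute("value"));
            }
        }
        return map;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<collection><action value=\"submit\"/>"
                + "<batch><sample><count value=\"1\"/></sample></batch></collection>";
        System.out.println(collect(xml)); // prints {action=submit, count=1}
    }
}
```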

The XML above contains elements that fall into two categories:

  1. Elements with a value attribute. If I understand the question correctly, the tag name and value should be stored in the map.

  2. Elements without a value attribute. They contain other elements, and their tag names should not be stored.

The elements can be processed recursively: if an element has a "value" attribute, it is stored in the map; otherwise, the child nodes of that element are checked.

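import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// The class name is arbitrary; any name works.
public class XmlValueParser {

    public static void main(String[] args) {
        // LinkedHashMap keeps the entries in document order
        Map<String, String> map = new LinkedHashMap<>();

        try {
            File fXmlFile = new File("staff.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(fXmlFile);
            doc.getDocumentElement().normalize();

            // Start the recursive search at the <collection> root element
            NodeList collectionNodeList = doc.getElementsByTagName("collection");
            Element collectionElement = (Element) collectionNodeList.item(0);
            findElementsWithValues(map, collectionElement);
        } catch (Exception e) {
            e.printStackTrace();
        }

        System.out.println("Found values: " + map.size());
        System.out.println(map);
    }

    // Stores tagName -> value for each descendant that has a "value" attribute;
    // elements without one are treated as containers and searched recursively.
    private static void findElementsWithValues(Map<String, String> map, Element rootElement) {
        NodeList childNodes = rootElement.getChildNodes();
        for (int i = 0; i < childNodes.getLength(); i++) {
            Node node = childNodes.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                Element element = (Element) node;
                // hasAttribute also keeps elements whose value is the empty string
                if (element.hasAttribute("value")) {
                    // Note: a repeated tag name overwrites the earlier entry
                    map.put(element.getTagName(), element.getAttribute("value"));
                } else {
                    findElementsWithValues(map, element);
                }
            }
        }
    }
}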

Output (after correcting the XML file above to make it parsable):

Found values: 19
{action=submit, protocol_version=1, reponse=Success, count=1, count2=2, count3=3, date=10/10/2010, page=SampleData, track=123123123, same=1.00, first_name=Jeffrey, SSID=1231231231, last_name=Chuckle, field1=123123123, field2=Sam E. Bonzella, field3=SOME VALUE, field4=SOME VALUE 2, field5=TEXT, field6=12312}
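One thing to keep in mind with a map keyed by tag name: if the same tag name appears in more than one place, the later value overwrites the earlier one. The sample XML has no such duplicates, but if your real data does, one possible variation is to key each entry by its path below the root. The following is a minimal sketch under that assumption; the class name PathKeyedValues, the "/" separator, and parsing from a string are all choices made for the illustration. Repeated siblings with the same path would still collide, so a Map<String, List<String>> may be needed in that case.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical class name, used only for this illustration.
class PathKeyedValues {

    // Parses the XML string and returns path -> "value" attribute.
    static Map<String, String> parse(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        Map<String, String> map = new LinkedHashMap<>();
        collect(map, root, "");
        return map;
    }

    // Same recursion as above, but the key is the path of tag names
    // from just below the root down to the element, joined with "/".
    private static void collect(Map<String, String> map, Element parent, String prefix) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes, comments, etc.
            }
            Element element = (Element) node;
            String path = prefix.isEmpty()
                    ? element.getTagName()
                    : prefix + "/" + element.getTagName();
            if (element.hasAttribute("value")) {
                map.put(path, element.getAttribute("value"));
            } else {
                collect(map, element, path);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<collection><a><id value=\"1\"/></a><b><id value=\"2\"/></b></collection>";
        System.out.println(parse(xml)); // prints {a/id=1, b/id=2}
    }
}
```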

Upvotes: 3
