Mostch Romi

Reputation: 541

Read a complex XML file in Java

I am able to read many types of XML files in Java, but today I got an XML file whose details I am not able to read.

<ENVELOPE>
    <BILLFIXED>
        <BILLDATE>1-Jul-2017</BILLDATE>
        <BILLREF>1</BILLREF>
        <BILLPARTY>Party1</BILLPARTY>
    </BILLFIXED>
    <BILLCL>-10800.00</BILLCL>
    <BILLPDC/>
    <BILLFINAL>-10800.00</BILLFINAL>
    <BILLDUE>1-Jul-2017</BILLDUE>
    <BILLOVERDUE>30</BILLOVERDUE>
    <BILLFIXED>
        <BILLDATE>1-Jul-2017</BILLDATE>
        <BILLREF>2</BILLREF>
        <BILLPARTY>Party2</BILLPARTY>
    </BILLFIXED>
    <BILLCL>-2000.00</BILLCL>
    <BILLPDC/>
    <BILLFINAL>-2000.00</BILLFINAL>
    <BILLDUE>1-Jul-2017</BILLDUE>
    <BILLOVERDUE>30</BILLOVERDUE>
    <BILLFIXED>
        <BILLDATE>1-Jul-2017</BILLDATE>
        <BILLREF>3</BILLREF>
        <BILLPARTY>Party3</BILLPARTY>
    </BILLFIXED>
    <BILLCL>-1416.00</BILLCL>
    <BILLPDC/>
    <BILLFINAL>-1416.00</BILLFINAL>
    <BILLDUE>31-Jul-2017</BILLDUE>
    <BILLOVERDUE>0</BILLOVERDUE>
</ENVELOPE>

I am using this code to read the XML file. I am able to read the data inside the <BILLFIXED> tag, but not the data outside it, such as <BILLFINAL> and <BILLDUE>.

try {
    File fXmlFile = new File("filepath");
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(fXmlFile);

    doc.getDocumentElement().normalize();
    NodeList billNodeList = doc.getElementsByTagName("ENVELOPE");
    for (int i = 0; i < billNodeList.getLength(); i++) {
        Node voucherNode = billNodeList.item(i);
        Element voucherElement = (Element) voucherNode;
        NodeList nList = voucherElement.getElementsByTagName("BILLFIXED");

        for (int temp = 0; temp < nList.getLength(); temp++) {
            Node insideNode = nList.item(temp);
            Element voucherElements = (Element) insideNode;
            System.out.println(voucherElements.getElementsByTagName("BILLDATE").item(0).getTextContent());
            System.out.println(voucherElements.getElementsByTagName("BILLREF").item(0).getTextContent());
            System.out.println(voucherElements.getElementsByTagName("BILLPARTY").item(0).getTextContent());
            System.out.println(voucherElements.getElementsByTagName("BILLFINAL").item(0).getTextContent());
            System.out.println(voucherElements.getElementsByTagName("BILLOVERDUE").item(0).getTextContent());
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}

I have tried every approach I know, but I have not been able to find a solution. If anyone has a solution, please share it with me.

Upvotes: 0

Views: 160

Answers (3)

Krishna

Reputation: 473

As per the sample XML, it has data for 3 records, but nothing separates one record from the next. It looks like each field's data was populated into an XML tag and written to the file.

There are 2 possible options I would suggest:

  1. Java based: As Andreas suggested, read the file content and add a wrapping tag around each record, which gives a well-defined XML structure that is easier to handle. There may be a performance impact when the input file is large.
  2. Transformation based: Try an STX transformation, which would convert the structure to the required format, either XML or even a flat file. Processing would then be simpler.
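
Regarding the large-file concern in option 1, one possibility is to stream the file rather than build a full DOM. The following is only a rough sketch using the JDK's StAX reader (javax.xml.stream), which is a different technique from the two options above. The class name StreamingBillReader, the method readBills and the shortened two-record SAMPLE are made up for illustration, and the code assumes every record starts with <BILLFIXED>, as in the question's sample.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StreamingBillReader {

    // Shortened version of the question's sample (two records only).
    static final String SAMPLE =
        "<ENVELOPE>"
      + "<BILLFIXED><BILLDATE>1-Jul-2017</BILLDATE><BILLREF>1</BILLREF>"
      + "<BILLPARTY>Party1</BILLPARTY></BILLFIXED>"
      + "<BILLCL>-10800.00</BILLCL><BILLPDC/><BILLFINAL>-10800.00</BILLFINAL>"
      + "<BILLDUE>1-Jul-2017</BILLDUE><BILLOVERDUE>30</BILLOVERDUE>"
      + "<BILLFIXED><BILLDATE>1-Jul-2017</BILLDATE><BILLREF>2</BILLREF>"
      + "<BILLPARTY>Party2</BILLPARTY></BILLFIXED>"
      + "<BILLCL>-2000.00</BILLCL><BILLPDC/><BILLFINAL>-2000.00</BILLFINAL>"
      + "<BILLDUE>1-Jul-2017</BILLDUE><BILLOVERDUE>30</BILLOVERDUE>"
      + "</ENVELOPE>";

    // Reads the stream once and starts a new record at every <BILLFIXED>.
    static List<Map<String, String>> readBills(Reader source) {
        List<Map<String, String>> bills = new ArrayList<>();
        Map<String, String> current = null;   // record being filled
        StringBuilder text = new StringBuilder();
        int depth = 0;                        // 1 = root element
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(source);
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    text.setLength(0);
                    if ("BILLFIXED".equals(reader.getLocalName())) {
                        current = new LinkedHashMap<>();
                        bills.add(current);
                    }
                } else if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) {
                    text.append(reader.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    String name = reader.getLocalName();
                    // Skip the root and the BILLFIXED container itself.
                    if (depth > 1 && current != null && !"BILLFIXED".equals(name)) {
                        current.put(name, text.toString().trim());
                    }
                    text.setLength(0);
                    depth--;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not read XML", e);
        }
        return bills;
    }

    public static void main(String[] args) {
        for (Map<String, String> bill : readBills(new StringReader(SAMPLE))) {
            System.out.println(bill);
        }
    }
}
```

For a real file, a FileReader or an InputStream-based reader could be passed in instead of the StringReader; only one record's worth of text is buffered at a time, although the returned list still holds all records.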

Upvotes: 0

Andreas

Reputation: 159175

One way to do it is to "fix" the XML to be more well-structured, e.g. like this:

// Fix the XML
Element envelopeElem = doc.getDocumentElement();
List<Node> children = new ArrayList<>();
for (Node child = envelopeElem.getFirstChild(); child != null; child = child.getNextSibling())
    children.add(child);
Element billElem = null;
for (Node child : children) {
    if (child.getNodeType() == Node.ELEMENT_NODE && "BILLFIXED".equals(child.getNodeName()))
        envelopeElem.insertBefore(billElem = doc.createElement("BILL"), child);
    if (billElem != null)
        billElem.appendChild(child);
}

The code basically creates a new <BILL> element as a child of <ENVELOPE> whenever it encounters a <BILLFIXED> element, then moves all subsequent nodes into the <BILL> element.

The result is that the XML in the DOM tree looks like this (1), which should be easier for you to process:

<ENVELOPE>
    <BILL>
        <BILLFIXED>
            <BILLDATE>1-Jul-2017</BILLDATE>
            <BILLREF>1</BILLREF>
            <BILLPARTY>Party1</BILLPARTY>
        </BILLFIXED>
        <BILLCL>-10800.00</BILLCL>
        <BILLPDC/>
        <BILLFINAL>-10800.00</BILLFINAL>
        <BILLDUE>1-Jul-2017</BILLDUE>
        <BILLOVERDUE>30</BILLOVERDUE>
    </BILL>
    <BILL>
        <BILLFIXED>
            <BILLDATE>1-Jul-2017</BILLDATE>
            <BILLREF>2</BILLREF>
            <BILLPARTY>Party2</BILLPARTY>
        </BILLFIXED>
        <BILLCL>-2000.00</BILLCL>
        <BILLPDC/>
        <BILLFINAL>-2000.00</BILLFINAL>
        <BILLDUE>1-Jul-2017</BILLDUE>
        <BILLOVERDUE>30</BILLOVERDUE>
    </BILL>
    <BILL>
        <BILLFIXED>
            <BILLDATE>1-Jul-2017</BILLDATE>
            <BILLREF>3</BILLREF>
            <BILLPARTY>Party3</BILLPARTY>
        </BILLFIXED>
        <BILLCL>-1416.00</BILLCL>
        <BILLPDC/>
        <BILLFINAL>-1416.00</BILLFINAL>
        <BILLDUE>31-Jul-2017</BILLDUE>
        <BILLOVERDUE>0</BILLOVERDUE>
    </BILL>
</ENVELOPE>

1) The XML has been reformatted for human readability, i.e. it has been re-indented.
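
Once the nodes are regrouped, each bill can be read by looking up tags inside its own <BILL> element. Here is a minimal end-to-end sketch, assuming the XML is available as a string. The class name BillRegroupDemo, the helpers parse, regroup, text and readBills, and the shortened two-bill SAMPLE are only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BillRegroupDemo {

    // Shortened version of the question's sample (two bills only).
    static final String SAMPLE =
        "<ENVELOPE>"
      + "<BILLFIXED><BILLDATE>1-Jul-2017</BILLDATE><BILLREF>1</BILLREF>"
      + "<BILLPARTY>Party1</BILLPARTY></BILLFIXED>"
      + "<BILLCL>-10800.00</BILLCL><BILLPDC/><BILLFINAL>-10800.00</BILLFINAL>"
      + "<BILLDUE>1-Jul-2017</BILLDUE><BILLOVERDUE>30</BILLOVERDUE>"
      + "<BILLFIXED><BILLDATE>1-Jul-2017</BILLDATE><BILLREF>2</BILLREF>"
      + "<BILLPARTY>Party2</BILLPARTY></BILLFIXED>"
      + "<BILLCL>-2000.00</BILLCL><BILLPDC/><BILLFINAL>-2000.00</BILLFINAL>"
      + "<BILLDUE>1-Jul-2017</BILLDUE><BILLOVERDUE>30</BILLOVERDUE>"
      + "</ENVELOPE>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Same regrouping as above: each BILLFIXED and the nodes that follow it
    // are moved into a newly created BILL element.
    static void regroup(Document doc) {
        Element envelope = doc.getDocumentElement();
        List<Node> children = new ArrayList<>();
        for (Node c = envelope.getFirstChild(); c != null; c = c.getNextSibling()) {
            children.add(c);
        }
        Element bill = null;
        for (Node c : children) {
            if (c.getNodeType() == Node.ELEMENT_NODE && "BILLFIXED".equals(c.getNodeName())) {
                bill = doc.createElement("BILL");
                envelope.insertBefore(bill, c);
            }
            if (bill != null) {
                bill.appendChild(c);
            }
        }
    }

    // Text of the first descendant with the given tag, or "" if there is none.
    static String text(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? "" : list.item(0).getTextContent();
    }

    // One summary line per bill: ref | party | final | due | overdue.
    static List<String> readBills(String xml) {
        Document doc = parse(xml);
        regroup(doc);
        List<String> lines = new ArrayList<>();
        NodeList bills = doc.getElementsByTagName("BILL");
        for (int i = 0; i < bills.getLength(); i++) {
            Element bill = (Element) bills.item(i);
            lines.add(String.join(" | ",
                    text(bill, "BILLREF"), text(bill, "BILLPARTY"),
                    text(bill, "BILLFINAL"), text(bill, "BILLDUE"),
                    text(bill, "BILLOVERDUE")));
        }
        return lines;
    }

    public static void main(String[] args) {
        readBills(SAMPLE).forEach(System.out::println);
    }
}
```

Because getElementsByTagName searches all descendants, the lookup on a <BILL> element finds both the fields nested in <BILLFIXED> and the ones that used to sit beside it.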

Upvotes: 1

nullTerminator

Reputation: 396

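It isn't well-structured XML. Inside your <ENVELOPE> tags there is nothing to indicate the start of each set of six elements that constitute a 'bill'. You'd normally expect each set to be enclosed in its own <BILL> and </BILL> tags. The file still parses, but the missing grouping is going to confuse any code that looks up fields relative to <BILLFIXED>, because <BILLFINAL> and the others are its siblings, not its children.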
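
If the file has to be read as it is, one workaround is to treat each <BILLFIXED> as the start of a bill and collect the sibling elements that follow it until the next <BILLFIXED>. This is a minimal sketch under that assumption; the class name FlatBillReader, the method readBills and the shortened two-bill SAMPLE are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class FlatBillReader {

    // Shortened version of the question's sample (two bills only).
    static final String SAMPLE =
        "<ENVELOPE>"
      + "<BILLFIXED><BILLDATE>1-Jul-2017</BILLDATE><BILLREF>1</BILLREF>"
      + "<BILLPARTY>Party1</BILLPARTY></BILLFIXED>"
      + "<BILLCL>-10800.00</BILLCL><BILLPDC/><BILLFINAL>-10800.00</BILLFINAL>"
      + "<BILLDUE>1-Jul-2017</BILLDUE><BILLOVERDUE>30</BILLOVERDUE>"
      + "<BILLFIXED><BILLDATE>1-Jul-2017</BILLDATE><BILLREF>2</BILLREF>"
      + "<BILLPARTY>Party2</BILLPARTY></BILLFIXED>"
      + "<BILLCL>-2000.00</BILLCL><BILLPDC/><BILLFINAL>-2000.00</BILLFINAL>"
      + "<BILLDUE>1-Jul-2017</BILLDUE><BILLOVERDUE>30</BILLOVERDUE>"
      + "</ENVELOPE>";

    // Returns one map per bill, keyed by tag name, without changing the DOM.
    static List<Map<String, String>> readBills(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        List<Map<String, String>> bills = new ArrayList<>();
        Map<String, String> current = null;
        // Walk the direct children of the root element in document order.
        for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes between elements
            }
            if ("BILLFIXED".equals(n.getNodeName())) {
                // A new bill starts here; copy the fields nested inside it.
                current = new LinkedHashMap<>();
                bills.add(current);
                for (Node f = n.getFirstChild(); f != null; f = f.getNextSibling()) {
                    if (f.getNodeType() == Node.ELEMENT_NODE) {
                        current.put(f.getNodeName(), f.getTextContent());
                    }
                }
            } else if (current != null) {
                // A sibling such as BILLFINAL belongs to the bill opened last.
                current.put(n.getNodeName(), n.getTextContent());
            }
        }
        return bills;
    }

    public static void main(String[] args) {
        for (Map<String, String> bill : readBills(SAMPLE)) {
            System.out.println(bill.get("BILLREF") + ": final=" + bill.get("BILLFINAL")
                    + ", overdue=" + bill.get("BILLOVERDUE"));
        }
    }
}
```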

Upvotes: 0
