Concatenate two org.w3c.dom.Document

Question

I would like to concatenate two org.w3c.dom.Document objects. I have something like this:

Document finalDocument = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
Document document1 = createDocumentOne();
Document document2 = createDocumentTwo();

// This didn't work
finalDocument.appendChild(document1);
finalDocument.appendChild(document2);

document1 and document2 both look something like this:

<headerTag>
    <tag1>value</tag1>
</headerTag>

And what I want is, at the end, a Document like this:

<headerTag>
    <tag1>valueForDocument1</tag1>
</headerTag>
<headerTag>
    <tag1>valueForDocument2</tag1>
</headerTag>

I suspect this cannot be done directly, because the two root elements need a common parent. If so, I would like to create a "fake" parent, concatenate the documents under it, and then recover only the list of headerTag elements.

How can I do this?

A4L · Accepted Answer

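You were on the right track: create a new Document, parse the parts, and add their nodes to the new document.

Your approach failed because you tried to append a whole document to another one, which is not possible. A Document node cannot be the child of another node, and a node that belongs to one document has to be imported into the target document (for example with importNode) before it can be appended there.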
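To make the failure concrete, here is a minimal sketch. The class and method names (AppendVersusImport, emptyDocument, sample, and so on) are made up for illustration and are not part of the original code. It first tries to append one Document to another, which the DOM rejects with a DOMException, and then attaches a copy of the other document's root element via importNode.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical demo class; all names here are illustrative only.
final class AppendVersusImport {

    // Creates an empty Document, rethrowing the checked exception to keep the demo short.
    static Document emptyDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds <headerTag><tag1>value</tag1></headerTag> in a document of its own.
    static Document sample(String value) {
        Document doc = emptyDocument();
        Element header = doc.createElement("headerTag");
        Element tag1 = doc.createElement("tag1");
        tag1.setTextContent(value);
        header.appendChild(tag1);
        doc.appendChild(header);
        return doc;
    }

    // Returns true if appending a whole Document to another Document is rejected.
    static boolean appendingDocumentFails(Document target, Document other) {
        try {
            target.appendChild(other);
            return false;
        } catch (DOMException e) {
            return true;
        }
    }

    // Deep-copies the other document's root element into target and appends the copy to parent.
    static Node importRoot(Document target, Element parent, Document other) {
        Node copy = target.importNode(other.getDocumentElement(), true);
        return parent.appendChild(copy);
    }

    public static void main(String[] args) {
        Document target = emptyDocument();
        Element root = target.createElement("fake");
        target.appendChild(root);
        Document other = sample("value");

        System.out.println("append rejected: " + appendingDocumentFails(target, other));
        importRoot(target, root, other);
        System.out.println("children after import: " + root.getChildNodes().getLength());
    }
}
```

Note that importNode leaves the source document untouched; the node appended to the target is a copy.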

You could try something like this:

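import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public org.w3c.dom.Document concatXmlDocuments(String rootElementName, InputStream... xmlInputStreams) throws ParserConfigurationException, SAXException, IOException {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    // Create the result document with a single new root element
    org.w3c.dom.Document result = builder.newDocument();
    org.w3c.dom.Element rootElement = result.createElement(rootElementName);
    result.appendChild(rootElement);
    for(InputStream is : xmlInputStreams) {
        // Parse each input and copy every child of its root element into the result
        org.w3c.dom.Document document = builder.parse(is);
        org.w3c.dom.Element root = document.getDocumentElement();
        NodeList childNodes = root.getChildNodes();
        for(int i = 0; i < childNodes.getLength(); i++) {
            // importNode creates a deep copy that is owned by the result document
            Node importNode = result.importNode(childNodes.item(i), true);
            rootElement.appendChild(importNode);
        }
    }
    return result;
}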

The code above copies all nodes found under the root element of each document. Of course you can choose to selectively copy only the nodes you are interested in. The resulting document will reflect all the nodes from both documents.
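As a rough illustration of copying selectively, the sketch below keeps only child elements with a given tag name. Whitespace text nodes between elements, comments, and elements with other names are skipped. The class SelectiveConcat, the tagName parameter, and the concatStrings helper are hypothetical additions for this example, not part of the code above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical variant of concatXmlDocuments; the tagName filter is an illustrative addition.
final class SelectiveConcat {

    static Document concatOnly(String rootElementName, String tagName, InputStream... inputs)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document result = builder.newDocument();
        Element rootElement = result.createElement(rootElementName);
        result.appendChild(rootElement);
        for (InputStream is : inputs) {
            NodeList children = builder.parse(is).getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Keep only element nodes with the requested name.
                if (child.getNodeType() == Node.ELEMENT_NODE && tagName.equals(child.getNodeName())) {
                    rootElement.appendChild(result.importNode(child, true));
                }
            }
        }
        return result;
    }

    // Convenience wrapper for in-memory XML strings; rethrows checked exceptions unchecked.
    static Document concatStrings(String rootElementName, String tagName, String... xmls) {
        InputStream[] streams = new InputStream[xmls.length];
        for (int i = 0; i < xmls.length; i++) {
            streams[i] = new ByteArrayInputStream(xmls[i].getBytes(StandardCharsets.UTF_8));
        }
        try {
            return concatOnly(rootElementName, tagName, streams);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        Document result = concatStrings("headerTag", "tag1",
                "<headerTag>\n    <tag1>doc1 value</tag1>\n    <other>x</other>\n</headerTag>",
                "<headerTag>\n    <tag1>doc2 value</tag1>\n</headerTag>");
        System.out.println(result.getDocumentElement().getChildNodes().getLength());
    }
}
```

Filtering on the node type also avoids carrying the indentation whitespace of the source documents into the result.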

Test

@Test
public void concatXmlDocuments() throws ParserConfigurationException, SAXException, IOException, TransformerException {
    try (
            InputStream doc1 = new ByteArrayInputStream((
                "<headerTag>\n" + 
                "    <tag1>doc1 value</tag1>\n" + 
                "</headerTag>").getBytes(StandardCharsets.UTF_8));
            InputStream doc2 = new ByteArrayInputStream((
                "<headerTag>\n" + 
                "    <tag1>doc2 value</tag1>\n" + 
                "</headerTag>").getBytes(StandardCharsets.UTF_8));
            ByteArrayOutputStream docR = new ByteArrayOutputStream();

        ) {

        org.w3c.dom.Document result = concatXmlDocuments("headerTag", doc1, doc2);
        TransformerFactory trf = TransformerFactory.newInstance();
        Transformer tr = trf.newTransformer();
        tr.setOutputProperty(OutputKeys.INDENT, "yes");
        DOMSource source = new DOMSource(result);
        StreamResult sr = new StreamResult(docR);
        tr.transform(source, sr);
        System.out.print(new String(docR.toByteArray(), StandardCharsets.UTF_8));
    }
}

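Output

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<headerTag>
    <tag1>doc1 value</tag1>
    <tag1>doc2 value</tag1>
</headerTag>

EDIT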

You wrote: "I would like to create that 'fake' parent, concatenate the files, but then only recover the List of elements headerTag".

As you say, create a fake parent. Here is how you could do it:

1) Do the concatenation

public org.w3c.dom.Document concatXmlDocuments(InputStream... xmlInputStreams) throws ParserConfigurationException, SAXException, IOException {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    org.w3c.dom.Document result = builder.newDocument();
    org.w3c.dom.Element rootElement = result.createElement("fake");
    result.appendChild(rootElement);
    for(InputStream is : xmlInputStreams) {
        org.w3c.dom.Document document = builder.parse(is);
        org.w3c.dom.Element subRoot = document.getDocumentElement();
        Node importNode = result.importNode(subRoot, true);
        rootElement.appendChild(importNode);
    }
    return result;
}
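The question starts from Document objects rather than input streams, so the same fake-parent idea can also be sketched without any parsing step. In the example below, the class DocumentConcat, the method concatDocuments, and the parse helper (used only to build sample input) are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

// Hypothetical helper class; names are illustrative only.
final class DocumentConcat {

    // Creates a DocumentBuilder, rethrowing the checked exception to keep the demo short.
    static DocumentBuilder newBuilder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Copies the root element of every given document under a new fake root.
    static Document concatDocuments(String fakeRootName, Document... documents) {
        Document result = newBuilder().newDocument();
        Element fakeRoot = result.createElement(fakeRootName);
        result.appendChild(fakeRoot);
        for (Document document : documents) {
            Element subRoot = document.getDocumentElement();
            if (subRoot != null) { // a freshly created Document has no root element yet
                fakeRoot.appendChild(result.importNode(subRoot, true));
            }
        }
        return result;
    }

    // Sample-input helper: parses an XML string into a Document.
    static Document parse(String xml) {
        try {
            return newBuilder().parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        Document merged = concatDocuments("fake",
                parse("<headerTag><tag1>doc1 value</tag1></headerTag>"),
                parse("<headerTag><tag1>doc2 value</tag1></headerTag>"));
        System.out.println(merged.getDocumentElement().getChildNodes().getLength());
    }
}
```

Because importNode copies rather than moves, the input documents can still be used afterwards.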

2) Recover the node list for headerTag

public NodeList recoverTheListOfElementsHeaderTag(String xml) throws ParserConfigurationException, SAXException, IOException {
    NodeList listOfElementsHeaderTag = null;
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    try (InputStream is = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
        listOfElementsHeaderTag = recoverTheListOfElementsHeaderTag(builder.parse(is));
    }
    return listOfElementsHeaderTag;
}

public NodeList recoverTheListOfElementsHeaderTag(org.w3c.dom.Document doc) {
    org.w3c.dom.Element root = doc.getDocumentElement();
    return root.getChildNodes();
}
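getChildNodes() returns every child of the root, which includes whitespace text nodes if the XML was indented. If an actual java.util.List of headerTag elements is wanted, a filter along the following lines may help. HeaderTagList, childElements, and parse are hypothetical names used only for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper class; names are illustrative only.
final class HeaderTagList {

    // Collects the direct child elements of the root that carry the given tag name.
    static List<Element> childElements(Document doc, String tagName) {
        List<Element> elements = new ArrayList<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE && tagName.equals(child.getNodeName())) {
                elements.add((Element) child);
            }
        }
        return elements;
    }

    // Sample-input helper: parses an XML string, rethrowing checked exceptions unchecked.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<fake>\n"
                + "<headerTag><tag1>doc1 value</tag1></headerTag>\n"
                + "<headerTag><tag1>doc2 value</tag1></headerTag>\n"
                + "</fake>");
        for (Element header : childElements(doc, "headerTag")) {
            System.out.println(header.getTextContent());
        }
    }
}
```

Element.getElementsByTagName("headerTag") is a shorter alternative, but it matches descendants at any depth, not just direct children of the fake root.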

Test

@Test
public void concatXmlDocuments() throws ParserConfigurationException, SAXException, IOException, TransformerException {
    try (
            InputStream doc1 = new ByteArrayInputStream((
                "<headerTag>" + 
                "<tag1>doc1 value</tag1>" + 
                "</headerTag>").getBytes(StandardCharsets.UTF_8));
            InputStream doc2 = new ByteArrayInputStream((
                "<headerTag>" + 
                "<tag1>doc2 value</tag1>" + 
                "</headerTag>").getBytes(StandardCharsets.UTF_8));

        ) {

        org.w3c.dom.Document result = concatXmlDocuments(doc1, doc2);
        String resultXML = toXML(result);
        System.out.printf("%s%n", resultXML);
        NodeList listOfElementsHeaderTag = null;
        System.out.printf("===================================================%n");
        listOfElementsHeaderTag = recoverTheListOfElementsHeaderTag(resultXML);
        printNodeList(listOfElementsHeaderTag);
        System.out.printf("===================================================%n");
        listOfElementsHeaderTag = recoverTheListOfElementsHeaderTag(result);
        printNodeList(listOfElementsHeaderTag);
    }
}


private String toXML(org.w3c.dom.Document result) throws TransformerFactoryConfigurationError, TransformerConfigurationException, TransformerException, IOException {
    String resultXML = null;
    try (ByteArrayOutputStream docR = new ByteArrayOutputStream()) {
        TransformerFactory trf = TransformerFactory.newInstance();
        Transformer tr = trf.newTransformer();
        DOMSource source = new DOMSource(result);
        StreamResult sr = new StreamResult(docR);
        tr.transform(source, sr);
        resultXML = new String(docR.toByteArray(), StandardCharsets.UTF_8);
    }
    return resultXML;
}

private void printNodeList(NodeList nodeList) {
    for(int i = 0; i < nodeList.getLength(); i++) {
        printNode(nodeList.item(i), "");
    }
}

private void printNode(Node node, String startIndent) {
    if(node != null) {
        System.out.printf("%s%s%n", startIndent, node.toString());
        NodeList childNodes = node.getChildNodes();
        for(int i = 0; i < childNodes.getLength(); i++) {
            printNode(childNodes.item(i), startIndent+ "    ");
        }
    }
}

Output

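<?xml version="1.0" encoding="UTF-8" standalone="no"?><fake><headerTag><tag1>doc1 value</tag1></headerTag><headerTag><tag1>doc2 value</tag1></headerTag></fake>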
===================================================
[headerTag: null]
    [tag1: null]
        [#text: doc1 value]
[headerTag: null]
    [tag1: null]
        [#text: doc2 value]
===================================================
[headerTag: null]
    [tag1: null]
        [#text: doc1 value]
[headerTag: null]
    [tag1: null]
        [#text: doc2 value]
