David Portabella

Reputation: 12720

Java XML library that preserves attribute order

I am writing a Java program that reads an XML file, makes some modifications, and writes back the XML.

Using the standard Java XML DOM API, the order of the attributes is not preserved.

That is, if I have an input file such as:

<person first_name="john" last_name="lederrey"/>

I might get an output file as:

<person last_name="lederrey" first_name="john"/>

That's correct, because the XML specification says that attribute order is not significant.

However, my program needs to preserve the order of the attributes, so that a person can easily compare the input and output document with a diff tool.

One solution for that is to process the document with SAX (instead of DOM): Order of XML attributes after DOM processing

However, this does not work in my case, because the transformation I need to apply to one node might depend on an XPath expression evaluated over the whole document.

So, the simplest thing would be an XML library very similar to the standard Java DOM library, except that it preserves the attribute order.

Is there such a library?

PS: Please avoid discussing whether I should preserve the attribute order or not. This is a very interesting discussion, but it is not the point of this question.

Upvotes: 11

Views: 9697

Answers (7)

Michael Kay

Reputation: 163458

Saxon these days offers a serialization option[1] to control the order in which attributes are output. It doesn't retain the input order (because Saxon doesn't know the input order), but it does allow you to control, for example, that the ID attribute always appears first.

And this can be very useful if the XML is going to be hand-edited; XML in which the attributes appear in the "wrong" order can be very disorienting to a human reader or editor.

If you're using this as part of a diff process then you would want to put both files through a process that normalizes the attribute order before comparing them. However, for comparing files my preferred approach is to parse them both and use the XPath deep-equal() function; or to use a specialized tool like DeltaXML.
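For readers who only have the JDK at hand, an order-insensitive comparison can be sketched with the DOM method Node.isEqualNode, which compares attributes by name and value rather than by position. This is a stand-in for the XPath deep-equal() function mentioned above, not the same function, and the class and method names here are made up for illustration:

```java
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.StringReader;

// Hypothetical helper: compares two XML strings while ignoring attribute order.
class AttributeOrderInsensitiveCompare {

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // True when both root elements have the same names, attributes and children.
    // Attributes are matched by name, so their order in the source text is irrelevant.
    static boolean sameContent(String a, String b) throws Exception {
        return parse(a).getDocumentElement().isEqualNode(parse(b).getDocumentElement());
    }

    public static void main(String[] args) throws Exception {
        String in = "<person first_name=\"john\" last_name=\"lederrey\"/>";
        String out = "<person last_name=\"lederrey\" first_name=\"john\"/>";
        System.out.println(sameContent(in, out));
    }
}
```

Note that whitespace-only text nodes still count as differences, so two files that differ in indentation would not compare equal with this sketch.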

[1] saxon:attribute-order - see http://www.saxonica.com/documentation/index.html#!extensions/output-extras/serialization-parameters

Upvotes: 4

Haroldo_OK

Reputation: 7260

You might also want to try DecentXML, as it can preserve the attribute order, comments and even indentation.

It is very nice if you need to programmatically update an XML file that's also supposed to be human-editable. We use it for one of our configuration tools.

-- edit --

It seems it is no longer available on its original location; try these ones:

Upvotes: 2

IvanNik

Reputation: 2027

You may extend AttributeMap (as the AttributeSortedMap class below does) and sort the attributes as you need.

The main idea: load the document, recursively copy it into elements whose attribute map is sorted, and serialize the copy using the existing XMLSerializer.

File test.xml

<root>
    <person first_name="john1" last_name="lederrey1"/>
    <person first_name="john2" last_name="lederrey2"/>
    <person first_name="john3" last_name="lederrey3"/>
    <person first_name="john4" last_name="lederrey4"/>
</root>

File AttOrderSorter.java

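// Note: the com.sun.org.apache.* classes below are JDK-internal. On Java 9+ the java.xml
// module does not export them, so compiling generally requires --add-exports flags.
import com.sun.org.apache.xerces.internal.dom.AttrImpl;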
import com.sun.org.apache.xerces.internal.dom.AttributeMap;
import com.sun.org.apache.xerces.internal.dom.CoreDocumentImpl;
import com.sun.org.apache.xerces.internal.dom.ElementImpl;
import com.sun.org.apache.xml.internal.serialize.OutputFormat;
import com.sun.org.apache.xml.internal.serialize.XMLSerializer;
import org.w3c.dom.*;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.Writer;
import java.util.List;

import static java.util.Arrays.asList;

public class AttOrderSorter {

    private List<String> sortAtts = asList("last_name", "first_name");

    public void format(String inFile, String outFile) throws Exception {
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = dbFactory.newDocumentBuilder();
        Document outDocument = builder.newDocument();
        try (FileInputStream inputStream = new FileInputStream(inFile)) {
            Document document = dbFactory.newDocumentBuilder().parse(inputStream);
            Element sourceRoot = document.getDocumentElement();
            Element outRoot = outDocument.createElementNS(sourceRoot.getNamespaceURI(), sourceRoot.getTagName());
            outDocument.appendChild(outRoot);

            copyAtts(sourceRoot.getAttributes(), outRoot);
            copyElement(sourceRoot.getChildNodes(), outRoot, outDocument);
        }

        try (Writer outxml = new FileWriter(new File(outFile))) {

            OutputFormat format = new OutputFormat();
            format.setLineWidth(0);
            format.setIndenting(false);
            format.setIndent(2);

            XMLSerializer serializer = new XMLSerializer(outxml, format);
            serializer.serialize(outDocument);
        }
    }

    private void copyElement(NodeList nodes, Element parent, Document document) {
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                Element element = new ElementImpl((CoreDocumentImpl) document, node.getNodeName()) {
                    @Override
                    public NamedNodeMap getAttributes() {
                        return new AttributeSortedMap(this, (AttributeMap) super.getAttributes());
                    }
                };
                copyAtts(node.getAttributes(), element);
                copyElement(node.getChildNodes(), element, document);

                parent.appendChild(element);
            }
        }
    }

    private void copyAtts(NamedNodeMap attributes, Element target) {
        for (int i = 0; i < attributes.getLength(); i++) {
            Node att = attributes.item(i);
            target.setAttribute(att.getNodeName(), att.getNodeValue());
        }
    }

    public class AttributeSortedMap extends AttributeMap {
        AttributeSortedMap(ElementImpl element, AttributeMap attributes) {
            super(element, attributes);
            nodes.sort((o1, o2) -> {
                AttrImpl att1 = (AttrImpl) o1;
                AttrImpl att2 = (AttrImpl) o2;

                Integer pos1 = sortAtts.indexOf(att1.getNodeName());
                Integer pos2 = sortAtts.indexOf(att2.getNodeName());
                if (pos1 > -1 && pos2 > -1) {
                    return pos1.compareTo(pos2);
                } else if (pos1 > -1 || pos2 > -1) {
                    return pos1 == -1 ? 1 : -1;
                }
                return att1.getNodeName().compareTo(att2.getNodeName());
            });
        }
    }

    public static void main(String[] args) throws Exception {
        new AttOrderSorter().format("src/main/resources/test.xml", "src/main/resources/output.xml");
    }
}

Result - file output.xml

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <person last_name="lederrey1" first_name="john1"/>
  <person last_name="lederrey2" first_name="john2"/>
  <person last_name="lederrey3" first_name="john3"/>
  <person last_name="lederrey4" first_name="john4"/>
</root>
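
The ordering rule used inside AttributeSortedMap can also be looked at in isolation. The following is a minimal sketch of the same rule applied to plain attribute names: names in a preferred list come first, in list order, and the rest follow alphabetically. The class and method names are invented for this illustration and are not part of any library:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper mirroring the comparator in AttributeSortedMap.
class PreferredAttributeOrder {

    // Preferred names sort first (in the order given); all others sort alphabetically after them.
    static Comparator<String> byPreferred(List<String> preferred) {
        return (a, b) -> {
            int posA = preferred.indexOf(a);
            int posB = preferred.indexOf(b);
            if (posA >= 0 && posB >= 0) {
                return Integer.compare(posA, posB);
            }
            if (posA >= 0) {
                return -1;
            }
            if (posB >= 0) {
                return 1;
            }
            return a.compareTo(b);
        };
    }

    static List<String> sort(List<String> names, List<String> preferred) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(byPreferred(preferred));
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("id", "first_name", "age", "last_name");
        // prints [last_name, first_name, age, id]
        System.out.println(sort(names, Arrays.asList("last_name", "first_name")));
    }
}
```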

Upvotes: 0

Erikson

Reputation: 579

We had similar requirements per Dave's description. A solution that worked was based on Java reflection.

The idea is to set the propOrder for the attributes at runtime. In our case there's an APP_DATA element containing three attributes: app, name, and value. The generated AppData class lists only "content" in propOrder and none of the attributes:

@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "AppData", propOrder = {
    "content"
})
public class AppData {

    @XmlValue
    protected String content;
    @XmlAttribute(name = "Value", required = true)
    protected String value;
    @XmlAttribute(name = "Name", required = true)
    protected String name;
    @XmlAttribute(name = "App", required = true)
    protected String app;
    ...
}

So Java reflection was used as follows to set the order at runtime:

final String[] propOrder = { "app", "name", "value" };
ReflectionUtil.changeAnnotationValue(
        AppData.class.getAnnotation(XmlType.class),
        "propOrder", propOrder);

final JAXBContext jaxbContext = JAXBContext
        .newInstance(ADI.class);
final Marshaller adimarshaller = jaxbContext.createMarshaller();
adimarshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT,
        true);

adimarshaller.marshal(new JAXBElement<ADI>(new QName("ADI"),
                                           ADI.class, adi),
                      new StreamResult(fileOutputStream));

The changeAnnotationValue() was borrowed from this post: Modify a class definition's annotation string parameter at runtime

Here's the method for your convenience (credit goes to @assylias and @Balder):

/**
 * Changes the annotation value for the given key of the given annotation to newValue and returns
 * the previous value.
 */
@SuppressWarnings("unchecked")
public static Object changeAnnotationValue(Annotation annotation, String key, Object newValue) {
    Object handler = Proxy.getInvocationHandler(annotation);
    Field f;
    try {
        f = handler.getClass().getDeclaredField("memberValues");
    } catch (NoSuchFieldException | SecurityException e) {
        throw new IllegalStateException(e);
    }
    f.setAccessible(true);
    Map<String, Object> memberValues;
    try {
        memberValues = (Map<String, Object>) f.get(handler);
    } catch (IllegalArgumentException | IllegalAccessException e) {
        throw new IllegalStateException(e);
    }
    Object oldValue = memberValues.get(key);
    if (oldValue == null || oldValue.getClass() != newValue.getClass()) {
        throw new IllegalArgumentException();
    }
    memberValues.put(key, newValue);
    return oldValue;
}

Upvotes: 0

Mike Thomsen

Reputation: 37506

Your best bet would be to use StAX instead of DOM for generating the original document. StAX gives you a lot of fine control over these things and lets you stream output progressively to an output stream instead of holding it all in memory.
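
As a rough illustration of the StAX approach, the sketch below copies a document with the cursor API (XMLStreamReader and XMLStreamWriter), writing each attribute in the index order the reader reports. The StAX specification does not promise that this index order matches the source text, although common implementations behave that way. The sketch assumes a document without namespaces, and it drops comments and processing instructions; the class name is made up:

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;
import java.io.StringReader;
import java.io.StringWriter;

// Hypothetical example: stream-copies XML, emitting attributes in reader index order.
class StaxAttributeOrderCopy {

    static String copy(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    writer.writeStartElement(reader.getLocalName());
                    // Attributes are written in the same order the reader exposes them.
                    for (int i = 0; i < reader.getAttributeCount(); i++) {
                        writer.writeAttribute(reader.getAttributeLocalName(i),
                                reader.getAttributeValue(i));
                    }
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    writer.writeEndElement();
                    break;
                case XMLStreamConstants.CHARACTERS:
                    writer.writeCharacters(reader.getText());
                    break;
                default:
                    // Comments, processing instructions, etc. are skipped in this sketch.
                    break;
            }
        }
        writer.close();
        reader.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(copy("<person first_name=\"john\" last_name=\"lederrey\"/>"));
    }
}
```

A modification step would go inside the START_ELEMENT branch, changing a value before writeAttribute is called.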

Upvotes: 0

Bob Dalgleish

Reputation: 8227

Do it twice:

Read the document in using a DOM parser so you have references, a repository, if you will.

Then read it again using SAX. At the point where you need to make the transformation, reference the DOM version to determine what you need, then output what you need in the middle of the SAX stream.
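
The two passes described above could look roughly like the sketch below. The first pass builds a DOM and evaluates an XPath expression over the whole document; the second pass is a SAX handler that writes the elements back out by hand, keeping attributes in the order SAX reports them (in practice document order, though SAX does not guarantee it). The rule applied here, upper-casing last_name when the document holds more than one person, is an invented example, as are the class and method names:

```java
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import java.io.StringReader;
import java.util.Locale;

// Hypothetical example of the "DOM for lookups, SAX for output" idea.
class TwoPassRewrite {

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    static String rewrite(String xml) throws Exception {
        // Pass 1: DOM, used only to answer a whole-document XPath query.
        Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Double persons = (Double) XPathFactory.newInstance().newXPath()
                .evaluate("count(//person)", dom, XPathConstants.NUMBER);
        boolean upperCase = persons > 1; // invented rule for this illustration

        // Pass 2: SAX, writing attributes in the order they are reported.
        StringBuilder out = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                out.append('<').append(qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    String value = atts.getValue(i);
                    if (upperCase && "last_name".equals(atts.getQName(i))) {
                        value = value.toUpperCase(Locale.ROOT);
                    }
                    out.append(' ').append(atts.getQName(i))
                       .append("=\"").append(escape(value)).append('"');
                }
                out.append('>');
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                out.append("</").append(qName).append('>');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                out.append(escape(new String(ch, start, length)));
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rewrite("<root><person first_name=\"a\" last_name=\"b\"/>"
                + "<person first_name=\"c\" last_name=\"d\"/></root>"));
    }
}
```

This sketch ignores comments, the XML declaration and namespaces, which a real rewrite would also need to pass through.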

Upvotes: 2

fla

Reputation: 143

You can't use the DOM, but you can use SAX, or query children using XPath.

See the answer to Order of XML attributes after DOM processing.

Upvotes: -1
