Justin Garrick
Justin Garrick

Reputation: 14947

Dealing with "Xerces hell" in Java/Maven?

In my office, the mere mention of the word Xerces is enough to incite murderous rage from developers. A cursory glance at the other Xerces questions on SO seem to indicate that almost all Maven users are "touched" by this problem at some point. Unfortunately, understanding the problem requires a bit of knowledge about the history of Xerces...

History

Problems

Conflict Resolution

For some -- or perhaps all -- of the reasons above, many organizations publish and consume custom builds of Xerces in their POMs. This is not really a problem if you have a small application and are only using Maven Central, but it quickly becomes an issue for enterprise software where Artifactory or Nexus is proxying multiple repositories (JBoss, Hibernate, etc.):

xml-apis proxied by Artifactory

For example, organization A might publish xml-apis as:

<groupId>org.apache.xerces</groupId>
<artifactId>xml-apis</artifactId>
<version>2.9.1</version>

Meanwhile, organization B might publish the same jar as:

<groupId>xml-apis</groupId>
<artifactId>xml-apis</artifactId>
<version>1.3.04</version>

Although B's jar is a lower version than A's jar, Maven does not know that they are the same artifact because they have different groupIds. Thus, it cannot perform conflict resolution and both jars will be included as resolved dependencies:

resolved dependencies with multiple xml-apis

Classloader Hell

As mentioned above, the JRE ships with Xerces in the JAXP RI. While it would be nice to mark all Xerces Maven dependencies as <exclusion>s or as <provided>, the third-party code you depend on may or may not work with the version provided in JAXP of the JDK you're using. In addition, you have the Xerces jars shipped in your servlet container to contend with. This leaves you with a number of choices: Do you delete the servlet version and hope that your container runs on the JAXP version? Is it better to leave the servlet version, and hope that your application frameworks run on the servlet version? If one or two of the unresolved conflicts outlined above manage to slip into your product (easy to happen in a large organization), you quickly find yourself in classloader hell, wondering which version of Xerces the classloader is picking at runtime and whether or not it will pick the same jar in Windows and Linux (probably not).

Solutions?

We've tried marking all Xerces Maven dependencies as <provided> or as an <exclusion>, but this is difficult to enforce (especially with a large team) given that the artifacts have so many aliases (xml-apis, xerces, xercesImpl, xmlParserAPIs, etc.). Additionally, our third party libs/frameworks may not run on the JAXP version or the version provided by a servlet container.

How can we best address this problem with Maven? Do we have to exercise such fine-grained control over our dependencies, and then rely on tiered classloading? Is there some way to globally exclude all Xerces dependencies, and force all of our frameworks/libs to use the JAXP version?


UPDATE: Joshua Spiewak has uploaded a patched version of the Xerces build scripts to XERCESJ-1454 that allows for upload to Maven Central. Vote/watch/contribute to this issue and let's fix this problem once and for all.

Upvotes: 821

Views: 166304

Answers (11)

Eduardo
Eduardo

Reputation: 2280

My friend that's very simple, here an example:

<dependency>
    <groupId>xalan</groupId>
    <artifactId>xalan</artifactId>
    <version>2.7.2</version>
    <scope>${my-scope}</scope>
    <exclusions>
        <exclusion>
        <groupId>xml-apis</groupId>
        <artifactId>xml-apis</artifactId>
    </exclusion>
</dependency>

And if you want to check in the terminal(windows console for this example) that your maven tree has no problems:

mvn dependency:tree -Dverbose | grep --color=always '(.* conflict\|^' | less -r

Upvotes: 2

Grzegorz Grzybek
Grzegorz Grzybek

Reputation: 6237

There are 2.11.0 JARs (and source JARs!) of Xerces in Maven Central since 20th February 2013! See Xerces in Maven Central. I wonder why they haven't resolved https://issues.apache.org/jira/browse/XERCESJ-1454...

I've used:

<dependency>
    <groupId>xerces</groupId>
    <artifactId>xercesImpl</artifactId>
    <version>2.11.0</version>
</dependency>

and all dependencies have resolved fine - even proper xml-apis-1.4.01!

And what's most important (and what wasn't obvious in the past) - the JAR in Maven Central is the same JAR as in the official Xerces-J-bin.2.11.0.zip distribution.

I couldn't however find xml-schema-1.1-beta version - it can't be a Maven classifier-ed version because of additional dependencies.

Upvotes: 125

jtahlborn
jtahlborn

Reputation: 53589

Frankly, pretty much everything that we've encountered works just fine w/ the JAXP version, so we always exclude xml-apis and xercesImpl.

Upvotes: 68

Derek Bennett
Derek Bennett

Reputation: 967

You should debug first, to help identify your level of XML hell. In my opinion, the first step is to add

-Djavax.xml.parsers.SAXParserFactory=com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl
-Djavax.xml.transform.TransformerFactory=com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl
-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl

to the command line. If that works, then start excluding libraries. If not, then add

-Djaxp.debug=1

to the command-line.

Upvotes: 11

thrau
thrau

Reputation: 3100

Apparently xerces:xml-apis:1.4.01 is no longer in maven central, which is however what xerces:xercesImpl:2.11.0 references.

This works for me:

<dependency>
  <groupId>xerces</groupId>
  <artifactId>xercesImpl</artifactId>
  <version>2.11.0</version>
  <exclusions>
    <exclusion>
      <groupId>xerces</groupId>
      <artifactId>xml-apis</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<dependency>
  <groupId>xml-apis</groupId>
  <artifactId>xml-apis</artifactId>
  <version>1.4.01</version>
</dependency>

Upvotes: 2

teknopaul
teknopaul

Reputation: 6762

Every maven project should stop depending on xerces, they probably don't really. XML APIs and an Impl has been part of Java since 1.4. There is no need to depend on xerces or XML APIs, its like saying you depend on Java or Swing. This is implicit.

If I was boss of a maven repo I'd write a script to recursively remove xerces dependencies and write a read me that says this repo requires Java 1.4.

Anything that actually breaks because it references Xerces directly via org.apache imports needs a code fix to bring it up to Java 1.4 level (and has done since 2002) or solution at JVM level via endorsed libs, not in maven.

Upvotes: 6

Daniel
Daniel

Reputation: 4211

There is another option that hasn't been explored here: declaring Xerces dependencies in Maven as optional:

<dependency>
   <groupId>xerces</groupId>
   <artifactId>xercesImpl</artifactId>
   <version>...</version>
   <optional>true</optional>
</dependency>

Basically what this does is to force all dependents to declare their version of Xerces or their project won't compile. If they want to override this dependency, they are welcome to do so, but then they will own the potential problem.

This creates a strong incentive for downstream projects to:

  • Make an active decision. Do they go with the same version of Xerces or use something else?
  • Actually test their parsing (e.g. through unit testing) and classloading as well as not to clutter up their classpath.

Not all developers keep track of newly introduced dependencies (e.g. with mvn dependency:tree). This approach will immediately bring the matter to their attention.

It works quite well at our organization. Before its introduction, we used to live in the same hell the OP is describing.

Upvotes: 8

netmikey
netmikey

Reputation: 2550

I know this doesn't answer the question exactly, but for ppl coming in from google that happen to use Gradle for their dependency management:

I managed to get rid of all xerces/Java8 issues with Gradle like this:

configurations {
    all*.exclude group: 'xml-apis'
    all*.exclude group: 'xerces'
}

Upvotes: 43

Ondra Žižka
Ondra Žižka

Reputation: 46796

What would help, except for excluding, is modular dependencies.

With one flat classloading (standalone app), or semi-hierarchical (JBoss AS/EAP 5.x) this was a problem.

But with modular frameworks like OSGi and JBoss Modules, this is not so much pain anymore. The libraries may use whichever library they want, independently.

Of course, it's still most recommendable to stick with just a single implementation and version, but if there's no other way (using extra features from more libs), then modularizing might save you.

A good example of JBoss Modules in action is, naturally, JBoss AS 7 / EAP 6 / WildFly 8, for which it was primarily developed.

Example module definition:

<?xml version="1.0" encoding="UTF-8"?>
<module xmlns="urn:jboss:module:1.1" name="org.jboss.msc">
    <main-class name="org.jboss.msc.Version"/>
    <properties>
        <property name="my.property" value="foo"/>
    </properties>
    <resources>
        <resource-root path="jboss-msc-1.0.1.GA.jar"/>
    </resources>
    <dependencies>
        <module name="javax.api"/>
        <module name="org.jboss.logging"/>
        <module name="org.jboss.modules"/>
        <!-- Optional deps -->
        <module name="javax.inject.api" optional="true"/>
        <module name="org.jboss.threads" optional="true"/>
    </dependencies>
</module>

In comparison with OSGi, JBoss Modules is simpler and faster. While missing certain features, it's sufficient for most projects which are (mostly) under control of one vendor, and allow stunning fast boot (due to paralelized dependencies resolving).

Note that there's a modularization effort underway for Java 8, but AFAIK that's primarily to modularize the JRE itself, not sure whether it will be applicable to apps.

Upvotes: 2

Jens Schauder
Jens Schauder

Reputation: 81852

I guess there is one question you need to answer:

Does there exist a xerces*.jar that everything in your application can live with?

If not you are basically screwed and would have to use something like OSGI, which allows you to have different versions of a library loaded at the same time. Be warned that it basically replaces jar version issues with classloader issues ...

If there exists such a version you could make your repository return that version for all kinds of dependencies. It's an ugly hack and would end up with the same xerces implementation in your classpath multiple times but better than having multiple different versions of xerces.

You could exclude every dependency to xerces and add one to the version you want to use.

I wonder if you can write some kind of version resolution strategy as a plugin for maven. This would probably the nicest solution but if at all feasible needs some research and coding.

For the version contained in your runtime environment, you'll have to make sure it either gets removed from the application classpath or the application jars get considered first for classloading before the lib folder of the server get considered.

So to wrap it up: It's a mess and that won't change.

Upvotes: 17

Travis Schneeberger
Travis Schneeberger

Reputation: 2080

You could use the maven enforcer plugin with the banned dependency rule. This would allow you to ban all the aliases that you don't want and allow only the one you do want. These rules will fail the maven build of your project when violated. Furthermore, if this rule applies to all projects in an enterprise you could put the plugin configuration in a corporate parent pom.

see:

Upvotes: 47

Related Questions