jiman

Reputation: 189

Java InputStream / XSL Thread Safety

I have some code that was originally written for single-threaded processing and is now being used by multiple threads. I encountered strange XSL parsing errors on a stylesheet that has not been modified and usually works fine.

I'm not looking for a code fix; I'll be refactoring this class for thread safety through my own research. Tips on this are welcome but not necessary for an answer.
What I'm looking for is an explanation of what probably (or possibly) happened to produce the strange errors shown below.

E.g. Thread A entered line of code X, stopped at line Y, thread B entered line X, read stream S, stopped. Thread A now performs Z to stream S and corrupted it... etc. My vague hunch is that the object 'xsl' somehow got corrupted when 2 threads manipulated/read it at once, but I lack knowledge of how InputStream objects and Source objects work inside the api.

XSL:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  ... 
</xsl:stylesheet>

Style sheet should be fine. Let me know if you think otherwise.

Error:

[Fatal Error] :1:33: The encoding declaration is required in the text declaration.
[Fatal Error] :1:2: Content is not allowed in prolog.
Couldn't create transformer object for xsl javax.xml.transform.TransformerConfigurationException: Could not compile stylesheet

These errors happened many times and persisted until an app restart. They clearly point to a bad XSL file, but I verified that the file is not corrupt.

Java code:

public class IntegrationRequest {

  private static Source xsl; // data that i believe got corrupted
  private static Transformer inputTransformer;
  private static JAXBContext createContext;

  ...

  // this public method is now executed by multiple threads at once
  public String getResponse()  {
    createTransformerInstance(); // err thrown in here
    String xmlRequest = getXmlRequestString();
    String response = executeRequest(createHttpConnection(), xmlRequest);
    return response;
  }

  private static Transformer createTransformerInstance() {
    if (inputTransformer == null) {
        TransformerFactory factory =  TransformerFactory.newInstance();
        try {
            inputTransformer = factory.newTransformer(getXSLTemplate("/Integration_Request.xsl"));
        } catch (TransformerConfigurationException e) {
            if (log.isErrorEnabled()) {
                log.error("Couldn't create transformer object for xsl ", e);
            }
        }
    }
    return inputTransformer;
  }

  private static Source getXSLTemplate(String fileName) {
    if (xsl == null) {
        InputStream xslInputStream = IntegrationRequest.class.getClassLoader().getResourceAsStream(fileName);
        xsl = new StreamSource(xslInputStream);
    }
    return xsl;
  }

}

Note how getXSLTemplate caches xsl - the app will keep that bad data until restart. This is the basis of my theory that it got corrupted by multiple threads on first load.

Upvotes: 0

Views: 1329

Answers (3)

Michael Kay

Reputation: 163458

A StreamSource is consumed by use: in general, you can't use it more than once. (This is one of the design faults in JAXP: some kinds of Source are consumed by use, and others aren't, and the documentation doesn't make this clear.) So caching the StreamSource is not an appropriate thing to do. The thing you should cache is the JAXP Templates object, which represents a compiled stylesheet. Apart from anything else, there's a lot of work in creating a Templates object from a StreamSource, and you don't want to repeat this work. The Templates object is guaranteed thread-safe.
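As a minimal sketch of that advice (not the asker's actual class): the stylesheet is compiled once into a Templates object, and every call creates its own short-lived Transformer from it. The class name CachedStylesheet, the transform method, and the inline stylesheet are hypothetical and only there so the example runs without a resource file; in the real code the StreamSource would wrap the stream for /Integration_Request.xsl.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CachedStylesheet {

    // Hypothetical inline stylesheet, used only to keep the sketch self-contained.
    private static final String XSL =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/in\">"
        + "<out><xsl:value-of select=\".\"/></out>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Compiled exactly once, during class initialization. The StreamSource is
    // used a single time here and never cached.
    private static final Templates TEMPLATES = compile();

    private static Templates compile() {
        try {
            return TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(XSL)));
        } catch (TransformerConfigurationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Safe to call from many threads: the Templates object is shared,
    // but each call gets a Transformer of its own.
    public static String transform(String xml) throws TransformerException {
        Transformer transformer = TEMPLATES.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform("<in>hello</in>"));
    }
}
```

Initializing the Templates in a static final field relies on the JVM's class-initialization guarantees, so no explicit locking or null check is needed.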

Upvotes: 1

wero

Reputation: 33000

The errors show that parsing of the stylesheet has failed - the XML parser did not see valid XML when reading the stylesheet document. This has most likely happened because two or more threads started to create the stylesheet, all using the same InputStream in the xsl variable. A possible scenario is:

  • Thread 1 calls getResponse for the first time, xsl is created, thread 1 starts reading from xsl.
  • Thread 2 calls getResponse, sees that the transformer is still null, gets the same cached xsl, and starts reading from the same underlying InputStream, corrupting the parse process in thread 1.
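The underlying problem can be illustrated without any threads at all. The sketch below (hypothetical class name ConsumedSourceDemo, hypothetical inline stylesheet) compiles the same StreamSource twice: the first compile reads the stream to the end, so the second one has nothing left to parse. With two threads the reads interleave instead, so each parser sees only fragments of the document; the exact error text depends on the parser and on where the interleaving happens.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

public class ConsumedSourceDemo {

    // Hypothetical stylesheet, inlined so the demo needs no resource file.
    private static final String XSL =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"/\"><out/></xsl:template>"
        + "</xsl:stylesheet>";

    // Returns true if the source compiles, false if compilation fails.
    public static boolean compiles(Source source) {
        try {
            TransformerFactory.newInstance().newTemplates(source);
            return true;
        } catch (TransformerConfigurationException e) {
            return false;
        }
    }

    // Compiles one shared StreamSource twice and reports both outcomes.
    public static boolean[] compileTwice() {
        InputStream in =
            new ByteArrayInputStream(XSL.getBytes(StandardCharsets.UTF_8));
        Source shared = new StreamSource(in); // same mistake as the cached xsl field
        boolean first = compiles(shared);     // reads the stream to the end
        boolean second = compiles(shared);    // nothing left to read
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        boolean[] result = compileTwice();
        System.out.println("first use compiles:  " + result[0]);
        System.out.println("second use compiles: " + result[1]);
    }
}
```

This may also explain why the failure persisted until restart: once the first compile fails, inputTransformer stays null, so every later call retries with the same cached, already-consumed source.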

Note that Transformer is also not a thread-safe class. You should use a javax.xml.transform.Templates instead and create a new Transformer from the Templates for each transformation.

Upvotes: 1

Jim Garrison

Reputation: 86774

You are not going to get a detailed explanation of exactly what happened.

You have static variables that are being reused when multiple threads use the class, and those threads will step all over each other. This must be refactored so that each thread can have its own instance of your class, with all variables being instance variables. The details of how to accomplish the best refactoring depend on information you haven't provided.

This falls under the umbrella of "undefined behavior", which includes everything from continuing to work correctly, through exceptions, corrupted data (likely), the JVM crashing (unlikely), or the CPU throwing off a shower of sparks like in the movies (dramatic, but really unlikely).

Upvotes: 1
