Reputation: 30222
I'm having a hard time deciding between the following two implementations. I want to cache the javax.xml.parsers.DocumentBuilder object per thread. My main concern is runtime performance, hence I would like to avoid as much GC as possible. Memory is not an issue.
I've written two POC implementations and would be happy to hear the community's pros and cons for each one.
Thanks for the help guys.
import java.io.IOException;
import java.io.StringReader;
import java.util.WeakHashMap;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
public class DocumentBuilder_WeakHashMap {
    private static final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    private static final WeakHashMap<Thread, DocumentBuilder> CACHE = new WeakHashMap<Thread, DocumentBuilder>();
    public static Document documentFromXMLString(String xml) throws SAXException, IOException, ParserConfigurationException {
        DocumentBuilder builder = CACHE.get(Thread.currentThread());
        if(builder == null) {
            builder = factory.newDocumentBuilder();
            CACHE.put(Thread.currentThread(), builder);
        }
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
import java.io.IOException;
import java.io.StringReader;
import java.lang.ref.WeakReference;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
public class DocumentBuilder_ThreadLocal {
    private static final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    private static final ThreadLocal<WeakReference<DocumentBuilder>> CACHE =
        new ThreadLocal<WeakReference<DocumentBuilder>>() {
            @Override
            protected WeakReference<DocumentBuilder> initialValue() {
                try {
                    return new WeakReference<DocumentBuilder>(factory.newDocumentBuilder());
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }
        };
    public static Document documentFromXMLString(String xml) throws ParserConfigurationException, SAXException, IOException {
        WeakReference<DocumentBuilder> builderWeakReference = CACHE.get();
        DocumentBuilder builder = builderWeakReference.get();
        if(builder == null) {
            builder = factory.newDocumentBuilder();
            CACHE.set(new WeakReference<DocumentBuilder>(builder));
        }
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
They both do the same thing (expose documentFromXMLString() to the outside world) so which one would you use?
Thank you, Maxim.
Upvotes: 3
Views: 3437
Reputation: 10276
BEWARE!
ThreadLocal will retain an indefinite reference to the DocumentBuilder, which contains a reference to the latest XML document parsed by that thread's DocumentBuilder.
This has a couple of consequences, which might be considered memory leaks:
- If the JAXP implementation is bundled with a web application (for example, xmlparser2.jar), this retained reference to the DocumentBuilder will cause all classes of your web application to leak upon undeploy, eventually leading to an OutOfMemoryError: PermGen space! (Google around for more info on this topic.)
- If the last XML document parsed by the DocumentBuilder is large, it will keep taking up memory until a new XML document is parsed on that thread. If you have long-running threads in a thread pool (such as in a J2EE container), this might be an issue, especially if a lot of large documents need to be parsed. Yes, eventually the memory will be released, but you might run out of usable memory before that happens, and the GC will not be able to clean up the XML document while a reference to the DocumentBuilder exists.
Decide if this is relevant to you or not...
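One way to limit the retention described above is to drop the per-thread builder explicitly once a unit of work is finished. The following is only a sketch under assumptions: the class ReleasableDocumentBuilderCache and its methods currentBuilder() and release() are hypothetical names, not part of the question's code, and ThreadLocal.remove() only clears the value for the thread that calls it, so it does not help with values held by other pooled threads.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical helper; class and method names are illustrative only.
final class ReleasableDocumentBuilderCache {
    private static final DocumentBuilderFactory FACTORY = DocumentBuilderFactory.newInstance();
    private static final ThreadLocal<DocumentBuilder> CACHE = new ThreadLocal<DocumentBuilder>();

    private ReleasableDocumentBuilderCache() {
    }

    // Returns the calling thread's builder, creating it on first use.
    static DocumentBuilder currentBuilder() throws ParserConfigurationException {
        DocumentBuilder builder = CACHE.get();
        if (builder == null) {
            // Defensive: the factory is shared between threads, so creation is serialized.
            synchronized (FACTORY) {
                builder = FACTORY.newDocumentBuilder();
            }
            CACHE.set(builder);
        }
        return builder;
    }

    // Drops the calling thread's builder, and with it whatever the builder still references.
    static void release() {
        CACHE.remove();
    }
}
```

Calling release() after every parse would defeat the cache, so a hook such as the end of a request or batch is probably a better place for it.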
Upvotes: 4
Reputation: 6494
The WeakHashMap alone will fail, because it is not thread-safe:
"Like most collection classes, this class is not synchronized."
(third paragraph of the Javadoc)
Since synchronization takes time and Collections.synchronizedMap won't scale very well, you should stick with ThreadLocal.
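For comparison, this is roughly what the smallest thread-safe version of the WeakHashMap approach could look like. It is a sketch, not a recommendation: the class name SynchronizedWeakHashMapCache and the method builderForCurrentThread() are made up for illustration, and every lookup takes the map's single lock, which is the scaling concern raised above.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical variant of the question's first class; names are illustrative only.
final class SynchronizedWeakHashMapCache {
    private static final DocumentBuilderFactory FACTORY = DocumentBuilderFactory.newInstance();

    // The wrapper makes each get/put atomic; all threads contend on one lock.
    private static final Map<Thread, DocumentBuilder> CACHE =
            Collections.synchronizedMap(new WeakHashMap<Thread, DocumentBuilder>());

    private SynchronizedWeakHashMapCache() {
    }

    static DocumentBuilder builderForCurrentThread() throws ParserConfigurationException {
        Thread key = Thread.currentThread();
        DocumentBuilder builder = CACHE.get(key);
        if (builder == null) {
            // Only the current thread ever writes its own key, so this
            // check-then-put cannot race with another thread on the same entry.
            synchronized (FACTORY) {
                builder = FACTORY.newDocumentBuilder();
            }
            CACHE.put(key, builder);
        }
        return builder;
    }
}
```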
Upvotes: 3
Reputation: 12780
The ThreadLocal solution is better, as long as you don't use the WeakReference but instead use a ThreadLocal<DocumentBuilder> directly.
Access to the ThreadLocal value is faster because the thread directly references an array containing all ThreadLocal values and it has just to compute the index in this array to do the lookup.
Look at the ThreadLocal source to see why the index computation is fast (int index = hash & values.mask;).
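A minimal sketch of that suggestion, holding the builder strongly in the ThreadLocal. The class name DocumentBuilderStrongThreadLocal is invented here, and the synchronized block around the shared factory is a defensive assumption rather than something taken from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical variant of the question's second class without the WeakReference.
final class DocumentBuilderStrongThreadLocal {
    private static final DocumentBuilderFactory FACTORY = DocumentBuilderFactory.newInstance();

    // One builder per thread, created lazily on that thread's first get().
    private static final ThreadLocal<DocumentBuilder> CACHE = new ThreadLocal<DocumentBuilder>() {
        @Override
        protected DocumentBuilder initialValue() {
            try {
                // Defensive: serialize use of the factory shared between threads.
                synchronized (FACTORY) {
                    return FACTORY.newDocumentBuilder();
                }
            } catch (ParserConfigurationException e) {
                throw new IllegalStateException(e);
            }
        }
    };

    private DocumentBuilderStrongThreadLocal() {
    }

    static Document documentFromXMLString(String xml) throws SAXException, IOException {
        // No null check needed: the strong reference cannot be cleared by the GC.
        return CACHE.get().parse(new InputSource(new StringReader(xml)));
    }
}
```

Because the value can no longer be collected behind your back, the lookup path has no fallback branch and no per-call WeakReference allocation.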
Upvotes: 6