Reputation: 196
Please assist. What is meant by caching this line / new instance in Java? For example:
XPath xpath = XPathFactory.newInstance().newXPath();
I know I have to store it in some sort of memory... could someone show me an example?
Thanks.
Upvotes: 0
Views: 550
Reputation: 163496
There are three important objects to cache:
(a) the source document. If you're running many queries against the same document, you don't want to parse the XML file repeatedly for each one. Parse it once, and save the resulting tree. Most people seem to use the default DOM tree model, but there are alternatives like JDOM2 and XOM that are much more user-friendly.
(b) the XPath engine. Initialising the XPath engine is generally expensive. If you're going to be evaluating many XPath expressions, you only want to be doing this once.
(c) Individual XPath expressions. If you need to execute the same XPath expression repeatedly, remember that the initial compiling of the expression can take 100 times as long as each evaluation of the expression.
So you want to keep these objects in memory and reuse them if at all possible.
Caching is one way of achieving this. The term "caching" generally means that your application always makes a new request when it wants to create an object, but some intermediate layer recognises that it's a request for an object that's already in memory, so it doesn't need to be created again from scratch.
So you might have a cache of source documents, so that when your application calls Document doc = fetchDocument(filename), the implementation of fetchDocument keeps an in-memory hash table: if the document is already present it gets it from the hash table, otherwise it reads and parses the file from filestore. A more sophisticated cache will discard the documents that have been least recently used, to avoid the memory requirement constantly growing.
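As a rough sketch of such a document cache (not the only way to do it): the class name DocumentCache, the capacity parameter and the size() helper below are made up for illustration, and the least-recently-used eviction relies on the standard LinkedHashMap access-order mode.

```java
import java.io.File;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

/** Hypothetical LRU cache of parsed DOM documents, keyed by file name. */
class DocumentCache {
    private final Map<String, Document> cache;

    DocumentCache(final int capacity) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<String, Document>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Document> eldest) {
                // Called after each put: drop the least recently used entry when over capacity.
                return size() > capacity;
            }
        };
    }

    /** Returns the cached tree if present, otherwise parses the file and remembers the result. */
    synchronized Document fetchDocument(String filename)
            throws ParserConfigurationException, SAXException, IOException {
        Document doc = cache.get(filename);
        if (doc == null) {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new File(filename));
            cache.put(filename, doc);
        }
        return doc;
    }

    /** Number of documents currently held (assumed helper, handy for checking eviction). */
    synchronized int size() {
        return cache.size();
    }
}
```

The caller just writes Document doc = cache.fetchDocument(filename) every time and never needs to know whether the file was parsed or served from memory.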
For the XPath engine it will be a trivial single-entry cache: probably something like
private XPath xpath; // cached engine, created on first use
XPath getXPathEngine() {
    if (xpath == null) xpath = XPathFactory.newInstance().newXPath();
    return xpath;
}
As for caching of compiled XPath expressions: if you use the Saxon API, you don't have to implement your own caching, because it's done automatically behind the scenes. For example, if you do:
Processor proc = new Processor(false); // false = no licensed features required
XPathCompiler xpc = proc.newXPathCompiler();
xpc.setCaching(true);
XPathExecutable exp = xpc.compile("//x"); // s9api compile() returns an XPathExecutable
then you can reuse the compiled expression exp either by referencing it directly, or by making another call to compile the same expression, in which case it will be retrieved from the cache.
If you are using the same XPath expressions repeatedly, then use variables to parameterise them: use the expression //x[name=$param] and execute it repeatedly with different values for $param, rather than building expressions such as "//x[name='" + param + "']" where each expression has to be compiled anew. (Building expressions like this also exposes you to injection attacks.)
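With the standard javax.xml.xpath API (rather than Saxon), the same idea might look roughly like the sketch below. The class name ParameterisedQuery and the method countMatches are invented for the example. It assumes the resolver is consulted on every evaluation, which is how the JDK's built-in implementation behaves; the API documentation only promises that the resolver in effect at compile time is the one used, so the resolver is set before compiling.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical holder for one compiled, parameterised expression. Not thread-safe. */
class ParameterisedQuery {
    // Current variable bindings; the resolver below looks values up here.
    private final Map<QName, Object> variables = new HashMap<>();
    private final XPathExpression byName;

    ParameterisedQuery() throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Must be set before compile(): the resolver in effect at compile time is the one used.
        xpath.setXPathVariableResolver(variables::get);
        // Compiled once; only the value bound to $param changes between calls.
        byName = xpath.compile("//x[name=$param]");
    }

    /** Counts the x elements whose name child equals the given value. */
    int countMatches(Document doc, String name) throws XPathExpressionException {
        variables.put(new QName("param"), name);
        NodeList nodes = (NodeList) byName.evaluate(doc, XPathConstants.NODESET);
        return nodes.getLength();
    }
}
```

Because the value is passed as data and never spliced into the expression text, a value containing quotes is simply compared as a string.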
Upvotes: 0
Reputation: 12685
Caching means not letting the garbage collector discard an object after you use it, when you already know that you will need the same object again a bit later (the GC has no way of knowing that).
It really depends on how long the XPath object needs to live (that may be function scope, instance scope or class scope, or even a narrower scope like a for loop or an if block, but only you know that).
The below should help to understand:
If you do this:
public Object doSomething() {
    //code...
    XPath xpath = XPathFactory.newInstance().newXPath();
    //code...
}
... then once you're out of the function the object is no longer reachable, so the garbage collector will reclaim it shortly after. The next time you call the function, you will have to rebuild it from scratch.
If you instead do this:
public class YourClass {
    private final XPath xpath = XPathFactory.newInstance().newXPath();
    public Object doSomething() {
        //code...
        this.xpath.evaluate(...);
        //code...
    }
}
... then you're doing the job only once per instance created. If you create 10 instances of your class, you'll do it 10 times. If you create just one, you'll do it just once. And the garbage collector will keep each instance's XPath alive for as long as that instance is reachable.
But if it really doesn't depend on any per-instance state, then it can be static:
public class YourClass {
    private static final XPath XPATH = XPathFactory.newInstance().newXPath();
    public Object doSomething() {
        //code...
        XPATH.evaluate(...);
        //code...
    }
}
... in this last case, no matter how many instances of the class you build, you'll always have one and only one instance of XPath, and the garbage collector will let it live in peace as long as your class stays loaded.
(Small note: static fields are initialized when the class is initialized, which happens on its first use after the ClassLoader has loaded it. That ClassLoader usually loads many other classes too. The only case when the class becomes eligible for GC is when that class, its class loader and all the other classes of that class loader become unreachable. That is a very hard-to-reach state, meaning that usually, once a static field is initialized, you can be pretty sure it won't be collected until you shut down your application.)
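One caveat with a shared static instance: the javax.xml.xpath.XPath documentation states that an XPath object is not thread-safe and not reentrant. If doSomething() can be called from several threads, one possible variation is to cache one instance per thread. This is only a sketch, and the class name XPathHolder is made up for the example:

```java
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;

/** Hypothetical holder that caches one XPath engine per thread. */
class XPathHolder {
    // Each thread lazily creates its own XPath the first time it calls get().
    private static final ThreadLocal<XPath> XPATH =
            ThreadLocal.withInitial(() -> XPathFactory.newInstance().newXPath());

    /** Returns the calling thread's cached XPath instance. */
    static XPath get() {
        return XPATH.get();
    }
}
```

Within a single thread, repeated calls to XPathHolder.get() return the same object, so the construction cost is still paid only once per thread.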
Upvotes: 3
Reputation: 4410
Suppose the code with the line above is called from a loop:
void bar() {
    for (int i = 0; i < 10; i++) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // use xpath variable
    }
}
Here 10 instances of XPath are created. Alternatively, you can hoist the xpath variable declaration out of the loop so that only 1 instance is created:
void bar() {
    XPath xpath = XPathFactory.newInstance().newXPath();
    for (int i = 0; i < 10; i++) {
        // use xpath variable
    }
}
This is the simplest case of caching, i.e. reusing some resource instead of recreating it.
Upvotes: 1