Reputation: 1712
I am trying to understand the internals of the JVM and the HotSpot optimizer.
I am tackling the problem of initializing object tree structures with a very large number of nodes as fast as possible. Right now, for every tree structure given, we generate Java source code that initializes the tree as follows. In the end, we have thousands of these classes.
public class TypeATreeNodeInitializer {

    public TypeATreeNode initialize() {
        return getTypeATree();
    }

    private TypeATreeNode getTypeATree() {
        TypeATreeNode node = StaticTypeAFactory.create();
        TypeBTreeNode child1 = getTypeBTreeNode1();
        node.getChildren().add(child1);
        TypeBTreeNode child2 = getTypeBTreeNode2();
        node.getChildren().add(child2);
        //... may be many more children
        return node;
    }

    private TypeBTreeNode getTypeBTreeNode1() {
        TypeBTreeNode node = StaticTypeBFactory.create();
        TypeBTreeNode child1 = getTypeCTreeNode1();
        node.getChildren().add(child1);
        // store of value in variable first
        String value1 = "Some value";
        // assign value to node
        node.setSomeValue(value1);
        boolean value2 = false;
        node.setSomeBooleanValue(value2);
        return node;
    }

    private TypeBTreeNode getTypeCTreeNode1() {
        // ...
        return null;
    }

    private TypeBTreeNode getTypeBTreeNode2() {
        // ...
        return null;
    }

    //... many more child node getter / initializer
}
As you can see, the values to be assigned to the tree nodes are stored inside local variables first. Looking at the generated byte code, this results in:
A load of the constant from the constant pool onto the stack // e.g. String “Some Value”
A store of that value into a local variable
A load of the method target onto the stack // e.g. TypeBTreeNode
A load of the value from the local variable // “Some Value”
The invocation of the setter
Yet this could be written shorter by not storing into a local variable and directly passing the parameters. So, it becomes just:
pushing the method target onto the stack // e.g. TypeBTreeNode
then loading the constant onto the stack // “Some Value”
then invoking the setter
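To make the two sequences concrete, here is a minimal sketch of both variants side by side. DemoNode and LocalVersusDirect are hypothetical stand-ins, not the generated classes from the question, and the instruction names in the comments are what javac typically emits for this shape of code; running javap -c on the compiled class is the way to confirm it for your own output.

```java
// Hypothetical stand-in for a generated node type; only the setter matters here.
class DemoNode {
    private String someValue;

    void setSomeValue(String someValue) {
        this.someValue = someValue;
    }

    String getSomeValue() {
        return someValue;
    }
}

// Hypothetical holder for the two variants discussed above.
class LocalVersusDirect {

    // Variant with a local: constant -> local variable -> setter argument.
    // Typical byte code for the last two statements:
    //   ldc, astore (value1), aload (node), aload (value1), invokevirtual
    static DemoNode withLocal() {
        DemoNode node = new DemoNode();
        String value1 = "Some value";
        node.setSomeValue(value1);
        return node;
    }

    // Direct variant: no intermediate local.
    // Typical byte code for the setter call:
    //   aload (node), ldc, invokevirtual
    static DemoNode direct() {
        DemoNode node = new DemoNode();
        node.setSomeValue("Some value");
        return node;
    }

    public static void main(String[] args) {
        // Both variants leave the node in the same state.
        System.out.println(withLocal().getSomeValue().equals(direct().getSomeValue()));
    }
}
```

Both methods produce the same observable result; the only difference is the extra store/load pair in the first one.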
I know that in other languages (e.g. C++) compilers are capable of such optimizations.
In Java, the HotSpot optimizer is responsible for such magic at runtime.
However, as far as I understand the docs, HotSpot only kicks in after the 500th call of a method (client VM).
Questions:
Do I understand correctly: if I initialize every tree only once, but do that for a large number (let’s say 10,000) of generated TreeInitializers, the first byte code sequence is executed for every TreeInitializer, as they are different classes with different methods and every method is called just once?
I expect a significant speedup from rewriting the generator to use no locals, as I would save about a third of the byte code instructions and possibly expensive loads of the variables. I know that this is hard to tell without measuring, but altering the generator's code is non-trivial, so would you think it is worth a try?
Upvotes: 3
Views: 587
Reputation: 43987
Before optimizing, the JVM interprets your byte code instruction by instruction and profiles its behavior. Based on these observations, it compiles the code to machine code. For this reason, it is difficult to give general advice here. You should, however, treat your byte code as an abstraction of the program, not as a direct indicator of performance.
A few rules of thumb:
Upvotes: 2
Reputation: 1861
The first rule of Optimize Club is "don't optimize." That said...
There is already no point in assigning a value to a local (stack) variable only to reference it once. If I were reviewing this code, I would have the author remove the assignment and just pass the results of get...() to add().
This is not a "premature optimization" but a code simplification (code quality) issue. The fact that it eliminates some byte codes is usually not a consideration either, as the JIT compiler will optimize the code at run time. In this case, because these initializers sound like they will only be run once, the threshold for this optimization will likely never be met, so there will be value in eliminating the unnecessary stack assign and load.
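As a rough sketch of that simplification, the children can be passed straight into add() without naming them first. SketchNode, SimplifiedInitializer, and the createChild methods are hypothetical placeholders for the generated types and getters in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type standing in for TypeATreeNode / TypeBTreeNode.
class SketchNode {
    private final String name;
    private final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    List<SketchNode> getChildren() {
        return children;
    }
}

// Hypothetical initializer: each child getter result goes directly into add().
class SimplifiedInitializer {

    SketchNode initialize() {
        SketchNode node = new SketchNode("root");
        node.getChildren().add(createChild1());
        node.getChildren().add(createChild2());
        return node;
    }

    private SketchNode createChild1() {
        return new SketchNode("child1");
    }

    private SketchNode createChild2() {
        return new SketchNode("child2");
    }

    public static void main(String[] args) {
        SketchNode root = new SimplifiedInitializer().initialize();
        System.out.println(root.getChildren().size() + " children attached");
    }
}
```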
Upvotes: 0
Reputation: 5215
Removing temporary/stack variables like this is almost always premature optimization. Your processor can handle hundreds of millions of these instructions per second; meanwhile, if you're initializing tens of thousands of anything, your program is probably going to be blocking at some point waiting on memory allocation.
My advice is always going to be to hold off on optimizations until you've profiled your code. In the meantime, write code that is as easy to read as possible, so that when you do need to come back and modify something, it's easy to find the places that need to be updated.
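A first, very coarse measurement could look like the sketch below, which times a single tree construction with System.nanoTime(). TimedNode and InitTiming are hypothetical names, and the setup differs from the question in one important way: it reuses one recursive method, which the JIT may compile, whereas the generated initializers run each method once. Treat it as a starting point before reaching for a real profiler, not as a benchmark of the generated code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal node type used only for this timing sketch.
class TimedNode {
    final List<TimedNode> children = new ArrayList<>();
}

class InitTiming {

    // Builds a complete tree with the given depth and number of children per node.
    static TimedNode build(int depth, int fanOut) {
        TimedNode node = new TimedNode();
        if (depth > 0) {
            for (int i = 0; i < fanOut; i++) {
                node.children.add(build(depth - 1, fanOut));
            }
        }
        return node;
    }

    // Counts all nodes in the tree, including the root.
    static int count(TimedNode node) {
        int total = 1;
        for (TimedNode child : node.children) {
            total += count(child);
        }
        return total;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        TimedNode root = build(4, 10);
        long elapsedNanos = System.nanoTime() - start;
        System.out.println(count(root) + " nodes built in "
                + (elapsedNanos / 1_000) + " microseconds");
    }
}
```

Comparing such a number for both generator variants on the real generated classes would show whether the extra store/load pairs are measurable next to allocation and class loading.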
Upvotes: 2