Reputation: 880
We have the behaviour that the Java compiler will use the same instance if we use a string constant:
String a = "abc";
String b = "abc";
//a == b
String c = new String("abc");
// c is a brand new object on the heap;
Why doesn't the Java compiler optimize out the new String and substitute it with the equivalent assignment? Was there some deep design decision behind this, or is it just a coincidence? Can we expect a different JVM or compiler to be more aggressive and actually replace heap instances of immutable objects with well-known static ones? While String is the most notorious example, we could have the same behaviour for Integer, for example.
Upvotes: 0
Views: 104
Reputation: 109547
First of all, the String(String) "copy" constructor stems from the early days and is an anomaly. Maybe because of String.intern(), which does a bit of copy prevention, as do the constants "...". It is never needed, as String is an immutable final class.
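To make the difference concrete, here is a small sketch of how literals, new String(...) and String.intern() behave with respect to identity. The class and method names (StringIdentityDemo and so on) are made up for this illustration; only the String calls themselves come from the JDK.

```java
// Hypothetical demo class; the names are illustrative only.
class StringIdentityDemo {

    // Two identical literals refer to the same interned instance.
    static boolean literalsShareInstance() {
        String a = "abc";
        String b = "abc";
        return a == b;
    }

    // 'new' must yield a distinct object, even though the content is equal.
    static boolean constructorMakesDistinctButEqualObject() {
        String a = "abc";
        String c = new String("abc");
        return a != c && a.equals(c);
    }

    // intern() maps the copy back to the pooled instance used by the literal.
    static boolean internReturnsPooledInstance() {
        String a = "abc";
        String c = new String("abc");
        return c.intern() == a;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance());
        System.out.println(constructorMakesDistinctButEqualObject());
        System.out.println(internReturnsPooledInstance());
    }
}
```

All three methods should return true, which is why replacing new String("abc") with the literal would change what == observes.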
For Integer there is Integer.valueOf(int), which uses a cache of instances that by default holds -128 up to 127.
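A minimal sketch of what that cache means in practice, assuming default JVM settings. IntegerCacheDemo and its methods are hypothetical names. Outside the -128 to 127 range identity is not guaranteed either way, so the sketch only relies on equals() there.

```java
// Hypothetical demo class; the names are illustrative only.
class IntegerCacheDemo {

    // Within -128..127, valueOf is specified to return cached instances,
    // so both calls yield the same object.
    static boolean sameInstanceInCachedRange() {
        Integer x = Integer.valueOf(127);
        Integer y = Integer.valueOf(127);
        return x == y;
    }

    // For arbitrary values only equality by value is safe to rely on.
    static boolean equalByValue(int v) {
        Integer x = Integer.valueOf(v);
        Integer y = Integer.valueOf(v);
        return x.equals(y);
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceInCachedRange());
        System.out.println(equalByValue(100_000));
    }
}
```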
Despite the very competent compiler development team involved, the Java byte code compiler compiles very naively. But then, in the step from byte code to machine code, some nice things may happen. For instance, an object may not be created on the heap as such, but on the stack.
Simplistic compilation is at least less likely to contain errors in the dataflow analysis behind a smart trick. (It also provides a good reason for good code style.)
An example:
List<String> list = ...
String[] array1 = list.toArray(new String[0]);
String[] array2 = list.toArray(new String[list.size()]);
toArray needs an actual array instance because, due to type erasure, the List list no longer knows that it contains Strings.
Historically, as an optimization, one could pass an array of fitting size (here the version with list.size()), which would then be returned. That was considered more optimal and faster, and some style checkers still flag the first version. However, the first version is actually faster, as a different array instantiation is used in byte code, and array1 will be generated fractionally faster.
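A small sketch of the two call styles, using a hypothetical ToArrayDemo class. It only shows the documented contract (an array that fits is filled and returned, one that is too small is replaced by a new array of the same runtime type). It does not measure which variant is faster.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class ToArrayDemo {

    // Zero-length array: toArray allocates a new String[] of the right size.
    static String[] withEmptyArray(List<String> list) {
        return list.toArray(new String[0]);
    }

    // Pre-sized array: toArray fills the array it was given.
    static String[] withSizedArray(List<String> list) {
        return list.toArray(new String[list.size()]);
    }

    // When the passed array fits, that very instance is returned.
    static boolean returnsPassedArrayWhenItFits(List<String> list) {
        String[] sized = new String[list.size()];
        return list.toArray(sized) == sized;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("a", "b", "c");
        System.out.println(Arrays.toString(withEmptyArray(list)));
        System.out.println(Arrays.toString(withSizedArray(list)));
        System.out.println(returnsPassedArrayWhenItFits(list));
    }
}
```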
The same story applies to division by some numbers. In C there are many compiler optimisations involving faster shifts. In Java this is (partly) done in the byte code to machine code compilation, a more logical place for these optimisations.
I personally think an optimizing byte code compiler would be nice, maybe something for university projects. However, it might not be justifiable just for code improvements, like not using .equals for enum values.
Upvotes: 1
Reputation: 3097
Strings are a bit different from other objects, as they are widely used and often act as "native types" (like int, float, ...) but are in fact backed by arrays (i.e. they do not have a fixed memory size). Storing the same content over and over wastes memory (and that has happened to me before). String interning was introduced to save developers the hassle of writing their own String pool.
The compiler interns String constants automatically. Integers can have the same mechanism, but you need to call it explicitly through Integer.valueOf(int).
In your case, paraphrasing @BenjaminUrquhart, you explicitly told it to create a new instance by calling new, and new is required to create a new instance. There are situations where that is required, e.g. when you call obj.clone(), you expect a new object, not a new reference to obj.
Note that, in the case of clone(), returning a new instance does not sound mandatory, but rather a "general intent" (quoting Javadoc):
[clone()] Creates and returns a copy of this object. The precise meaning of "copy" may depend on the class of the object. The general intent is that, for any object x, the expression: x.clone() != x will be true, (...)
So strictly speaking, it appears that you could return the same instance in that case, but it is not considered "good practice" (or at least, not something expected).
I guess it could have to do with shallow copies of arrays, where the cloned array itself is a different instance, but each element is a reference to the same object as in the original array (see JLS §10.7), so the returned copy is not a totally independent copy of the original object.
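The shallow-copy behaviour of array clone() can be sketched as follows. ArrayCloneDemo and its methods are hypothetical names chosen for the example, and StringBuilder is used only because it is a convenient mutable element type.

```java
// Hypothetical demo class; the names are illustrative only.
class ArrayCloneDemo {

    // The clone is a different array object than the original.
    static boolean cloneIsDifferentArray(StringBuilder[] original) {
        return original.clone() != original;
    }

    // The elements are not copied: both arrays reference the same objects.
    static boolean elementsAreShared(StringBuilder[] original) {
        StringBuilder[] copy = original.clone();
        for (int i = 0; i < original.length; i++) {
            if (copy[i] != original[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        StringBuilder[] data = { new StringBuilder("x"), new StringBuilder("y") };
        System.out.println(cloneIsDifferentArray(data));
        System.out.println(elementsAreShared(data));
    }
}
```

Mutating an element through the copy is therefore visible through the original array as well.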
Upvotes: 0