Reputation: 1449
I am researching the memory leak that was caused by the substring() method in Java. I have read a few articles on the internet and learned something, but I ended up a little confused. One article says: "substring method inside String class, calls String (int offset, int count, char value []) constructor to create new String object. What is interesting here is, value[], which is the same character array used to represent original string". This is clear to me.
It also says: "If the original string is very long, and has array of size 1GB, no matter how small a substring is, it will hold 1GB array. This will also stop original string to be garbage collected, in case if doesn't have any live reference".
The article is here: http://javarevisited.blogspot.sg/2011/10/how-substring-in-java-works.html
I cannot understand this. Can anybody explain how it stops the original string from being garbage collected if it doesn't have any live reference?
As far as I understand, in Java any object that has no direct or indirect reference is eligible for garbage collection. If the substring() method uses the same character array that represents the original string, how can the original String stay alive without being garbage collected?
I would also like to know what the memory leak actually consists of: is it that the new string (the String returned by substring()) shares the original character array even though it is much smaller than the original String, or is it that the original string itself stays around without being garbage collected?
Can anybody help me clarify this?
Upvotes: 0
Views: 859
Reputation: 13
In line with @chiastic-security's answer, see also this reference for a detailed explanation of how Java 8 prevents this memory leak:
http://mrgyani.com/java-interview/memory-leak-problem-in-java-substring/
Upvotes: 0
Reputation: 20520
When you allocate anything in Java, and in this case a String, it stays in memory for as long as you still have a way of accessing it. If you write
String s = "blah";
then s is a reference to that String. As long as there's at least one reference to the String, the JVM knows it needs to be kept around in memory. Once there are no more references, the JVM can clean it up and reclaim the memory when it needs to.
If you write
String s = "myverylongstring...somemore...someotherstuff";
then you've used up quite a lot of memory. Internally, this will be represented as a char[] containing all of the characters of your String, with some extra gubbins round the edge to allow you to access it using the String API.
Now, what happens if you then write
s = s.substring(20,30);
so that s is now a reference just to part of the original String? There are two ways the JVM might deal with it. One is to keep the whole char[] in memory, but just change which bit of the char[] it's prepared to let you get access to. The other is to copy the relevant bit of the char[] into a new array, and then get rid of all references to the old one.
The first approach is quick, because it doesn't need any copying. But if you now only want a small bit of your huge String, then the whole of the big String can't ever be fully cleaned up because the whole of its char[] is still in memory being used. The fact that only part of it is being used means that none of its memory can be reclaimed.
The second approach is slower, because it needs a copy operation. But it does mean that the previous String can be cleaned up in its entirety, because the old char[] isn't being used any more.
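The two approaches can be sketched with a small stand-in class. This is only an illustration: `Slice`, `sharedSubstring`, `copiedSubstring` and `retainedChars` are hypothetical names made up for this example, not the actual `java.lang.String` internals.

```java
import java.util.Arrays;

/** Hypothetical model of a string backed by a char[]; not the real java.lang.String. */
public final class Slice {
    private final char[] value; // backing array
    private final int offset;   // index of the first visible character
    private final int count;    // number of visible characters

    private Slice(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    public static Slice of(String s) {
        return new Slice(s.toCharArray(), 0, s.length());
    }

    /** First approach: keep the whole array and only move offset and count. */
    public Slice sharedSubstring(int begin, int end) {
        checkRange(begin, end);
        return new Slice(value, offset + begin, end - begin);
    }

    /** Second approach: copy just the requested characters into a new array. */
    public Slice copiedSubstring(int begin, int end) {
        checkRange(begin, end);
        char[] copy = Arrays.copyOfRange(value, offset + begin, offset + end);
        return new Slice(copy, 0, copy.length);
    }

    private void checkRange(int begin, int end) {
        if (begin < 0 || end > count || begin > end) {
            throw new StringIndexOutOfBoundsException("begin " + begin + ", end " + end);
        }
    }

    /** Size of the array this slice keeps reachable, regardless of its visible length. */
    public int retainedChars() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        char[] big = new char[1_000_000];
        Arrays.fill(big, 'a');
        Slice s = Slice.of(new String(big));

        System.out.println(s.sharedSubstring(20, 30).retainedChars()); // prints 1000000
        System.out.println(s.copiedSubstring(20, 30).retainedChars()); // prints 10
    }
}
```

Both slices show the same ten characters, but the shared one keeps the entire million-character array reachable for as long as it lives.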
Java used to follow the first approach, but in more recent versions (Java 7 update 6 onwards) it switched to the second. So memory leaks used to be a problem with the substring() call, but aren't any longer.
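For code that still had to run on an older JVM, a commonly cited workaround was to wrap the result in `new String(...)`. In those versions that constructor copied only the characters actually in use when the backing array was larger. A minimal sketch, where `DefensiveCopy` and `detachedSubstring` are hypothetical names for illustration:

```java
public final class DefensiveCopy {

    /**
     * Returns the requested substring wrapped in a new String.
     * Assumption: on JVMs older than 7u6, new String(String) trims the
     * shared backing array to the characters in use. On newer JVMs
     * substring() already copies, so the extra wrapping is redundant
     * but harmless.
     */
    public static String detachedSubstring(String s, int begin, int end) {
        return new String(s.substring(begin, end));
    }

    public static void main(String[] args) {
        String s = "myverylongstring...somemore...someotherstuff";
        s = detachedSubstring(s, 20, 30);
        System.out.println(s); // prints omemore...
    }
}
```

On a current JVM the plain `s.substring(20, 30)` call is enough.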
Upvotes: 6
Reputation: 691625
The wording is a bit sloppy. The original String instance will be garbage-collected. But its internal array of characters won't, since it's still referenced by the substring.
In an extreme case, you could thus have a String instance of length 1 referencing a 1GB array.
Before:

s1 ---------> object --> char[] (1GB large)
              - offset
              - count

substring is called:

s1 ---------> object --> char[] (1GB large)
              - offset   ^
              - count    |
                         |
substring --> object ---/
              - offset
              - count

s1 goes out of scope: object can be garbage collected

                         char[] (1GB large)
                         ^
                         |
substring --> object ---/
              - offset
              - count
Upvotes: 1