Reputation: 5947
Java Strings are immutable, so when you create a string, a block of memory is allocated for it on the heap, and when you assign a new value to the variable, a new block of memory is allocated for the new string and the old one becomes eligible for garbage collection. For example:
String str = func1_return_big_string_1(); // not a literal
str = func2_return_big_string_2(); // not a literal
But since garbage collection takes time to kick in, in practice the heap holds both big string 1 and big string 2 at the same time. That can be an issue for me if this happens a lot.
Is there a way to make big string 2 use the same memory location as big string 1, so that no extra space is needed when big string 2 is assigned to str?
Edit: Thanks for all the input. In the end I realized I shouldn't expect Java code to behave like C++ code (i.e., it has a different memory footprint). I have written a C++11 demo which works as expected: the peak memory footprint is around 20M (the biggest file I was trying to load), with the rvalue reference and move assignment operator kicking in as expected. The demo below was built in VS2012 with C++11.
#include "stdafx.h"
#include <iostream>
#include <string>
#include <vector>
#include <fstream>
#include <thread>
using namespace std;
string readFile(const string &fileName)
{
ifstream ifs(fileName.c_str(), ios::in | ios::binary | ios::ate);
ifstream::pos_type fileSize = ifs.tellg();
ifs.seekg(0, ios::beg);
vector<char> bytes(fileSize);
ifs.read(&bytes[0], fileSize);
return string(&bytes[0], fileSize);
}
class test{
public:
string m_content;
};
int _tmain(int argc, _TCHAR* argv[])
{
string base("c:\\data");
string ext(".bin");
string filename;
test t;
//std::this_thread::sleep_for(std::chrono::milliseconds(5000));
cout << "about to start" << endl;
for(int i=0; i<=50; ++i) {
cout << i << endl;
filename = base + std::to_string(i) + ext;
//rvalue reference & move assignment operator here
//so no unnecessary copy at all
t.m_content = readFile(filename);
cout << "size of content: " << t.m_content.length() << endl;
}
cout << "end" << endl;
system("pause");
return 0;
}
Upvotes: 0
Views: 403
Reputation: 16526
I have just found a MutableString implementation. It is available in Maven Central. Here is an extract from their JavaDoc page:
- Mutable strings occupy little space— their only attributes are a backing character array and an integer;
- their methods try to be as efficient as possible: for instance, if some limitation on a parameter is implied by limitation on array access, we do not check it explicitly, and Bloom filters are used to speed up multi-character substitutions;
- they let you access directly the backing array (at your own risk);
- they implement CharSequence, so, for instance, you can match or split a mutable string against a regular expression using the standard Java API;
- they implement Appendable, so they can be used with Formatter and similar classes;
UPDATE
You can utilize the Appendable interface of this MutableString to read a file with almost zero memory overhead (8KB, which is the default buffer size in Java). With Guava's CharStreams.copy it looks like this:
MutableString str = new MutableString((int) file.length());
CharStreams.copy(Files.newReaderSupplier(file, Charset.defaultCharset()), str);
System.out.println(str);
Upvotes: 1
Reputation: 68907
In order to avoid having both the old and the new String in memory at the same time, you can explicitly allow the GC to clean it up by assigning null to the variable:
String str;
str = func1_return_big_string_1();
str = null; // Now, GC can clean, when it needs extra memory for the String.
str = func2_return_big_string_2();
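One way to see why the null assignment can make a difference: the right-hand side of an assignment is evaluated before the variable is overwritten, so the old value is still referenced while the second function runs. Below is a minimal sketch of that ordering, not a GC measurement. The class name, the static field, and func1/func2 are hypothetical stand-ins for the functions above. A field is used on purpose, because for a local variable the JIT may already treat a dead value as unreachable.

```java
// Sketch: shows what the variable holds WHILE the second producer runs.
class NullBeforeReassign {
    // A field keeps its referent strongly reachable until it is overwritten.
    static String str;
    static boolean oldStillReferenced;

    // Hypothetical stand-in for func1_return_big_string_1().
    static String func1() {
        return new String(new char[1000]);
    }

    // Hypothetical stand-in for func2_return_big_string_2().
    static String func2() {
        // Record whether the field still points at the previous string
        // at the moment the new one is being built.
        oldStillReferenced = (str != null);
        return new String(new char[1000]);
    }

    static boolean withoutNull() {
        str = func1();
        str = func2(); // func2 runs first, then str is overwritten
        return oldStillReferenced;
    }

    static boolean withNull() {
        str = func1();
        str = null;    // old string is unreferenced before func2 starts
        str = func2();
        return oldStillReferenced;
    }

    public static void main(String[] args) {
        System.out.println("old string referenced during func2, without null: " + withoutNull()); // true
        System.out.println("old string referenced during func2, with null: " + withNull());       // false
    }
}
```

Whether the collector actually reclaims the old string in that window is still up to the JVM; the sketch only shows that without the null assignment it cannot.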
UPDATE: To support my claim, I wrote a test case that proves I'm right: http://ideone.com/BwGfSN. The code demonstrates the difference between (using the Finalizer):
GCTest test;
// Without the null assignment
test = create(0);
test = create(1);
test = null;
System.gc();
try {Thread.sleep(10);} catch (Exception e){}
System.out.println();
// With the null assignment
test = create(2);
test = null;
test = create(3);
test = null;
System.gc();
Upvotes: 1
Reputation: 16526
I see several options:
- Use a plain char[].
- Turn StringBuilder into your own version, MyStringBuilder, with a public reusable buffer. The major disadvantage is that it lacks regexes. That's what I did when I needed to boost performance.
- Turn String into a MutableString with a public reusable buffer. I don't think there would be a problem adding your custom regex matcher, as there are plenty of them available.
Upvotes: 2
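The reusable-buffer idea from the options above can be sketched roughly as follows. This is only an illustration under assumptions: ReusableCharBuffer and its methods are hypothetical names, not an existing library class, and the inputs are read from a StringReader just to keep the example self-contained. The point is that the second "big string" overwrites the same char[] instead of allocating a new one, as long as it fits.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;

// Hypothetical helper: one growable char[] that is refilled for every input.
class ReusableCharBuffer {
    private char[] chars;
    private int length;

    ReusableCharBuffer(int initialCapacity) {
        chars = new char[Math.max(16, initialCapacity)];
    }

    // Overwrites the previous content with everything the reader yields.
    void fill(Reader in) throws IOException {
        length = 0;
        int n;
        while ((n = in.read(chars, length, chars.length - length)) != -1) {
            length += n;
            if (length == chars.length) {
                // Grow only when the existing array is full.
                chars = Arrays.copyOf(chars, chars.length * 2);
            }
        }
    }

    int length() {
        return length;
    }

    int capacity() {
        return chars.length;
    }

    // Creates a copy; call it only when an immutable String is really needed.
    @Override
    public String toString() {
        return new String(chars, 0, length);
    }

    public static void main(String[] args) throws IOException {
        ReusableCharBuffer buf = new ReusableCharBuffer(64);
        buf.fill(new StringReader("big string 1"));
        System.out.println(buf + " (capacity " + buf.capacity() + ")");
        buf.fill(new StringReader("big string 2"));
        System.out.println(buf + " (capacity " + buf.capacity() + ")");
    }
}
```

The trade-off is that the content is no longer a String, so any API that requires one forces a copy through toString().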
Reputation: 31658
It shouldn't really matter for non-interned Strings. If you start running out of memory, the garbage collector will remove any objects that are no longer referenced.
Interned Strings are much harder to collect; see Garbage collection of String literals for details.
EDIT: A non-interned String is just like a normal object. Once there are no more references to it, it will get garbage collected.
If str is the only reference left pointing to the original String and str is changed to point to something else, then the original String is eligible for garbage collection. So you no longer have to worry about running out of memory because the JVM will collect it if memory is required.
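A small way to observe this, as a sketch only: a WeakReference does not keep its referent alive, so it can be used to watch whether the original String is still around after str has been reassigned. The class and method names here are hypothetical, and System.gc() is only a hint to the JVM, so the printed result of the second check is not guaranteed.

```java
import java.lang.ref.WeakReference;

// Sketch: watch a String through a WeakReference before and after reassignment.
class EligibleForGc {
    // Hypothetical stand-in for a function returning a big, non-interned String.
    static String makeBig() {
        return new String(new char[1_000_000]);
    }

    // While str still strongly references the String, the collector must keep it.
    static boolean aliveWhileReferenced() {
        String str = makeBig();
        WeakReference<String> watcher = new WeakReference<>(str);
        System.gc();
        // Comparing against str keeps the strong reference in use up to this point.
        return watcher.get() == str;
    }

    public static void main(String[] args) {
        System.out.println("alive while referenced: " + aliveWhileReferenced());

        String str = makeBig();
        WeakReference<String> watcher = new WeakReference<>(str);
        str = makeBig(); // the first String now has no strong reference left
        System.gc();     // a request, not a command
        System.out.println("first string collected: " + (watcher.get() == null));
        System.out.println("new string length: " + str.length());
    }
}
```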
Upvotes: 1