RoundPi

Reputation: 5947

How to use less memory when assigning a big new String

Java Strings are immutable, so when you create a string, a block of memory is allocated for it on the heap. When you assign a new value to the variable, a new block of memory is allocated for the new string and the old one becomes eligible for garbage collection. For example:

String str = func1_return_big_string_1(); // not a literal
str = func2_return_big_string_2();        // not a literal

But garbage collection takes time to kick in, so in practice the heap holds both big string 1 and big string 2 at the same time. This can be an issue for me if it happens a lot.

Is there a way to make big string 2 use the same memory location as big string 1, so that no extra space is needed when big string 2 is assigned to str?

Edit: Thanks for all the input. In the end I realized I shouldn't expect Java code to behave like C++ code (i.e., it has a different memory footprint). I have written a C++11 demo that works as expected: the peak memory footprint is around 20 MB (the biggest file I was trying to load), with the rvalue reference and move assignment operator kicking in as expected. The demo below was built in VS2012 with C++11.

#include "stdafx.h"
#include <iostream>
#include <string>
#include <vector>
#include <fstream>
#include <thread>
using namespace std;

string readFile(const string &fileName)
{
    ifstream ifs(fileName.c_str(), ios::in | ios::binary | ios::ate);

    ifstream::pos_type fileSize = ifs.tellg();
    ifs.seekg(0, ios::beg);

    vector<char> bytes(fileSize);
    ifs.read(&bytes[0], fileSize);

    return string(&bytes[0], fileSize);
}

class test{
public:
    string m_content;
};

int _tmain(int argc, _TCHAR* argv[])
{
    string base("c:\\data");
    string ext(".bin");
    string filename;
    test t;
    //std::this_thread::sleep_for(std::chrono::milliseconds(5000));
    cout << "about to start" << endl;
    for(int i=0; i<=50; ++i) {
        cout << i << endl;
        filename = base + std::to_string(i) + ext;
        //rvalue reference & move assignment operator here
        //so no unnecessary copy at all
        t.m_content = readFile(filename);
        cout << "size of content " << t.m_content.length() << endl;
    }
    cout << "end" << endl;
    system("pause");
    return 0;
}

Upvotes: 0

Views: 403

Answers (5)

Andrey Chaschev

Reputation: 16526

I have just found a MutableString implementation. It is available in Maven Central. Here is an extract from their JavaDoc page:

  • Mutable strings occupy little space: their only attributes are a backing character array and an integer;
  • their methods try to be as efficient as possible: for instance, if some limitation on a parameter is implied by limitation on array access, we do not check it explicitly, and Bloom filters are used to speed up multi-character substitutions;
  • they let you access directly the backing array (at your own risk);
  • they implement CharSequence, so, for instance, you can match or split a mutable string against a regular expression using the standard Java API;
  • they implement Appendable, so they can be used with Formatter and similar classes;

UPDATE

You can use the Appendable interface of MutableString to read a file with almost zero memory overhead (8 KB, which is the default buffer size in Java). With Guava's CharStreams.copy it looks like this:

MutableString str = new MutableString((int) file.length());
CharStreams.copy(Files.newReaderSupplier(file, Charset.defaultCharset()), str);
System.out.println(str);

Full working example.
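
For readers who cannot pull in MutableString or Guava, roughly the same idea can be sketched with the JDK alone. This is not the code from the linked example: it swaps MutableString for a pre-sized StringBuilder (which is also an Appendable) and replaces CharStreams.copy with a hand-written loop. The class name AppendableCopy, the method copy, and the in-memory StringReader standing in for a file reader are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.CharBuffer;
import java.util.Arrays;

class AppendableCopy {
    // Copies everything from the reader into the target through one small,
    // fixed-size buffer, so the only extra memory is the 8 KB chunk.
    static long copy(Reader from, Appendable to) throws IOException {
        char[] chunk = new char[8 * 1024];
        long total = 0;
        int n;
        while ((n = from.read(chunk)) != -1) {
            // Wrap the filled part of the chunk; no intermediate String is created.
            to.append(CharBuffer.wrap(chunk, 0, n));
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for a big file: 20,000 'x' characters.
        char[] data = new char[20_000];
        Arrays.fill(data, 'x');
        Reader source = new StringReader(new String(data));

        // Pre-size the target so it does not have to grow while copying.
        StringBuilder target = new StringBuilder(data.length);
        long copied = copy(source, target);
        System.out.println(copied + " chars copied, target length " + target.length());
    }
}
```

With a real file, the reader would come from something like java.nio.file.Files.newBufferedReader, and the target would be sized from the file length as in the snippet above.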

Upvotes: 1

Martijn Courteaux

Reputation: 68907

To avoid having both the old and the new String in memory at the same time, you can explicitly allow the GC to clean up the old one by assigning null to the variable before creating the new one:

String str;
str = func1_return_big_string_1();
str = null; // Now the GC can reclaim the first String when it needs memory for the next one.
str = func2_return_big_string_2();

UPDATE: To support my claim, I wrote a test case that proves I'm right: http://ideone.com/BwGfSN. The code uses a finalizer to demonstrate the difference between the two cases:

GCTest test;
// Without the null assignment
test = create(0);
test = create(1);
test = null;
System.gc();

try {Thread.sleep(10);} catch (Exception e){}
System.out.println();

// With the null assignment
test = create(2);
test = null;
test = create(3);
test = null;
System.gc();
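
The same effect can also be observed without finalizers. The following is a minimal sketch, not the code behind the ideone link: it tracks the first string through a WeakReference, which does not keep the object alive. The class name NullBeforeReassign and the helper bigString (a stand-in for func1_return_big_string_1 and func2_return_big_string_2) are hypothetical, and System.gc() is only a hint, so the printed result may vary between JVMs.

```java
import java.lang.ref.WeakReference;
import java.util.Arrays;

class NullBeforeReassign {
    // Hypothetical stand-in for the question's func1/func2: builds a non-literal String.
    static String bigString(char fill, int length) {
        char[] chars = new char[length];
        Arrays.fill(chars, fill);
        return new String(chars);
    }

    public static void main(String[] args) {
        String str = bigString('a', 10_000_000);
        // A weak reference lets us check for collection without keeping the String alive.
        WeakReference<String> first = new WeakReference<>(str);

        str = null;   // drop the only strong reference before the next allocation
        System.gc();  // only a request; the JVM is free to ignore it

        System.out.println("first string collected: " + (first.get() == null));

        str = bigString('b', 10_000_000);
        System.out.println("second string length: " + str.length());
    }
}
```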

Upvotes: 1

Andrey Chaschev

Reputation: 16526

I see several options:

  1. Use char[].
  2. Copy StringBuilder into your own MyStringBuilder class with a public reusable buffer. The major disadvantage is that it lacks regexes. That's what I did when I needed to boost performance.
  3. Hack for JDK <=6: there is a protected constructor to reuse strings/wrap char buffers. It's not there anymore in JDK 7+. One needs to be really cautious with this, but it's not a problem if you have a C/C++ background.
  4. Copy String into a MutableString class with a public reusable buffer. I don't think there would be a problem adding your own regex matcher, as there are plenty of them available.
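
Option 1 could look something like the sketch below. It is only an illustration under stated assumptions: the class ReusableCharBuffer and its methods load, view and capacity are hypothetical names, and a StringReader stands in for a real file reader. One char[] is refilled for every load, and java.nio.CharBuffer.wrap exposes it as a CharSequence, so the standard regex API can run over it without copying.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.CharBuffer;
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReusableCharBuffer {
    private char[] chars;
    private int length;

    ReusableCharBuffer(int capacity) {
        chars = new char[Math.max(1, capacity)];
    }

    // Overwrites the previous content in place; the array grows only when it is full.
    void load(Reader in) throws IOException {
        length = 0;
        while (true) {
            if (length == chars.length) {
                chars = Arrays.copyOf(chars, chars.length * 2);
            }
            int n = in.read(chars, length, chars.length - length);
            if (n == -1) {
                break;
            }
            length += n;
        }
    }

    // A view over the filled part of the array; no characters are copied.
    CharSequence view() {
        return CharBuffer.wrap(chars, 0, length);
    }

    int capacity() {
        return chars.length;
    }

    public static void main(String[] args) throws IOException {
        ReusableCharBuffer buffer = new ReusableCharBuffer(64);
        Pattern digits = Pattern.compile("\\d+");
        for (String content : new String[] {"file 1 has 10 records", "file 2 has 7"}) {
            buffer.load(new StringReader(content)); // reuses the same char[]
            Matcher m = digits.matcher(buffer.view());
            while (m.find()) {
                System.out.println(m.group());
            }
        }
    }
}
```

Sizing the buffer for the largest expected input up front avoids the doubling step, which briefly holds both the old and the new array.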

Upvotes: 2

dkatzel

Reputation: 31658

It shouldn't really matter for non-interned Strings. If you start running out of memory, the garbage collector will remove any objects that are no longer referenced.

Interned Strings are much harder to collect; see Garbage collection of String literals for details.

EDIT: A non-interned String is just like a normal object. Once there are no more references to it, it will get garbage collected.

If str is the only reference left pointing to the original String and str is changed to point to something else, then the original String is eligible for garbage collection. So you don't have to worry about running out of memory, because the JVM will collect it if memory is required.

Upvotes: 1

user2056437

Reputation:

Use StringBuffer and StringBuffer.append().
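
A minimal sketch of what that might look like, assuming the goal is to keep one buffer and refill it for each new value. The class name ReusedBuffer and the helper refill are hypothetical. setLength(0) empties the buffer but keeps its backing array, so later appends that fit within the capacity allocate nothing new.

```java
class ReusedBuffer {
    // Replaces the buffer's content in place; the backing array is kept as long as the content fits.
    static void refill(StringBuffer target, CharSequence content) {
        target.setLength(0);    // logically empties the buffer, capacity is unchanged
        target.append(content); // copies the characters into the existing array
    }

    public static void main(String[] args) {
        StringBuffer buffer = new StringBuffer(1 << 20); // room for about one million chars
        String[] contents = {"first big string", "second big string"};
        for (String content : contents) {
            refill(buffer, content);
            System.out.println(buffer.length() + " chars, capacity " + buffer.capacity());
        }
    }
}
```

Note that calling toString() on the buffer allocates a fresh String copy, so the saving only holds if the consumer can work with the buffer as a CharSequence. In single-threaded code, StringBuilder offers the same methods without synchronization.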

Upvotes: 2
