Reputation: 188
We have a Java program with large objects: tree structures, ArrayLists and MultiMaps.
The problem I'm having is that we have allocated 3 GB of heap memory, but it still runs out of space.
I'm wondering if anyone here can suggest a way to store these objects outside the heap and read chunks of data back into the Java program on an as-needed basis for each processing call. I want to store them in files rather than a database, for other reasons.
I came across "memory-mapped files", and someone suggested "Protocol Buffers" on a related question. Both are alien concepts to me at the moment, and I'm wondering if there is an easy way. I also couldn't find good examples of either concept.
Would really appreciate your help on this.
Performance is a very important consideration, and I'm aware of JVM heap allocation, but I'm not looking to increase the JVM heap size.
Upvotes: 1
Views: 1913
Reputation: 45171
Protocol Buffers does not work well with memory-mapped files, because the file contains the encoded data, which must first be decoded before you can use it. This decoding step generates heap objects. You might be able to use Protobufs with memory-mapped files if you split the file into lots of small messages which you decode on demand when you need them, but then immediately discard the decoded versions. But you may waste a lot of time repeatedly decoding the same data if you aren't careful.
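Independent of the serialization format, here is a minimal sketch of what memory-mapping a file looks like in plain `java.nio`: the OS pages the bytes in on demand, so the file contents themselves never occupy the Java heap (the file name and contents below are just for illustration).

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedFileDemo {
    public static void main(String[] args) throws IOException {
        // Create a small sample file to map.
        Path path = Files.createTempFile("demo", ".bin");
        Files.write(path, "hello, mapped world".getBytes(StandardCharsets.UTF_8));

        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            // Map the whole file into virtual memory. The bytes live in the
            // OS page cache, not on the Java heap, and are loaded lazily.
            MappedByteBuffer buffer =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());

            // Read an arbitrary chunk on demand.
            byte[] first = new byte[5];
            buffer.get(first);
            System.out.println(new String(first, StandardCharsets.UTF_8)); // prints "hello"
        }
        Files.deleteIfExists(path);
    }
}
```

This is exactly the access pattern Cap'n Proto-style formats exploit: because the on-disk bytes are already in a usable layout, reading a field is just an offset calculation into the mapped buffer, with no decode step producing heap garbage.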
Cap'n Proto is a newer format that is very similar to Protocol Buffers but is explicitly designed to work with memory-mapped files. The on-disk format is designed such that it can be used in-place without a decoding step. We are working on a Java version which should be ready for production use within a few weeks.
(Disclosure: I'm the creator of Cap'n Proto, and also was previously the maintainer of Protocol Buffers at Google.)
Upvotes: 1
Reputation: 46422
You may be able to use immutable collections from Guava, they're usually less memory hungry.
You may be able to use String.intern if strings take up a fair portion of your memory.
You may save a lot using trove4j if you have a lot of boxed primitives.
You may do some small tricks like using smaller datatypes, etc....
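To make the String.intern suggestion concrete, here is a small sketch: when many logically equal strings are held at once (e.g. repeated category names parsed from a file), interning collapses them to one shared pooled instance.

```java
public class InternDemo {
    public static void main(String[] args) {
        // Two distinct heap objects with equal contents, as you would get
        // from parsing the same token out of a file twice.
        String a = new String("category-A");
        String b = new String("category-A");

        System.out.println(a == b);                   // false: two separate copies
        System.out.println(a.intern() == b.intern()); // true: one pooled instance
    }
}
```

If you keep only the interned reference and let the duplicates be collected, N copies of a hot string shrink to one. The same deduplication idea is what trove4j and smaller datatypes buy you for primitives.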
But you really should make your office get more memory before wasting your time on computers that have as much RAM as a smartphone!
Upvotes: 0
Reputation: 533530
You might consider storing data in something like Chronicle Map. This uses off-heap memory and can be stored and accessed without creating any garbage. This lets you reduce the heap size, but you still need to buy a sensible amount of memory. For larger datasets I would suggest having at least 32 GB of memory, whether you use it on heap or off heap.
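Chronicle Map builds on the same off-heap mechanism that plain `java.nio` exposes directly. As a minimal sketch of the underlying idea (not Chronicle Map's actual API), a direct ByteBuffer holds its bytes outside the Java heap, so they do not count toward `-Xmx` and are not scanned by the garbage collector:

```java
import java.nio.ByteBuffer;

public class OffHeapDemo {
    public static void main(String[] args) {
        // allocateDirect places the buffer's storage outside the Java heap.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(1024);

        // Absolute put/get by index: no position changes, no garbage created.
        offHeap.putLong(0, 42L);
        System.out.println(offHeap.getLong(0)); // prints 42
    }
}
```

Libraries like Chronicle Map layer a full Map interface, persistence and sharing between processes on top of memory like this, which is why they can hold large datasets without GC pressure.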
there is no reason i have to go for exotic solutions
In that case, stick to an on heap solution. You can buy 16 GB of memory for around $200.
I'm not looking for increasing JVM heap size.
Ask yourself how much time/money you are willing to invest to avoid increasing the heap. You can certainly do this, but to save 4 GB I wouldn't spend a day on it. To save 40 GB, 400 GB, or 4 TB is a different story.
Upvotes: 1