payloc91
payloc91

Reputation: 3809

Is it safe to store an Entry from a Map? Can it cause memory leaks?

I'm running into this code (adapted with dummy data):

public Map<String, Integer> queryDatabase() {
    final Map<String, Integer> map = new TreeMap<>();
    map.put("one", 1);
    map.put("two", 2);
    // ...
    return map;
}

public Map.Entry<String, Integer> getEntry(int n) {
    final Map<String, Integer> map = queryDatabase();
    for (final Map.Entry<String, Integer> entry : map.entrySet()) {
        if (entry.getValue().equals(n)) return entry; // dummy check
    }
    return null;
}

The Entry is then stored into a newly-created object that is saved into a cache for an undefined period:

class DataBundle {
    Map.Entry<String, Integer> entry;

    public void doAction() {
    this.entry = Application.getEntry(2);
    }
}

While queryDatabase is called multiple times in a minute, the local Maps should be discarded at the consequent gc cycle. I have reason to believe though, that DataBundle keeping an Entry reference prevents the Map from being collected at all.

Besides, a java.util.TreeMap.Entry holds multiple references to siblings:

static final class Entry<K,V> implements Map.Entry<K,V> {
    K key;
    V value;
    Entry<K,V> left;
    Entry<K,V> right;
    Entry<K,V> parent;
    // ...
}

Q: Does storing a Map.Entry into a member field retain the local Map instances into memory?

Upvotes: 0

Views: 1097

Answers (2)

payloc91
payloc91

Reputation: 3809

I have written a benchmarking application and the results clearly demonstrate that the JVM is not able to collect the local Map instances if an Entry reference is kept alive.

This applies only to TreeMaps though and the reason might be that a TreeMap.Entry holds different references to its siblings.

As @OldCurmudgeon mentions,

you should not make any assumptions [and] if you wish to store Key-Value pairs derived from a Map.Entry then you should take copies

I believe that at this point, if you don't know what you are doing, whatever Map you are working with, holding a Map.Entry should be considered evil and anti-pattern.

Always opt for saving copies of Map.Entry or directly store the key and value.


Benchmarking technical data:

JVM

java version "1.8.0_152"
Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)

CPU

Caption           : Intel64 Family 6 Model 158 Stepping 9
DeviceID          : CPU0
Manufacturer      : GenuineIntel
MaxClockSpeed     : 4201
Name              : Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
SocketDesignation : LGA1151

RAM

Model Name                  MaxCapacity MemoryDevices
----- ----                  ----------- -------------
      Physical Memory Array    67108864             4

What will the benchmark do?

  • The main program will (pseudo-)query the database 500 times
  • A new instance of DataBundle will be created at each iteration an will store a random Map.Entry by calling getEntry(int)
  • queryDatabase is going to create a local TreeMap of 100_000 items at each call and getEntry will simply return one Map.Entry from the Map.
  • All DataBundle instances will be stored into an ArrayList cache.
  • How DataBundle stores the Map.Entry will be different among the benchmarks, in order to demonstrate the gc abilities to perform its duty.
  • Every 100 calls to queryDatabase the cache will be cleared: this is to see the effect of the gc in visualvm

Benchmark 1: TreeMap and storing Map.Entry - CRASH

The DataBundle class:

class DataBundle {
    Map.Entry<String, Integer> entry = null;
    public DataBundle(int i) {
        this.entry = Benchmark_1.getEntry(i);
    }
}

The benchmarking application:

public class Benchmark_1 {
    static final List<DataBundle> CACHE = new ArrayList<>();
    static final int MAP_SIZE = 100_000;

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 500; i++) {
            if (i % 100 == 0) {
                System.out.println("Clearing");
                CACHE.clear();
            }
            final DataBundle dataBundle = new DataBundle(new Random().nextInt(MAP_SIZE));
            CACHE.add(dataBundle);
            Thread.sleep(500); // to observe behavior in visualvm
        }
    }

    public static Map<String, Integer> queryDatabase() {
        final Map<String, Integer> map = new TreeMap<>();
        for (int i = 0; i < MAP_SIZE; i++) map.put(String.valueOf(i), i);
        return map;
    }
    public static Map.Entry<String, Integer> getEntry(int n) {
        final Map<String, Integer> map = queryDatabase();
        for (final Map.Entry<String, Integer> entry : map.entrySet())
            if (entry.getValue().equals(n)) return entry;
        return null;
    }
}

The application can't even reach the first 100 iterations (the cache clearing) and it throws a java.lang.OutOfMemoryError:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.lang.Integer.valueOf(Integer.java:832)
    at org.payloc.benchmark.Benchmark_1.queryDatabase(Benchmark_1.java:34)
    at org.payloc.benchmark.Benchmark_1.getEntry(Benchmark_1.java:38)
    at org.payloc.benchmark.DataBundle.<init>(Benchmark_1.java:11)
    at org.payloc.benchmark.Benchmark_1.main(Benchmark_1.java:26)
Mar 22, 2018 1:06:41 PM sun.rmi.transport.tcp.TCPTransport$AcceptLoop executeAcceptLoop
WARNING: RMI TCP Accept-0: accept loop for 
ServerSocket[addr=0.0.0.0/0.0.0.0,localport=31158] throws
java.lang.OutOfMemoryError: Java heap space
    at java.net.NetworkInterface.getAll(Native Method)
    at java.net.NetworkInterface.getNetworkInterfaces(NetworkInterface.java:343)
    at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:86)
    at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:400)
    at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:372)
    at java.lang.Thread.run(Thread.java:748)

*** java.lang.instrument ASSERTION FAILED ***: "!errorOutstanding" with message can't create byte arrau at JPLISAgent.c line: 813

The visualvm graph clearly shows how the memory is retained despite the gc performing some activity in the background, ergo: holding a single Entry keeps the entire Map in the heap.

Benchmark_1: Heap Memory

Benchmark_1: GC Activity

Benchmark 2: HashMap and storing Map.Entry - PASS

Very same program, but instead of a TreeMap I'm using a HashMap.

The gc is able to collect the local Map instances despite the stored Map.Entry being kept into memory, (If you tried in fact to print the cache result after the benchmark you will see actual values).

visualvm graphs:

Benchmark_2: Heap Memory Benchmark_2: GC Activity

The application does not throw any memory error.

Benchmark 3: TreeMap and storing only key and value (no Map.Entry) - PASS

Still using a TreeMap, but this time, instead of the Map.Entry, I will be storing the key and the value data directly.

class DataBundle3 {
    String key;
    Integer value;

    public DataBundle3(int i) {
        Map.Entry<String, Integer> e = Benchmark_3.getEntry(i);
        this.key = e.getKey();
        this.value = e.getValue();
    }
}

Safe approach, as the application reaches the end correctly and the gc periodically cleans the maps.

Benchmark_3: Heap Memory Benchmark_3: GC Activity

Benchmark 4: TreeMap and cache of SoftReference (storing Map.Entry) - PASS

Not the best solution, but since many caching systems use java.lang.ref.SoftReference I'll post a benchmark with it.

So, still using a TreeMap, still storing the Map.Entry into DataBundle, but using a List of SoftReference<DataBundle>.

So the cache becomes:

static final List<SoftReference<DataBundle>> CACHE = new ArrayList<>();

and I save the objects via:

CACHE.add(new SoftReference<>(dataBundle));

The application completes correctly, and the gc is free to collect the maps anytime it needs to. This happens because SoftReference doesn't retain its referent (in our case the Map.Entry) from being collected.

Benchmark_4: Heap Memory Benchmark_4: GC Activity


Hope this will be useful for someone.

Upvotes: 2

OldCurmudgeon
OldCurmudgeon

Reputation: 65821

The contract for Map.Entry does not make any commitments in that area so you should not make any assumptions either.

... These Map.Entry objects are valid only for the duration of the iteration; ...

For this reason, if you wish to store Key-Value pairs derived from a Map.Entry then you should take copies.

Upvotes: 1

Related Questions