learn in Java

Reputation: 13

Why does the JVM flag -XX:+EliminateAllocations fail?

OnStackTest.java

public class OnStackTest {

    public static void alloc() {
        User u = new User();
        u.id = 5;
        u.name = "test";
    }
    public static void main(String[] args) throws InterruptedException {
        long b = System.currentTimeMillis();
        for (int i = 0; i < 100000000; i++) {
            Thread.sleep(50);
            alloc();
        }
        long e = System.currentTimeMillis();
        System.out.println(e - b);
    }
}

User.java

public class User {
    public int id = 0;

    public String name = "";

    public User() {
    }
    public User(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

JVM flags: -server -Xmx10m -Xms10m -XX:+DoEscapeAnalysis -XX:+PrintGC -XX:-UseTLAB -XX:+EliminateAllocations

I inspected the heap with jmap -histo and found that User objects keep being created on the heap. In theory, shouldn't the User object be replaced with scalars and never be allocated on the heap?

Upvotes: 1

Views: 158

Answers (1)

apangin

Reputation: 98294

The DoEscapeAnalysis and EliminateAllocations flags are enabled by default, so there is no need to set them explicitly.

The EliminateAllocations flag is specific to the C2 compiler; it is declared in c2_globals.hpp. But in your test the method is not compiled by C2 for a long time. Add the -XX:+PrintCompilation flag to check:

    ...
   1045   84       3       java.lang.StringBuffer::<init> (6 bytes)
   1045   85  s    3       java.lang.StringBuffer::toString (36 bytes)
   1045   86       3       java.util.Arrays::copyOf (19 bytes)
  15666   87     n 0       java.lang.Thread::sleep (native)   (static)
  15714   88       3       OnStackTest::alloc (20 bytes)
 311503   89       4       OnStackTest::alloc (20 bytes)
 311505   88       3       OnStackTest::alloc (20 bytes)   made not entrant

This shows that alloc is compiled by C1 (tier 3) after about 15 seconds. A method needs to be called several thousand times before it is considered for recompilation by C2. Given the 50 ms delay between iterations, this does not happen soon enough. In my experiment, alloc was compiled by C2 (tier 4) only after about 5 minutes of running.
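To see the effect of call count without waiting minutes, here is a minimal sketch that calls alloc in a tight loop (no sleep) and reports how many bytes the current thread allocated per round. The names AllocWarmup and allocatedBytesDuring are hypothetical, the nested User class is a stand-in for the one in the question, and com.sun.management.ThreadMXBean is a HotSpot-specific API, so this assumes a HotSpot JVM. The exact numbers depend on the JVM version and flags; the expectation is only that later rounds allocate far less than the first ones once C2 has compiled the code.

```java
import java.lang.management.ManagementFactory;

public class AllocWarmup {
    // Stand-in for the question's User class, nested to keep the sketch self-contained.
    static class User {
        int id;
        String name;
    }

    // Same shape as OnStackTest::alloc: the object never escapes the method.
    static void alloc() {
        User u = new User();
        u.id = 5;
        u.name = "test";
    }

    // Returns the bytes allocated by the current thread while calling alloc() `calls` times.
    // Assumption: running on HotSpot, where the thread MX bean implements the
    // com.sun.management extension.
    static long allocatedBytesDuring(int calls) {
        com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long threadId = Thread.currentThread().getId();
        long before = bean.getThreadAllocatedBytes(threadId);
        for (int i = 0; i < calls; i++) {
            alloc();
        }
        long after = bean.getThreadAllocatedBytes(threadId);
        return after - before;
    }

    public static void main(String[] args) {
        // Several rounds, so early (interpreted or C1) and later (C2) behaviour can be compared.
        for (int round = 0; round < 10; round++) {
            long bytes = allocatedBytesDuring(1_000_000);
            System.out.println("round " + round + ": " + bytes + " bytes allocated");
        }
    }
}
```

Running it with -XX:+PrintCompilation shows when the tier 4 compilation happens, and running it with -XX:-EliminateAllocations gives a baseline for comparison.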

The C2-compiled method no longer contains any allocation. I verified this with -XX:CompileCommand="print,OnStackTest::alloc":

  # {method} {0x0000000012de2bc0} 'alloc' '()V' in 'OnStackTest'
  #           [sp+0x20]  (sp of caller)
  0x0000000003359fc0: sub     rsp,18h
  0x0000000003359fc7: mov     qword ptr [rsp+10h],rbp  ;*synchronization entry
                                                ; - OnStackTest::alloc@-1 (line 4)

  0x0000000003359fcc: add     rsp,10h
  0x0000000003359fd0: pop     rbp
  0x0000000003359fd1: test    dword ptr [0df0000h],eax
                                                ;   {poll_return}
  0x0000000003359fd7: ret

BTW, I suggest using JMH for this kind of test. Otherwise it is too easy to fall into one of the common benchmarking pitfalls. Here is a similar question that also tries to measure the effect of allocation elimination, but does it wrong.

Upvotes: 3
