Ingulit

Reputation: 1141

Java: Fastest way to make a local copy of an initial ArrayList?

My code requires me to create a large (301x301x19 items) ArrayList with some initial values (some are 0, some are 1, etc.) every time I call a function. The starting values are always the same and need to be loaded into the array on every call so that the function has its own copy of these initial values to mess with.

Originally I was recalculating the array every time the function was called, but that proved to be laughably slow; instead, I now calculate the initial array only once and make a local copy of it every time the function is called (so that I can change the values without changing the initial array's values).

However, copying the array is still prohibitively slow (well over three-quarters of the computation time is spent just copying this array). I have tried the following:

// oldList is an ArrayList<Byte>
ArrayList<Byte> newList = new ArrayList<Byte>(oldList);

// oldList is an ArrayList<Byte>
ArrayList<Byte> newList = new ArrayList<Byte>();
newList.addAll(oldList);

// oldList is a Byte[]
ArrayList<Byte> newList = new ArrayList<Byte>(Arrays.asList(oldList));

All of these methods are simply too slow for my application; is there any faster technique to do this or am I out of luck?

Upvotes: 1

Views: 1216

Answers (1)

Chris K

Reputation: 11917

In summary:

  1. Aim to design out the need to copy so many large data structures (a hard problem, I know)
  2. Avoid pointer chasing; use arrays rather than ArrayLists. If your objects contain other objects, try to replace them with primitives. The ultimate here is to reduce everything to an array of primitives, such as a byte array
  3. Compact your data structures: use arrays and smaller types; the goal is to gain the same benefit while copying fewer actual bytes
  4. Use System.arraycopy
  5. If you still want to go faster still, then take memory layout and responsibility away from the JVM and use sun.misc.Unsafe directly (otherwise known as 'running with scissors')
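Points 2-4 can be sketched as follows. This is a hypothetical layout, not code from the question: the 301x301x19 grid is flattened into a single byte[] (dimensions taken from the question), so each call gets its private copy via one contiguous bulk copy instead of cloning roughly 1.7 million boxed Byte objects:

```java
public class GridCopy {
    static final int X = 301, Y = 301, Z = 19;

    // The initial values, computed once up front (point 1).
    static final byte[] INITIAL = buildInitial();

    static byte[] buildInitial() {
        byte[] grid = new byte[X * Y * Z];   // one flat primitive array (points 2 and 3)
        // ... fill in the starting values here; cell (x, y, z) lives at index(x, y, z)
        return grid;
    }

    // Maps a 3D coordinate onto the flat array.
    static int index(int x, int y, int z) {
        return (x * Y + y) * Z + z;
    }

    // Each function call takes its own copy via one native bulk copy (point 4).
    static byte[] freshCopy() {
        byte[] copy = new byte[INITIAL.length];
        System.arraycopy(INITIAL, 0, copy, 0, INITIAL.length);
        return copy;
    }
}
```

`INITIAL.clone()` or `Arrays.copyOf(INITIAL, INITIAL.length)` would perform the same bulk copy; the point is the flat primitive layout, not the particular copy call.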

Changing to a more easily copied data structure and using System.arraycopy is going to be about as fast as you can get with the approach that you outlined in your question.

System.arraycopy is implemented as a native call. Most JVM providers will have prepared a native version that makes use of native instructions to accelerate the memory copying.
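For illustration, the JDK's standard copy helpers for primitive arrays all reduce to the same bulk copy, so any of these forms is a reasonable choice (a quick sketch, not a benchmark):

```java
import java.util.Arrays;

public class CopyForms {
    public static void main(String[] args) {
        byte[] src = new byte[301 * 301 * 19];
        src[0] = 42;

        // Explicit native bulk copy.
        byte[] a = new byte[src.length];
        System.arraycopy(src, 0, a, 0, src.length);

        // Arrays.copyOf delegates to System.arraycopy internally.
        byte[] b = Arrays.copyOf(src, src.length);

        // clone() on a primitive array is also a bulk copy.
        byte[] c = src.clone();

        System.out.println(a[0] == 42 && b[0] == 42 && c[0] == 42);
    }
}
```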

Unfortunately copying large regions of memory has unintended side effects within the JVM, mostly around the Garbage Collector.

  1. during a memory copy the thread cannot reach a safepoint, which prevents stop-the-world (STW) GC from starting
  2. a GC that cannot start causes threads that did reach the safepoint to wait longer, creating weird stalls on threads that have nothing to do with this work
  3. a large array may not fit within the TLAB (a thread-local allocation buffer used to accelerate object allocations), meaning that object allocation slows down as it enters special-case code
  4. large objects increase the likelihood of premature tenuring during GC cycles, which increases the frequency of the more costly old gen/full GCs (as opposed to the cheaper young gen GCs)

    • NB: for the above effects to be seen, we must be talking about very high rates of allocation and discarding. Most algorithms that do a few allocations and copies here and there will not see these problems; modern JVMs can even cope with fairly high rates. These problems do not occur until a threshold is exceeded and the plates we had been spinning on poles start to hit the floor.

Upvotes: 4

Related Questions