Claude

Reputation: 9980

Clojure: why is aget so slow?

As I understand it, Clojure vectors carry a slight performance penalty compared to Java arrays. So I assumed the "conventional wisdom" was that for the performance-critical parts of your code you are better off using Java arrays.

My tests however suggest that this is not true:

Clojure 1.3.0
user=> (def x (vec (range 100000)))
#'user/x
user=> (def xa (int-array x))
#'user/xa
user=> (time  (loop [i 0 s 0] (if (< i 100000) (recur (inc i) (+ s (nth x i))) s)))
"Elapsed time: 16.551 msecs"
4999950000
user=> (time  (loop [i 0 s 0] (if (< i 100000) (recur (inc i) (+ s (aget xa i))) s)))
"Elapsed time: 1271.804 msecs"
4999950000

As you can see, the aget loop takes about 77 times as long as the nth loop (1271.8 ms versus 16.6 ms). Both are still much slower than plain Java, though:

public class Test {
    public static void main (String[] args) {
        int[] x = new int[100000];
        for (int i=0;i<100000;i++) {
            x[i]=i;
        }
        long s=0;
        long end, start = System.nanoTime();
        for (int i=0;i<100000;i++) {
            s+= x[i];
        }
        end = System.nanoTime();
        System.out.println((end-start)/1000000.0+" ms");
        System.out.println(s);
    }
}

> java Test
1.884 ms
4999950000

So, should I conclude that aget is about 80 times slower than nth, and roughly 700 times slower than []-access in Java?

Upvotes: 4

Views: 2258

Answers (3)

NielsK

Reputation: 6956

It seems no type hints are needed at all; Clojure optimizes nicely out of the box.

When a variadic function needs to be applied to all elements of a collection, just use apply with the function. When you need a function applied to each element with the result carried in an accumulator, use reduce. In this case, both work.

=> (def xa (into-array (range 100000)))
#'user/xa

=> (time (apply + xa))
"Elapsed time: 12.264753 msecs"
4999950000

=> (time (reduce + xa))
"Elapsed time: 2.735339 msecs"
4999950000

An even simpler version, using the range directly, evens out these differences, though it is slightly slower than the best case above:

=> (def xa (range 100000))
#'user/xa

=> (time (apply + xa))
"Elapsed time: 4.547634 msecs"
4999950000

=> (time (reduce + xa))
"Elapsed time: 4.506572 msecs"

Just try to write the simplest code possible, and optimize only if that's not fast enough.
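One detail worth checking when comparing these timings with the int-array from the question: into-array infers its element type from the first item, so on a range it builds an array of boxed Long objects rather than a primitive array. The sketch below only inspects the array classes on a small input; the names boxed and prims are made up for illustration.

```clojure
;; Hypothetical names (boxed, prims), used only for this illustration.
(def boxed (into-array (range 5)))   ; element type inferred from the first item
(def prims (long-array (range 5)))   ; explicitly a primitive long[]

;; JVM class names: "[L...;" is an array of objects, "[J" is long[].
(def boxed-class-name (.getName (class boxed)))
(def prims-class-name (.getName (class prims)))

;; reduce and apply work on both kinds of array, since both are seqable.
(def boxed-sum (reduce + boxed))
(def prims-sum (apply + prims))
```

So the reduce and apply timings above measure a walk over boxed values, which is a different thing from indexed aget access on a primitive array.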

Upvotes: 4

sw1nn

Reputation: 7328

I suspect this is down to reflection and autoboxing of the primitive types by the aget function.

Luckily aget/aset have performant overloads for primitive arrays that avoid the reflection and just do a direct array[i] access (See here and here).

You just need to pass a type hint to pick up the right function.

(type xa)
[I    ; indicates array of primitive ints

; with type hint on array
;
(time (loop [i 0 s 0]
        (if (< i 100000)
          (recur (inc i) (+ s (aget ^ints xa i)))
          s)))
"Elapsed time: 6.79 msecs"
4999950000

; without type hinting
;
(time (loop [i 0 s 0]
        (if (< i 100000)
          (recur (inc i) (+ s (aget xa i)))
          s)))
"Elapsed time: 1135.097 msecs"
4999950000
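The hint does not have to sit at the call site. As a minimal sketch (the function names sum-ints and sum-ints-areduce are made up here, not part of the original answer), the ^ints hint can go on a function parameter, and the same sum can be written with areduce:

```clojure
;; Hypothetical helper names; not from the original post.
(defn sum-ints
  "Sums a primitive int array with an explicit loop. The ^ints hint on the
  parameter lets the compiler resolve aget and alength without reflection."
  [^ints arr]
  (let [n (alength arr)]
    (loop [i 0 s 0]
      (if (< i n)
        (recur (inc i) (+ s (aget arr i)))
        s))))

(defn sum-ints-areduce
  "The same sum written with areduce, which expands to an indexed loop."
  [^ints arr]
  (areduce arr i acc 0 (+ acc (aget arr i))))

(def xa (int-array (range 100000)))

(sum-ints xa)
(sum-ints-areduce xa)
```

Both calls should return the same total as the loops above. Hinting the parameter once keeps every aget inside the function body on the primitive path.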

Upvotes: 9

Arthur Ulfeldt

Reputation: 91554

It looks like reflection is swamping whatever your test is trying to measure:

user> (set! *warn-on-reflection* true)
true
user> (def x (vec (range 100000)))
#'user/x
user>  (def xa (int-array x))
#'user/xa
user>  (time  (loop [i 0 s 0] (if (< i 100000) (recur (inc i) (+ s (nth x i))) s)))
NO_SOURCE_FILE:1 recur arg for primitive local: s is not matching primitive, had: Object, needed: long
Auto-boxing loop arg: s
"Elapsed time: 12.11893 msecs"
4999950000
user> (time  (loop [i 0 s 0] (if (< i 100000) (recur (inc i) (+ s (aget xa i))) s)))
Reflection warning, NO_SOURCE_FILE:1 - call to aget can't be resolved.
NO_SOURCE_FILE:1 recur arg for primitive local: s is not matching primitive, had: Object, needed: long
Auto-boxing loop arg: s
Reflection warning, NO_SOURCE_FILE:1 - call to aget can't be resolved.
"Elapsed time: 2689.865468 msecs"
4999950000
user> 

The first loop only auto-boxes its accumulator; the second also resolves the aget call by reflection on every iteration, which is where the extra time goes.
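The auto-boxing warning on the nth loop can be addressed separately from the reflection warning. A sketch, assuming a hypothetical function name sum-vec: nth hands back a boxed value, so coercing each element with long keeps the accumulator s a primitive long across recur.

```clojure
;; Hypothetical helper name; the loop mirrors the one from the question.
(defn sum-vec
  "Sums a vector of integers while keeping the accumulator primitive."
  [v]
  (let [n (count v)]
    (loop [i 0 s 0]
      (if (< i n)
        ;; (long ...) unboxes the element, so (+ s ...) stays a primitive long
        (recur (inc i) (+ s (long (nth v i))))
        s))))

(def x (vec (range 100000)))

(sum-vec x)
```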

When running this kind of benchmark, be sure to run it many times so that the HotSpot compiler gets warmed up:

user> (time  (loop [i 0 s 0] (if (< i 100000) (recur (inc i) (+ s (aget xa i))) (long s))))
"Elapsed time: 3135.651399 msecs"
4999950000
user> (time  (loop [i 0 s 0] (if (< i 100000) (recur (inc i) (+ (long s) (aget xa i))) (long s))))
"Elapsed time: 1014.218461 msecs"
4999950000
user> (time  (loop [i 0 s 0] (if (< i 100000) (recur (inc i) (+ (long s) (aget xa i))) (long s))))
"Elapsed time: 998.280869 msecs"
4999950000
user> (time  (loop [i 0 s 0] (if (< i 100000) (recur (inc i) (+ (long s) (aget xa i))) (long s))))
"Elapsed time: 970.17736 msecs"
4999950000

In this case a few runs brought it down to about 1/3 of the original time (though reflection is still the main problem here).

If I warm them both up with dotimes, the nth loop improves a lot, while the reflective aget loop stays close to a second:

(dotimes [_ 1000]  (time  (loop [i 0 s 0] (if (< i 100000) (recur (inc i) (+ s (nth x  i))) s))))
"Elapsed time: 3.704714 msecs"

(dotimes [_ 1000] (time  (loop [i 0 s 0] (if (< i 100000) (recur (inc i) (+ (long s) (aget xa i))) (long s)))))
"Elapsed time: 936.03987 msecs"
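The warm-up idea can be packaged into a small helper instead of reading a thousand lines of time output. This is only a rough sketch: bench-ms and sum-hinted are hypothetical names, and the helper makes no attempt at the statistical care a real benchmarking library would apply.

```clojure
;; Hypothetical helper: not part of clojure.core or of the original answer.
(defn bench-ms
  "Calls f `warmup` times untimed, then returns the average wall-clock
  milliseconds per call over `runs` timed calls."
  [f warmup runs]
  (dotimes [_ warmup] (f))
  (let [start (System/nanoTime)]
    (dotimes [_ runs] (f))
    (/ (- (System/nanoTime) start) 1e6 runs)))

;; Type-hinted sum over a primitive int array, used as the workload.
(defn sum-hinted [^ints arr]
  (areduce arr i acc 0 (+ acc (aget arr i))))

(def xa (int-array (range 100000)))

(bench-ms #(sum-hinted xa) 100 100)
```

The returned number varies from machine to machine and run to run, so treat it as a rough indication only.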

Upvotes: 4
