Reputation: 1709
While considering alternatives for Java for a distributed/concurrent/failover/scalable backend environment I discovered Erlang. I've spent some time on books and articles where nearly all of them (even Java addicted guys) says that Erlang is a better choice in such environments, as many useful things are out of the box in a less error prone way.
I was sure that Erlang is faster in most cases mainly because of a different garbage collection strategy (per process), absence of shared state (b/w threads and processes) and more compact data types. But I was very surprised when I found comparisons of Erlang vs Java math samples where Erlang is slower by several orders, e.g. from x10 to x100.
Even on concurrent tasks, both on several cores and a single one.
What's the reasons for that? These answers came to mind:
If that's because these are very specific math algorithms, can anybody show more real/practice performance tests?
UPDATE: I've got the answers so far summarizing that Erlang is not the right tool for such specific "fast Java case", but the thing that is unclear to me - what's the main reason for such Erlang inefficiency here: dynamic typing, GC or poor native compiling?
Upvotes: 34
Views: 26025
Reputation: 6040
As pointed in other answers - Erlang is designed to solve effectively real life problems, which are bit opposite to benchmark problems.
But I'd like to enlighten one more aspect - pithiness of erlang code (in some cases means rapidness of development), which could be easily concluded, after comparing benchmarks implementations.
For example, k-nucleotide benchmark:
Erlang version: http://benchmarksgame.alioth.debian.org/u64q/program.php?test=knucleotide&lang=hipe&id=3
Java version: http://benchmarksgame.alioth.debian.org/u64q/program.php?test=knucleotide&lang=java&id=3
If you want more real-life benchmarks, I'd suggest you Comparing C++ And Erlang For Motorola Telecoms Software
Upvotes: 8
Reputation: 15699
Erlang was not built for math. It was built with communication, parallel processing and scalability in mind, so testing it for math tasks is a bit like testing if your jackhammer gives you refreshing massage experience.
That said, let's offtop a little:
If you want Erlang-style programming in JVM, take a look at Scala Actors or Akka framework or Vert.x.
Upvotes: 34
Reputation: 1813
I took interest to this as some of the benchmarks are a perfect fit for erlang, such as gene sequencing. So on http://benchmarksgame.alioth.debian.org/ the first thing I did was look at reverse-complement implementations, for both C and Erlang, as well as the testing details. I found that the test is biased because it does not discount the time it takes erlang to start the VM /w the schedulers, natively compiled C is started much faster. The way those benchmarks measure is basically:
time erl -noshell -s revcomp5 main < revcomp-input.txt
Now the benchmark says Java took 1.4 seconds and erlang /w HiPE took 11. Running the (Single threaded) Erlang code took me 0.15 seconds, and if you discount the time it took to start the vm, the actual workload took only 3000 microseconds (0.003 seconds).
So I have no idea how that is benchmarked, if its done 100 times then it makes no sense as the cost of starting the erlang VM will be x100. If the input is a lot longer than given, it would make sense, but I see no details on the webpage of that. To make the benchmarks more fair for Managed languages, have the code (Erlang/Java) send a Unix signal to the python (that is doing the benchmarking) that it hit the startup function.
Now benchmark aside, the erlang VM essentially just executes machine code at the end, as well as the Java VM. So there is no way a math operation would take longer in Erlang than in Java.
What Erlang is bad at is data that needs to mutate often. Such as a chained block cypher. Say you have the chars "0123456789", now your encryption xors the first 2 chars by 7, then xors the next two chars by the result of the first two added, then xors the previous 2 chars by the results of the current 2 subtracted, then xors the next 4 chars.. etc
Because objects in Erlang are immutable this means that the entire char array needs to be copied each time you mutate it. That is why erlang has support for things called NIFS which is C code you can call into to solve this exact problem. In fact all the encryption (ssl,aes,blowfish..) and compression (zlib,..) that ship with Erlang are implemented in C, also there is near 0 cost associated with calling C from Erlang.
So using Erlang you get the best of both worlds, you get the speed of C with the parallelism of Erlang.
If I were to implement the reverse-complement in the FASTEST way possible, I would write the mutating code using C but the parallel code using Erlang. Assuming infinite input, I would have Erlang split on the >
<<Line/binary, ">", Rest/binary>> = read_stream
Dispatch the block to the first available scheduler via round robin, consisting of infinite EC2 private networked hidden nodes, being added in real time to the cluster every millisecond.
Those nodes then call out to C via NIFS for processing (C was the fastest implementation for reverse-compliment on alioth website), then send the output back to the node master to send out to the inputer.
To implement all this in Erlang I would have to write code as if I was writing a single threaded program, it would take me under a day to create this code.
To implement this in Java, I would have to write the single threaded code, I would have to take the performance hit of calling from Managed to Unmanaged (as we will be using the C implementation for the grunt work obviously), then rewrite to support 64 cores. Then rewrite it to support multiple CPUS. Then rewrite it again to support clustering. Then rewrite it again to fix memory issues.
And that is Erlang in a nutshell.
Upvotes: 9
Reputation: 177
The Erlang solution uses ETS, Erlang Term Storage, which is like an in-memory database running in a separate process. Consequent to it being in a separate process, all messages to and from that process must be serialized/deserialized. This would account for a lot of the slowness, I should think. For example, if you look at the "regex-dna" benchmark, Erlang is only slightly slower than Java there, and it doesn't use ETS.
Upvotes: 3
Reputation: 3511
The fact that erlang has to allocate memory for every value whereas in java you will typically reuse variables if you want it to be fast, means it will always be faster for 'tight loop' bench marks.
It would be interesting to benchmark a java version using the -client flag and boxed primitives and compare that to erlang.
I believe using hipe is unfair since it is not an active project. I would be interested to know if any mission critical software is running on this.
Upvotes: 2
Reputation: 20059
I don't know anything about Erlang, but this seems to be a compare apples to oranges approach anyways. You must be aware that considerable effort was spent over more than a decade to improve java preformance to the point where it is today.
Its not surprising (to me) that a language implementation done by volunteers or a small company can not outmatch that effort.
Upvotes: -8
Reputation: 91902
Benchmarks are never good for saying anything else than what they are really testing. If you feel that a benchmark is only testing primitives and a classic threading model, that is what you get knowledge about. You can now with some confidence say that Java is faster than Erlang on mathematics on primitives as well as the classic threading model for those types of problems. You don't know anything about the performance with large number of threads or for more involved problems because the benchmark didn't test that.
If you are doing the types of math that the benchmark tested, go with Java because it is obviously the right tool for that job. If you want to do something heavily scalable with little to no shared state, find a benchmark for that or at least re-evaluate Erlang.
If you really need to do heavy math in Erlang, consider using HiPE (consider it anyway for that matter).
Upvotes: 15