Reputation: 11307
I'm just playing with the Snap framework and wanted to see how it performs against other frameworks (under completely artificial circumstances).
What I have found is that my Snap application tops out at about 1500 requests/second (the app is simply snap init; snap build; ./dist/app/app, i.e. no code changes to the default app created by snap):
$ ab -n 20000 -c 500 http://127.0.0.1:8000/
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 127.0.0.1 (be patient)
Completed 2000 requests
Completed 4000 requests
Completed 6000 requests
Completed 8000 requests
Completed 10000 requests
Completed 12000 requests
Completed 14000 requests
Completed 16000 requests
Completed 18000 requests
Completed 20000 requests
Finished 20000 requests
Server Software: Snap/0.9.5.1
Server Hostname: 127.0.0.1
Server Port: 8000
Document Path: /
Document Length: 721 bytes
Concurrency Level: 500
Time taken for tests: 12.845 seconds
Complete requests: 20000
Failed requests: 0
Total transferred: 17140000 bytes
HTML transferred: 14420000 bytes
Requests per second: 1557.00 [#/sec] (mean)
Time per request: 321.131 [ms] (mean)
Time per request: 0.642 [ms] (mean, across all concurrent requests)
Transfer rate: 1303.07 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 44 287.6 0 3010
Processing: 6 274 153.6 317 1802
Waiting: 5 274 153.6 317 1802
Total: 20 318 346.2 317 3511
Percentage of the requests served within a certain time (ms)
50% 317
66% 325
75% 334
80% 341
90% 352
95% 372
98% 1252
99% 2770
100% 3511 (longest request)
I then fired up a Grails application, and it seems like Tomcat (once the JVM warms up) can take a bit more load:
$ ab -n 20000 -c 500 http://127.0.0.1:8080/test-0.1/book
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 127.0.0.1 (be patient)
Completed 2000 requests
Completed 4000 requests
Completed 6000 requests
Completed 8000 requests
Completed 10000 requests
Completed 12000 requests
Completed 14000 requests
Completed 16000 requests
Completed 18000 requests
Completed 20000 requests
Finished 20000 requests
Server Software: Apache-Coyote/1.1
Server Hostname: 127.0.0.1
Server Port: 8080
Document Path: /test-0.1/book
Document Length: 722 bytes
Concurrency Level: 500
Time taken for tests: 4.366 seconds
Complete requests: 20000
Failed requests: 0
Total transferred: 18700000 bytes
HTML transferred: 14440000 bytes
Requests per second: 4581.15 [#/sec] (mean)
Time per request: 109.143 [ms] (mean)
Time per request: 0.218 [ms] (mean, across all concurrent requests)
Transfer rate: 4182.99 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 67 347.4 0 3010
Processing: 1 30 31.4 21 374
Waiting: 0 26 24.4 20 346
Total: 1 97 352.5 21 3325
Percentage of the requests served within a certain time (ms)
50% 21
66% 28
75% 35
80% 42
90% 84
95% 230
98% 1043
99% 1258
100% 3325 (longest request)
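For a quick side-by-side of the two runs, the headline figures can be re-derived from the raw counts that ab prints. This is only a minimal sketch, assuming awk is available; rps and ratio are helper names made up for this illustration, and the results can differ from ab's own output in the last digit because ab works with a more precise elapsed time than it displays.

```shell
#!/bin/sh
# Requests per second = completed requests / elapsed seconds.
rps() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.1f\n", n / t }'
}

# How many times faster the first throughput is than the second.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

rps 20000 12.845        # Snap run: 20000 requests in 12.845 s
rps 20000 4.366         # Grails/Tomcat run: 20000 requests in 4.366 s
ratio 4581.15 1557.00   # Tomcat throughput relative to Snap
```

On these numbers Tomcat comes out a little under three times faster than the default Snap app.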
I'm guessing that part of this could be the fact that Tomcat seems to reserve a lot of RAM and can keep/cache some methods. During this experiment Tomcat was using in excess of 700 MB of RAM while Snap barely approached 70 MB.
Questions I have:
Further experiments:
Then, as suggested by mightybyte, I started experimenting with the +RTS -A4M -N4 options. The app was able to serve just over 2000 requests per second (about a 25% increase).
I also removed the nested templating and served a document (same size as before) from the top-level tpl file. This increased the performance to just over 7000 requests a second. The memory usage went up to about 700 MB.
Upvotes: 4
Views: 359
Reputation: 7282
The answer by jkeuhlen makes good observations relevant to your first question. As to your second question, there are definitely things you can play with to tune performance. If you look at Snap's old raw result data, you can see that we were running the application with +RTS -A4M -N4. The -N4 option tells the GHC runtime to use 4 threads. (Note that you have to build the application with -threaded to do this.) The -A4M option sets the size of the garbage collector's allocation area. Our experiments showed that these two seemed to have the biggest impact on performance. But that was done a long time ago and GHC has changed a lot since then, so you probably want to play around with them and find what works best for you. This page has in-depth information about other command line options available to control GHC's runtime if you wish to do more experimentation.
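If you want to try several combinations of these two options, something like the following can generate the command lines. It is only a sketch under assumptions: the binary path is the one from the question, rts_flags is a helper name invented here, and the values in the loops are arbitrary starting points. It prints the commands rather than running them, so you can start the server with each one and rerun ab yourself. As far as I know, recent GHC versions also want the binary linked with -rtsopts before they accept options such as -A on the command line.

```shell
#!/bin/sh
# Assumption: binary path taken from the question.
APP=./dist/app/app

# Build the RTS argument string.
# $1 = allocation area in megabytes (-A), $2 = number of threads (-N).
rts_flags() {
    printf '+RTS -A%sM -N%s -RTS' "$1" "$2"
}

# Print one candidate command line per combination (dry run only).
for a in 1 4 16; do
    for n in 1 2 4; do
        echo "$APP $(rts_flags "$a" "$n")"
    done
done
```

The closing -RTS marks the end of the runtime options, so anything after it goes to the application itself.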
A little work was done last year on updating the benchmarks. If you're interested in that, look around the different branches in the snap-benchmarks repository. It would be great to get more help on a new set of benchmarks.
Upvotes: 2
Reputation: 4517
I'm by no means an expert on the subject so I can only really answer your first question, and yes you are comparing apples and oranges (and also bananas without realizing it).
First off, it looks like you are attempting to benchmark different things, so naturally, your results will be inconsistent. One of these is the sample Snap application and the other is just "a Grails application". What exactly is each of these things doing? Are you serving pages? Handling requests? The difference in applications will explain the differences in performance.
Secondly, the difference in RAM usage also shows the difference in what these applications are doing. Haskell web frameworks are very good at handling large instances without much RAM, whereas other frameworks, like Tomcat as you saw, will be limited in their performance with limited RAM. Try limiting both applications to 100 MB and see what happens to your performance difference.
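One way to set up that comparison, as a rough sketch: the JVM's -Xmx flag caps the Java heap, and GHC's runtime has -M for a maximum heap size. The binary path is the one from the question and the variable names are only for illustration. Note that both flags limit the heap, not the total memory of the process, and the GHC binary may need to be linked with -rtsopts for -M to be accepted.

```shell
#!/bin/sh
# Heap cap in megabytes for both runtimes.
HEAP_MB=100

# Tomcat's startup scripts pass CATALINA_OPTS to the JVM; -Xmx caps the Java heap.
CATALINA_OPTS="-Xmx${HEAP_MB}m"
export CATALINA_OPTS

# GHC RTS: -M<size> sets the maximum heap size (binary path is an assumption).
SNAP_CMD="./dist/app/app +RTS -M${HEAP_MB}m -RTS"

# Show what would be used; start Tomcat and the Snap app with these settings.
echo "CATALINA_OPTS=$CATALINA_OPTS"
echo "$SNAP_CMD"
```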
If you want to compare the different frameworks, you really need to run a standard application to do that. Snap did this with a Pong benchmark. The results of an old test (from 2011 and Snap 0.3) can be seen here. This paragraph is extremely relevant to your situation:
If you’re comparing this with our previous results you will notice that we left out Grails. We discovered that our previous results for Grails may have been too low because the JVM had not been given time to warm up. The problem is that after the JVM warms up for some reason httperf isn’t able to get any samples from which to generate a replies/sec measurement, so it outputs 0.0 replies/sec. There are also 1000 connreset errors, so we decided the Grails numbers were not reliable enough to use.
As a comparison, the Yesod blog has a Pong benchmark from around the same time that shows similar results. You can find that here. They also link to their benchmark code, which is available on GitHub, if you would like to try to run a more similar benchmark.
Upvotes: 4