Reputation: 2533
We're running a very time-sensitive web application (response time has to be below 100ms) with a lot of requests (about 200k requests per minute at peak). The architecture is really simple: a load balancer, several web servers running Apache and PHP, and a database running MySQL.
We also need to be able to generate statistics based on those requests.
About a year ago, when we were serving a tenth of our current traffic volume, we developed some bash/python scripts to periodically dump the logs from MySQL, transfer them to another server, import them again and run the statistics there. The idea was to have the production servers do as little as possible so that we could keep response times low.
As you might imagine, that solution didn't scale very well, and at the moment, the stat server is barely keeping up. We need a way to generate statistics in real time.
Do you have any experience with this kind of setup? Our idea at the moment is to have the web servers call the stats servers in real time on each request.
The two main problems are:
Upvotes: 1
Views: 409
Reputation: 308753
Why use a database? Calculate mean and standard deviation in memory on the fly as requests come in. You won't have any latency that way, and you can have access to values using an MBean console.
This can only work on an individual server, not a cluster.
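A minimal sketch of the in-memory approach, written in Python rather than as a Java MBean. It uses Welford's online algorithm, which is one common way to keep a running mean and standard deviation without storing individual requests. The class name `RunningStats` and its methods are hypothetical, chosen only for illustration.

```python
import math


class RunningStats:
    """Running mean and standard deviation (Welford's online algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def add(self, value):
        # Update count, mean and squared-deviation sum in O(1) per request.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    def stddev(self):
        # Sample standard deviation; returns 0.0 until two values are seen.
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))


stats = RunningStats()
for response_ms in (12.0, 15.0, 9.0, 20.0):
    stats.add(response_ms)
print(stats.count, stats.mean, stats.stddev())
```

Each update touches three numbers, so the per-request cost is negligible. As noted above, each process only sees its own requests, so this gives per-server figures.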
Upvotes: 2
Reputation: 456
1) Separate MySQL server: Why don't you connect directly to another MySQL server and write the stats there? Off the top of my head, I would create a table for each day so that older tables can easily be moved out when they are no longer needed. The problem here is the lack of horizontal scalability, though ...
2) NoSQL: Maybe you should use MongoDB or Redis for something like this? They can be much faster for this kind of workload, since they keep the working data in memory, and they offer sharding.
3) Independent stats server: If you are serving HTML, you can add a JavaScript call to a script on a remote server (plus a small img inside `<noscript>` tags for users who have JavaScript disabled). That script writes the statistics from the parameters given in the URL. This would offload everything from the application servers entirely, and you can try suggestion #1 or #2 there ...
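The table-per-day idea from suggestion #1 could be sketched roughly as follows. To keep the example runnable without a server, it uses Python's built-in `sqlite3` as a stand-in for MySQL. The `stats_YYYYMMDD` naming scheme, the column names and the helper functions are all assumptions made for illustration, not something from the setup described in the question.

```python
import sqlite3
from datetime import date


def table_for(day):
    # Hypothetical naming scheme: one table per calendar day.
    return f"stats_{day:%Y%m%d}"


def record(conn, day, path, response_ms):
    # Create the day's table on first use, then append one row.
    # The table name is built from a date object, not from user input.
    table = table_for(day)
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} (path TEXT, response_ms REAL)"
    )
    conn.execute(
        f"INSERT INTO {table} (path, response_ms) VALUES (?, ?)",
        (path, response_ms),
    )


def drop_day(conn, day):
    # Retiring old data is a single cheap DROP instead of a large DELETE.
    conn.execute(f"DROP TABLE IF EXISTS {table_for(day)}")


conn = sqlite3.connect(":memory:")  # stand-in for the separate stats server
record(conn, date(2012, 1, 1), "/index.php", 12.5)
record(conn, date(2012, 1, 2), "/index.php", 8.0)
drop_day(conn, date(2012, 1, 1))
```

The appeal of this layout is that archiving or deleting a day never touches the table currently being written to.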
Upvotes: 2