Reputation: 1277
I need to count the number of requests arriving at my HTTP server, aggregated by hour and by requested resource. For example, this is the kind of output I want to obtain if there are 3 unique resources:
Resource /a - 10 req between 10pm - 11pm, 13 req between 11pm - 12am
Resource /b - 14 req between 10pm - 11pm, 17 req between 11pm - 12am
Resource /c - 12 req between 10pm - 11pm, 16 req between 11pm - 12am
There is no requirement for real-time reporting; a delay of a couple of hours is acceptable. I know I can achieve this by log parsing, but I wanted to know if there is a better way to store this kind of data. Let's say a real-time counter in Redis where the key is built from url + hour, dumped periodically (say every 2 hours) to some other DB.
Upvotes: 0
Views: 1610
Reputation:
I am a Redis fan, but I wouldn't use Redis for something like this. I would instead use a message queue such as RabbitMQ or, even better, Kafka. Just dump each request there and have a separate process pick it up and process it.
There is no reason to add latency (even if it is only 1ms) to request serving in order to update counters or do anything else that has to wait for a response.
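A minimal sketch of that hand-off, using Python's in-process `queue.Queue` purely as a stand-in for a real broker such as RabbitMQ or Kafka. The names `events`, `record_request`, `consume`, and `run_demo` are made up for illustration; with a real broker the consumer would be a separate process.

```python
import queue
import threading
from collections import Counter
from datetime import datetime

# Stand-in for a broker queue/topic (assumption: a real setup would
# publish to RabbitMQ or Kafka instead of an in-process queue).
events = queue.Queue()

def record_request(path, when):
    """Called while serving a request: enqueue and return immediately."""
    events.put((path, when))

def consume(counts):
    """Runs outside the request path: aggregate into (path, hour) buckets."""
    while True:
        item = events.get()
        if item is None:  # sentinel value: stop the worker
            break
        path, when = item
        counts[(path, when.strftime("%Y-%m-%d-%H"))] += 1

def run_demo():
    counts = Counter()
    worker = threading.Thread(target=consume, args=(counts,))
    worker.start()
    record_request("/a", datetime(2016, 4, 27, 22, 5))
    record_request("/a", datetime(2016, 4, 27, 22, 40))
    record_request("/b", datetime(2016, 4, 27, 23, 1))
    events.put(None)
    worker.join()
    return counts
```

The request handler only pays for the enqueue; the counting happens wherever the consumer runs.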
Upvotes: 0
Reputation: 5689
I assume you have Servlets in your application. In a high-level filter, apply logic like this:
hincrby(date+action, hour, 1);
date -> current date
hour -> current hour
action -> the action you want to save
If you want action a's count for the whole date, do hgetall on date+action and sum the values. For a specific hour range, pick only those hour fields from the returned map in your application logic and sum them; that's your result.
This way only one Redis hit happens per request, which takes about 1ms. We have been using Redis for real-time analytics in this way.
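Roughly, the scheme above could look like this in Python. This is only a sketch: `FakeRedis` is an in-memory stand-in so the example runs without a server, and `count_request` and `total_for` are hypothetical helper names. A redis-py client exposes `hincrby` and `hgetall` with the same shape, but it returns bytes unless it is created with `decode_responses=True`, which is assumed here.

```python
from datetime import datetime

class FakeRedis:
    """In-memory stand-in exposing only the two hash commands used below."""
    def __init__(self):
        self.hashes = {}

    def hincrby(self, name, key, amount=1):
        fields = self.hashes.setdefault(name, {})
        fields[key] = fields.get(key, 0) + amount
        return fields[key]

    def hgetall(self, name):
        return dict(self.hashes.get(name, {}))

def count_request(client, action, when):
    """One HINCRBY per request: hash = date + action, field = hour."""
    key = when.strftime("%Y-%m-%d") + ":" + action
    client.hincrby(key, when.strftime("%H"), 1)

def total_for(client, action, day, hours=None):
    """Sum the whole day, or only the given hour fields (e.g. {"22", "23"})."""
    fields = client.hgetall(day + ":" + action)
    return sum(int(v) for h, v in fields.items()
               if hours is None or h in hours)
```

The ":" separator in the key is my own choice; any unambiguous way of joining date and action works.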
Upvotes: 0
Reputation: 3694
One way to store it in Redis is to use a hash, with a date-time bucket as the hash key:
hash key: "2016-04-27-10-11"
"2016-04-27-10-11" :{
"md5-request-uri-1" : "count of request",
"md5-request-uri-2" : "count of request"
}
The Redis command to use is HINCRBY (hash increment by):
HINCRBY 2016-04-27-10-11 md5-request-uri-1 1
http://redis.io/commands/HINCRBY
Now you can have an hourly cron job that parses the logs for the past hour and stores the counts in Redis in the format above.
To get all the resources with their counts, use HGETALL. To get the count of a particular resource, use HGET.
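As a sketch of what the hourly job might do once it has parsed (timestamp, URI) pairs out of the log: the log format itself is left out, since it depends on your server. All function names here are hypothetical, and `client` is assumed to be anything with a redis-py style `hincrby(name, key, amount)` method.

```python
import hashlib
from collections import defaultdict
from datetime import datetime, timedelta

def bucket_key(when):
    """Build the hash key, e.g. 2016-04-27-10-11 for the 10:00-11:00 hour."""
    end = when + timedelta(hours=1)
    return when.strftime("%Y-%m-%d-%H") + "-" + end.strftime("%H")

def uri_field(uri):
    """Hash field name: md5 hex digest of the request URI."""
    return hashlib.md5(uri.encode("utf-8")).hexdigest()

def aggregate(records):
    """Count requests per hour bucket and per URI, in memory."""
    totals = defaultdict(lambda: defaultdict(int))
    for when, uri in records:
        totals[bucket_key(when)][uri_field(uri)] += 1
    return totals

def flush(client, totals):
    """Issue one HINCRBY per distinct (bucket, URI) pair."""
    for key, fields in totals.items():
        for field, count in fields.items():
            client.hincrby(key, field, count)
```

Because the md5 digest cannot be reversed, you would also need to keep a digest-to-URI mapping somewhere if the report has to show the original paths.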
Upvotes: 2
Reputation: 3491
Log parsing or an analytics system like Google Analytics (hosted) or Piwik (self-hosted) is your best option. Don't try to track views inside your code, because if you ever add a full-page cache in front of it, your code won't run on every request to track the hits.
Upvotes: 1