Reputation: 64358
I'm using MongoDB's awesome capped collections + tailable cursors for message-passing between the different processes in my system. I have many such collections, one per message type, to which documents are written at variable rates and sizes. Per collection, write rates can vary a lot, but it should be easy to derive a typical/conservative upper bound on document sizes and rates from past/ongoing activity.
In addition, I have a periodic job (once an hour) which queries the messages and archives them. It is important that all messages are archived, i.e. no message may be dropped from its collection before the job gets a chance to archive it. (Archived messages are written to files.)
What I would like to do is some kind of size/rate monitoring, which would allow figuring out an upper bound on message sizes and rates, based on which I can decide on a good size for my capped collections.
I plan to set up some monitoring tool, run it for a while to gather information, then analyse it and decide on good sizes for my capped collections. The goal is, of course, to keep them small enough not to take too much memory, but big enough to make dropped messages improbable.
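To make the sizing goal concrete, here is a rough back-of-the-envelope sketch of the calculation I have in mind. The function name, the safety factor, and the sample numbers are all made up for illustration; the real inputs would come from the monitoring data.

```python
import math


def capped_size_bytes(peak_docs_per_sec, max_doc_bytes,
                      archive_interval_sec, safety_factor=2.0):
    """Estimate the bytes needed to hold everything written between two archive runs.

    All inputs are assumed upper bounds taken from monitoring; safety_factor
    (hypothetical default of 2) leaves headroom for bursts or a late archive job.
    """
    if peak_docs_per_sec < 0 or max_doc_bytes < 0 or archive_interval_sec < 0:
        raise ValueError("rates, sizes and intervals must be non-negative")
    if safety_factor < 1:
        raise ValueError("safety_factor below 1 would undersize the collection")
    worst_case = peak_docs_per_sec * max_doc_bytes * archive_interval_sec
    return math.ceil(worst_case * safety_factor)


# Hypothetical numbers: 50 docs/s at peak, 2 KiB per document, hourly archive job.
size = capped_size_bytes(50, 2048, 3600)
print(size, "bytes, roughly", round(size / 2**20), "MiB")
```

This ignores per-document storage overhead, so the result is a lower bound on a sensible size rather than an exact figure.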
This is the information which I think can help:
What is the best way to find this information, and is there any other stat which seems relevant?
Tips about how to integrate that with Graphite/Carbon would also be great!
Upvotes: 1
Views: 187
Reputation: 64358
Failing to find an out-of-the-box solution, and getting no answers here, here's what I ended up doing.
I set up a process which:
Then I let it run for a week, and checked its state. This way, I managed to get a very good picture of the activity during the week.
For 1., I used projection to keep it as lightweight as possible, only retrieving the ID, and extracting the timestamp from it.
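The timestamp extraction can be sketched roughly as follows. This is not my exact code: it assumes the default ObjectId `_id`, whose first 4 bytes are the creation time in seconds since the Unix epoch, and the helper names are made up. With PyMongo, the IDs would come from a projected query such as `find({}, {"_id": 1})`, and `ObjectId.generation_time` exposes the same value directly.

```python
from collections import Counter
from datetime import datetime, timezone


def objectid_timestamp(oid_hex):
    """Return the creation time embedded in a 24-character hex ObjectId string."""
    if len(oid_hex) != 24:
        raise ValueError("expected a 24-character hex ObjectId")
    # The first 4 bytes (8 hex digits) are big-endian seconds since the Unix epoch.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def docs_per_minute(oid_hexes):
    """Count documents per minute of creation (hypothetical helper)."""
    counts = Counter()
    for oid_hex in oid_hexes:
        minute = objectid_timestamp(oid_hex).replace(second=0, microsecond=0)
        counts[minute] += 1
    return counts


# Made-up IDs; in practice these would be str(doc["_id"]) from the projected query.
sample_ids = [
    "5f5e10000000000000000000",
    "5f5e10010000000000000000",
    "5f5e10200000000000000000",
]
for minute, count in sorted(docs_per_minute(sample_ids).items()):
    print(minute.isoformat(), count)
```

Note that this only works when the IDs are generated ObjectIds; custom `_id` values carry no timestamp.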
The data collected in 3. was used to figure out if the collections are bound by the size-limit or the number-of-documents limit.
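One way to make that size-versus-count decision is sketched below. It is an illustration only: it assumes a stats document shaped like the output of the `collStats` command, with `count`, `size`, `max` and `maxSize` fields (field names may differ between server versions), and the function name and tolerance are made up.

```python
def binding_limit(stats, tolerance=0.05):
    """Guess which capped-collection limit is currently forcing old documents out.

    `stats` is assumed to be a collStats-like dict (an assumption, see above).
    Returns "documents", "size" or "neither".
    """
    count = stats["count"]
    size = stats["size"]
    max_docs = stats.get("max")
    max_size = stats.get("maxSize")
    # A document-count cap is only meaningful when it is set to a positive number.
    if max_docs is not None and max_docs > 0 and count >= max_docs:
        return "documents"
    # Data size rarely hits the cap exactly, so allow a small tolerance below it.
    if max_size is not None and max_size > 0 and size >= max_size * (1 - tolerance):
        return "size"
    return "neither"


# Made-up stats for a collection capped at 1 MiB and 10000 documents.
print(binding_limit({"count": 4200, "size": 1040000, "max": 10000, "maxSize": 1048576}))
```

A collection that reports "neither" over the whole monitoring period has never wrapped, so nothing was dropped from it.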
Upvotes: 0
Reputation: 1909
Set up the StatsD-Graphite stack and begin by sending metrics to it.
The information that you want to send can be sent by any language that can send a message over UDP.
There are client libraries for all common languages (PHP, Python, Ruby, C++, Java, etc.), so that shouldn't be a problem.
Once the technical plumbing is in place, you can focus on the other things you'd like to measure.
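To show how little is involved, here is a minimal sketch that sends a metric over UDP without any client library. It assumes a StatsD daemon listening on its usual default of port 8125 on localhost; the metric name is made up.

```python
import socket


def statsd_line(name, value, metric_type="g"):
    """Build a StatsD datagram: "<name>:<value>|<type>" ("g" = gauge, "c" = counter)."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(name, value, metric_type="g", host="127.0.0.1", port=8125):
    """Fire-and-forget send of one metric; UDP gives no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(statsd_line(name, value, metric_type), (host, port))


# Hypothetical metric: current document count of one capped collection.
send_metric("mongo.capped.events.count", 4200)
```

Because UDP is connectionless, the sender is not slowed down or broken if the StatsD daemon is unavailable; the metric is simply lost.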
Upvotes: 1