shx2

Reputation: 64358

Monitoring usage of capped collections

I'm using MongoDB's awesome capped collections + tailable cursors for message-passing between the different processes in my system. I have many such collections, one per message type, to which documents are written at variable rates and sizes. Write rates can vary a lot per collection, but it should be easy to derive a typical/conservative upper bound on document sizes and rates from past/ongoing activity.

In addition, I have a periodic job (once an hour) which queries the messages and archives them. It is important that all messages are archived, i.e. must not be dropped before the job gets a chance to archive them. (Archived messages are written to files.)

What I would like to do is some kind of size/rate monitoring, which would allow figuring out an upper bound on message sizes and rates, based on which I can decide on a good size for my capped collections.

I plan to set up some monitoring tool, run it for a while to gather information, then analyse it and decide on good sizes for my capped collections. The goal is, of course, to keep them small enough not to take too much memory, but big enough to make dropped messages improbable.

This is the information which I think can help:

  1. number of messages and total size written in the last hour (average, over time)
  2. how long it takes to complete a "full cycle" (on average, over time)
  3. whether the collection is bound by the max-bytes or the max-documents limit

What is the best way to find this information, and is there any other stat which seems relevant?

Tips about how to integrate that with Graphite/Carbon would also be great!

Upvotes: 1

Views: 187

Answers (2)

shx2

Reputation: 64358

Failing to find an out-of-the-box solution, and getting no answers here, here's what I ended up doing.

I set up a process which:

  1. subscribes to all of the message-passing capped collections in my MongoDB (using a tailable-cursor query), with a thread per collection.
  2. keeps message counters per collection per time unit (the time unit is 10 minutes, i.e. every 10 minutes I start a new counter, while keeping all the old ones in memory)
  3. periodically queries the stats of the capped collections (size, number of documents, and the limits), and also keeps all this data in memory.
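
The counters from step 2 can be sketched roughly like this (standard-library Python only; all names here are mine, and the tailable-cursor consumer threads that would call `record()` are left out):

```python
import threading
import time
from collections import defaultdict

BUCKET_SECONDS = 600  # 10-minute time units


class MessageCounters:
    """Counts messages per (collection, 10-minute bucket), keeping old buckets."""

    def __init__(self):
        self._lock = threading.Lock()  # one consumer thread per collection
        # {collection_name: {bucket_start_ts: count}}
        self._counts = defaultdict(lambda: defaultdict(int))

    @staticmethod
    def bucket(ts):
        """Round a Unix timestamp down to the start of its 10-minute bucket."""
        return int(ts) - int(ts) % BUCKET_SECONDS

    def record(self, collection, ts=None):
        ts = time.time() if ts is None else ts
        with self._lock:
            self._counts[collection][self.bucket(ts)] += 1

    def snapshot(self):
        with self._lock:
            return {c: dict(b) for c, b in self._counts.items()}
```

Each tailable-cursor thread simply calls `counters.record("its_collection", ts)` for every document it sees, and the hourly analysis reads `snapshot()`.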

Then I let it run for a week, and checked its state. This way, I managed to get a very good picture of the activity during the week.

For 1, I used a projection to keep it as lightweight as possible, retrieving only the ID, and extracting the timestamp from it.
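
Extracting the timestamp from the ID works because the first 4 bytes of a BSON ObjectId are a big-endian Unix timestamp. A minimal stdlib-only version (pymongo's `ObjectId.generation_time` gives you the same thing; the sample ObjectId below is the one from MongoDB's docs):

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex):
    """Unix timestamp embedded in an ObjectId: its first 4 bytes, big-endian."""
    return int(oid_hex[:8], 16)


def objectid_datetime(oid_hex):
    return datetime.fromtimestamp(objectid_timestamp(oid_hex), tz=timezone.utc)


print(objectid_datetime("507f1f77bcf86cd799439011"))  # a 2012 timestamp
```

With pymongo, the lightweight query itself would look something like `coll.find({}, {"_id": 1})` with a tailable cursor type.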

The data collected in 3 was used to figure out whether the collections are bound by the size limit or the number-of-documents limit.
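
A sketch of that check, assuming the field names of a recent MongoDB `collStats` result (`size`, `maxSize`, `count`, `max`), which pymongo returns from `db.command("collStats", name)`; the sample values below are made up:

```python
def limiting_factor(stats):
    """Given a collStats-style dict for a capped collection, report which
    limit (bytes or documents) the collection is closer to hitting."""
    size_ratio = stats["size"] / stats["maxSize"]
    # 'max' is 0 (or absent) when no document-count limit was set
    doc_ratio = stats["count"] / stats["max"] if stats.get("max") else 0.0
    return "max-documents" if doc_ratio > size_ratio else "max-bytes"


# made-up sample resembling a collStats result for a capped collection
sample = {"capped": True, "size": 900_000, "maxSize": 1_000_000,
          "count": 4_000, "max": 10_000}
print(limiting_factor(sample))  # here the byte limit binds first
```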

Upvotes: 0

erbdex

Reputation: 1909

  1. Set up the StatsD-Graphite stack and begin by sending metrics to it.

  2. The information you want to record can be sent from any language that can send a message over UDP.

  3. There are language bindings for all common languages (PHP, Python, Ruby, C++, Java, etc.), so that shouldn't be a problem.

  4. Once the plumbing is in place, you can focus on the other things you'd like to measure.
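
Point 2 really is just a one-line UDP datagram in StatsD's text protocol (`name:value|type`). A dependency-free sketch (the host/port and metric names are made up; a real deployment would normally use one of the client libraries mentioned above):

```python
import socket


def statsd_packet(name, value, metric_type="c"):
    """Build a StatsD datagram, e.g. 'my.metric:3|c' (c=counter, g=gauge, ms=timer)."""
    return f"{name}:{value}|{metric_type}".encode()


def send_metric(name, value, metric_type="c", addr=("127.0.0.1", 8125)):
    # UDP is fire-and-forget: this succeeds even if no StatsD daemon is listening
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(statsd_packet(name, value, metric_type), addr)
    finally:
        sock.close()


# e.g. count one archived message, and report a collection's size as a gauge
send_metric("mongo.messages.archived", 1)            # counter
send_metric("mongo.coll.my_coll.size", 900000, "g")  # gauge
```

Graphite/Carbon then aggregates these into the per-hour rates and sizes the question asks about.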

Upvotes: 1
