Reputation: 8614
MongoDB's documentation on tailable cursors states the following:
If your query is on an indexed field, do not use tailable cursors, but instead, use a regular cursor. Keep track of the last value of the indexed field returned by the query. To retrieve the newly added documents, query the collection again using the last value of the indexed field in the query criteria.
I'm setting up a query to find all documents after a specific point in time, and then to keep returning documents as they are inserted. I imagine the easiest way of doing this is to query on _id (provided we're using ObjectIds, which we are) for anything $gt an ObjectId built from the time I want.
Since _id is indexed by default, how bad is it to continually poll MongoDB with the last _id I got and keep asking for things $gt it? I realize this would only be within one-second precision or so, since ObjectIds only store seconds since the epoch, but I can live with that, so I assume I'd be querying at least once per second.
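For concreteness, here is roughly what I have in mind, sketched with pymongo (the database/collection names and the starting date are just placeholders):

```python
import time
from datetime import datetime, timezone

from bson import ObjectId
from pymongo import ASCENDING, MongoClient

client = MongoClient()
coll = client.mydb.events  # hypothetical database/collection names

# ObjectId.from_datetime builds a synthetic ObjectId whose embedded
# timestamp matches the given time, so $gt on it matches everything
# created after that moment (at one-second precision).
last_id = ObjectId.from_datetime(datetime(2015, 1, 1, tzinfo=timezone.utc))

while True:
    # _id is indexed by default, so this range query stays cheap.
    for doc in coll.find({"_id": {"$gt": last_id}}).sort("_id", ASCENDING):
        print(doc)            # stand-in for real processing
        last_id = doc["_id"]  # remember where we got to
    time.sleep(1)             # ObjectId timestamps are second-granular anyway
```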
I guess I'm just surprised that the documentation recommends the approach of querying (presumably continually, in my case) versus keeping a tailable cursor open: I would have thought that push would be cheaper than pull?
Upvotes: 7
Views: 2135
Reputation: 1105
The answers already offered are great and to the point. However, when I first read your question (and perhaps I don't fully understand exactly what you are trying to do), it sounded to me like a problem Redis was built for. It would be a fairly simple matter of having the cache receive the new information, letting your consumers read it, and removing it from the cache when it is no longer needed.
The number of reads/writes, and certainly other operations on the DB, would also remain sane, since you would be polling the cache instead.
Again, maybe I did not understand the problem correctly, but setting up Redis properly and using it seems the way to go in this situation; it sounds like a problem a cache was made for.
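As a rough illustration of what I mean, using redis-py with pub/sub (the channel name and payload format are arbitrary):

```python
import json

import redis

r = redis.Redis()

# Producer side: whoever writes the document also publishes it.
def on_insert(doc):
    r.publish("new-docs", json.dumps(doc))

# Consumer side: block on the channel instead of polling the DB.
pubsub = r.pubsub()
pubsub.subscribe("new-docs")
for message in pubsub.listen():
    if message["type"] == "message":
        doc = json.loads(message["data"])
        print(doc)  # stand-in for real processing
```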
Upvotes: 0
Reputation: 28380
What it sounds like you want is to be notified of new/updated/deleted objects in the DB. That is not possible with MongoDB without a little trickery; I'm guessing you've read about tailing the oplog with tailable cursors, and polling is always an absolute last resort. I've never tried oplog tailing, as it seems limiting (you can't use it in shared DB environments) and unreliable, not to mention difficult to set up (it requires replica sets) and prone to change at any time without warning. For example, the once-popular mongo-watch library is no longer maintained, in favor of better alternatives.
DB "mutation events" are implemented some DB's: Postgres implements triggers and RethinkDB actually pushes changes out to you. If you can switch to something like RethinkDB - that would be ideal.
If not, my best advice is to put a service layer in front of your DB through which all traffic must pass. Client applications connect to that service via sockets (which is trivial using socket.io, which has implementations in nearly every language). Any time the service layer processes an insert, update, or delete, it can emit those events to everybody currently connected.
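A bare-bones sketch of that idea in Python, using python-socketio and pymongo (the event names and collection are placeholders, not a prescribed API):

```python
import socketio
from pymongo import MongoClient

sio = socketio.Server()
app = socketio.WSGIApp(sio)  # run with e.g.: gunicorn -k eventlet module:app
coll = MongoClient().mydb.events  # hypothetical collection

# Every write goes through the service, so it can broadcast each insert.
@sio.event
def insert_doc(sid, data):
    result = coll.insert_one(data)
    data["_id"] = str(result.inserted_id)
    sio.emit("doc_inserted", data)  # push to all connected clients
```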
This approach has its own constraints and caveats, but also clear points in its favor.
Upvotes: 2
Reputation: 16335
If you go with tailable cursors, there are a few caveats to keep in mind, given the fact that they work only for capped collections:
- Updates that change the size of a document will fail (only in-place $set operations will work, and no $push or $pushAll will).
- You cannot remove() documents from a capped collection.
"how bad is it to continually poll MongoDB with the last _id I got and keep asking for things $gt it?"
IMO, polling does introduce latency and an unnecessary busy-wait even when there are no updates, but it leaves a lot under your control.
Performance-wise, there shouldn't be an issue as long as you are querying on an indexed field.
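For completeness, here is roughly what the tailable-cursor alternative looks like with pymongo, if you can live with the capped-collection constraints above (the collection name and size are made up):

```python
import pymongo
from pymongo import MongoClient

db = MongoClient().mydb

# Tailable cursors only work on capped collections, so create one if needed.
if "events" not in db.list_collection_names():
    db.create_collection("events", capped=True, size=10 * 1024 * 1024)

cursor = db.events.find(cursor_type=pymongo.CursorType.TAILABLE_AWAIT)
while cursor.alive:
    for doc in cursor:
        print(doc)  # the server blocks briefly waiting for new inserts
```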
Upvotes: 2
Reputation: 9497
There's a big caveat here that I think you might have overlooked: tailable cursors only work on capped collections. Using a capped collection is probably not a general-purpose solution; it's going to require careful planning to ensure that you size the capped collection appropriately for your data size and growth.
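For reference, the size has to be fixed up front when the collection is created; a quick pymongo sketch (the numbers are arbitrary):

```python
from pymongo import MongoClient

db = MongoClient().mydb

# size is a hard cap in bytes; max optionally caps the document count.
# Once the collection is full, the oldest documents are silently
# overwritten, so these numbers must cover your peak insert rate
# multiplied by however long readers need documents to stick around.
db.create_collection("events", capped=True,
                     size=100 * 1024 * 1024,  # 100 MB
                     max=500000)              # at most 500k documents
```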
Upvotes: 3