Ashish

Reputation: 8529

How to ensure that a cursor does not return duplicate documents in MongoDB?

In MongoDB, a read operation on a collection returns a cursor.

Suppose the read operation accesses most of the documents in the collection and may interleave with other update operations.

In that case, is it possible for the cursor to return duplicate documents?

How can I make sure that the cursor avoids duplicates?

Upvotes: 3

Views: 2313

Answers (1)

Sammaye

Reputation: 43884

The distinct method will not be of much help here. It cannot solve this problem, and on top of that it runs at a fraction of the speed of a normal cursor.

You asked about the case where the read operation accesses most of the documents in the collection and may interleave with other update operations.

Yes, duplicates are possible: if an update moves a document to a position later in the cursor's sort order, the cursor can read it again.

Whether this is a problem depends on what you sort by. If you are sorting by something that won't be updated, for example _id, then you don't really need to worry. If you are sorting by something that will be updated and could shift, then yes, you will have a problem.

One method of solving this is to read the cursor in batches of, say, 1000 documents (into an array or something similar) and note the last _id in each batch. For the next batch, run a range query that takes everything with an _id greater than that one.
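As a rough illustration of that batching idea, here is a minimal sketch in Python. It assumes `collection` behaves like a pymongo collection (it only uses `find`, `sort` and `limit`), that _id values are never modified, and that any extra `query` filter does not itself constrain _id. The function name `iter_by_id` and its parameters are made up for this example.

```python
def iter_by_id(collection, batch_size=1000, query=None):
    """Yield each matching document once, walking the collection in _id order.

    `collection` is assumed to be a pymongo-style collection object.
    """
    last_id = None
    while True:
        # Copy the caller's filter so it is not mutated between batches.
        criteria = dict(query or {})
        if last_id is not None:
            # Range query: only documents after the last _id already seen.
            criteria["_id"] = {"$gt": last_id}

        # Pull one batch fully into memory, sorted by the stable _id field.
        batch = list(collection.find(criteria).sort("_id", 1).limit(batch_size))
        if not batch:
            return

        for doc in batch:
            yield doc

        # Remember where this batch ended for the next range query.
        last_id = batch[-1]["_id"]
```

Because every batch is a fresh query bounded by the last _id, a document that gets updated in the meantime keeps its position in the _id order and is not handed back a second time. Documents inserted during the scan are only seen if their _id sorts after the current position.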

Another method could be to use snapshot queries: http://docs.mongodb.org/manual/reference/operator/snapshot/. However, this feature has quite a few limitations; for example, it cannot be used with sharded collections.

Upvotes: 4
