socrates

Reputation: 347

Ordering a sequence of writes to MongoDB v4.0 / DocumentDB

Problem

I need to establish write consistency for a sequence of queries using updateMany, against a DocumentDB cluster with only a single primary instance. I am not sure which approach to use, between Transactions, ordered BulkWrites, or simply setting a Majority write concern for each updateMany query.

Environment

An AWS DocumentDB cluster, which maps to MongoDB v4.0, accessed via pymongo 3.12.0.

Note: the cluster has a single primary instance, and no other instances. In practice, AWS will have us connect to the cluster in replica set mode. I am not sure whether this means we need to still think about this problem in terms of replica sets.

Description

I have a sequence of documents D, each of which is an array of records. Each record is of the form {field: MyField, from_id: A, to_id: B}.

To process a record, I need to look in my DB for all fields MyField that have value A, and then set that value to B. The actual query I use to do this is updateMany. The code looks something like:

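def doWriteUpdate(record):
  # collection: the pymongo Collection being updated
  query = ... # format the query based on record's information
  update = ... # the change to apply, e.g. set the field to the record's to_id
  collection.update_many(query, update) # pymongo spells updateMany as update_many

# Process every record of every document, in order.
for doc in Documents:
  for record in doc:
    doWriteUpdate(record)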

I need the update operations to happen such that the writes have actually been applied, and are visible, before the next doWriteUpdate query runs.

This is because I expect to encounter a record {field: MyField, from_id: A, to_id: B} followed by a later record (whether in the same document or in a following document) {field: MyField, from_id: B, to_id: C}. Applying the latter record correctly depends on the former having already been committed to the database.
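To make the ordering requirement concrete, here is a minimal in-memory sketch (no database involved). The `apply_record` helper and the flat row layout are assumptions for illustration only; the real query uses array filters against DocumentDB.

```python
def apply_record(rows, record):
    """Rewrite every row whose field equals from_id to to_id, in place."""
    field = record["field"]
    for row in rows:
        if row.get(field) == record["from_id"]:
            row[field] = record["to_id"]


records = [
    {"field": "MyField", "from_id": "A", "to_id": "B"},
    {"field": "MyField", "from_id": "B", "to_id": "C"},
]

# Applied in the intended order, A becomes B and then B becomes C.
in_order = [{"MyField": "A"}]
for record in records:
    apply_record(in_order, record)

# Applied in the wrong order, the B -> C step finds nothing to update.
reversed_order = [{"MyField": "A"}]
for record in reversed(records):
    apply_record(reversed_order, record)

print(in_order)        # [{'MyField': 'C'}]
print(reversed_order)  # [{'MyField': 'B'}]
```

The same divergence would occur if the second update ran before the first one became visible.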

Possible Approaches

Transactions

I have tried wrapping my updateMany operation in a Transaction. If this had worked, I would have called it a day, but I exceed the allowed size: "Total size of all transaction operations must be less than 33554432". Without rewriting the queries, this cannot be worked around, because the updateMany has several layers of array-filtering and digs through a lot of documents. I am not even sure whether transactions are appropriate in this case, because I am not using any replica sets, and they seem to be intended for ACID with regard to replication.

Ordered Bulk Writes

BulkWrite.updateMany would appear to guarantee the execution order of a sequence of writes. So one approach could be to generate the update queries for each record r in a document D, and then send those through (preserving order) as a BulkWrite. While this would seem to "preserve order" of execution, I don't know a) whether preserving execution order also guarantees write consistency (everything executed serially is applied serially), and, more importantly, b) whether the following BulkWrites, for the other documents, will interleave with this one.
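A rough sketch of what the ordered bulk write could look like in pymongo, using `UpdateMany` and `Collection.bulk_write`. The `record_to_update` helper and its flat equality filter are assumptions for illustration; the actual filter with array-filtering would replace it.

```python
def record_to_update(record):
    """Translate one record into a (filter, update) pair.

    Assumption: a flat equality match stands in for the real query.
    """
    field = record["field"]
    return {field: record["from_id"]}, {"$set": {field: record["to_id"]}}


def apply_document(collection, doc):
    """Send all records of one document as a single ordered bulk write."""
    # Imported lazily so record_to_update can be used without pymongo installed.
    from pymongo import UpdateMany

    operations = [UpdateMany(*record_to_update(record)) for record in doc]
    if not operations:
        return None  # bulk_write rejects an empty list of requests
    # ordered=True (the default) executes serially and stops at the first error.
    return collection.bulk_write(operations, ordered=True)
```

Because `bulk_write` blocks until the server responds to an acknowledged write, a single-threaded loop calling `apply_document` once per document would not send the next batch before the current one returns.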

WriteConcern

Pymongo states that writes will block until the desired WriteConcern is satisfied. My session is single-threaded, so this should give the desired behavior. However, the MongoDB documentation says:

For multi-document transactions, you set the write concern at the transaction level, not at the individual operation level. Do not explicitly set the write concern for individual write operations in a transaction.

I am not clear on whether this pertains to "transactions" as in the general sense, or MongoDB Transactions set up through session objects. If it means the latter, then it shouldn't apply to my use case. If the former, then I don't know what other approach to use.
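For reference, a minimal sketch of the write-concern approach outside any transaction: each update is issued only after the previous one has been acknowledged. The `apply_updates` function is hypothetical, and `collection` is assumed to behave like a pymongo `Collection` whose `update_many` returns a result with `acknowledged` and `modified_count`.

```python
def apply_updates(collection, updates):
    """Apply (filter, update) pairs one at a time, in order.

    Returns the modified count of each update. Raises if a write comes
    back unacknowledged, since ordering could then no longer be relied on.
    """
    modified = []
    for query, update in updates:
        # Blocks until the server acknowledges the write under the
        # collection's write concern.
        result = collection.update_many(query, update)
        if not result.acknowledged:
            raise RuntimeError("write was not acknowledged")
        modified.append(result.modified_count)
    return modified
```

With pymongo, a collection that uses a majority write concern could be obtained with something like `db.get_collection("my_collection", write_concern=WriteConcern(w="majority"))`, where `my_collection` is a placeholder name and `WriteConcern` comes from `pymongo.write_concern`.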

Upvotes: 0

Views: 484

Answers (1)

Ilan Toren

Reputation: 60

The proper write concern is "majority", combined with a "linearizable" read concern.

Real Time Order: Combined with "majority" write concern, "linearizable" read concern enables multiple threads to perform reads and writes on a single document as if a single thread performed these operations in real time; that is, the corresponding schedule for these reads and writes is considered linearizable.

Upvotes: 1
