A. Wilson

Reputation: 8840

Unexpected behavior around Google Cloud Datastore nested transactions

The business problem I have is this: a Parent entity has a Child entity descendant. The child entity has a value that needs to be unique, so a ChildLookup entity exists to enforce that uniqueness. To abstract some things away, the entity puts/deletes have been moved into their own methods, and both have batch/transaction statements as part of their logic.

In Python (using this library), when the structure is like this, everything is fine:

import google.cloud.datastore
from google.cloud.datastore import Entity

# assuming ('Parent', 11), ('Parent', 11, 'Child', 'foo') (with value 'bar'), and ('ChildLookup-foo', 'bar') all exist

def put_parent(c):
 p2 = c.get(c.key('Parent', 22))
 if p2 is None:
  p2 = Entity(c.key('Parent', 22))
 c.put(p2)

def put_child(c):
 with c.transaction():
  # p2 is local to put_parent, so the child key is built explicitly here
  e2 = Entity(c.key('Parent', 22, 'Child', 'foo'))
  e2['__val'] = 'bar'
  el2 = c.get(c.key('ChildLookup-foo', e2['__val']))
  if el2 is not None:
   raise ValueError('insert would create duplicate')
  el2 = Entity(c.key('ChildLookup-foo', e2['__val']))
  c.put(el2)
  c.put(e2)

c = google.cloud.datastore.Client()

with c.transaction():
 put_parent(c)
 put_child(c)

Running this results in the correct behavior: the exception is raised, and neither p2 nor e2 is inserted. However, I can change put_parent to look like this:

def put_parent(c):
 with c.transaction():  # only actual change. can also be c.batch()
  p2 = c.get(c.key('Parent', 22))
  if p2 is None:
   p2 = Entity(c.key('Parent', 22))
  c.put(p2)

When I do it this way, p2 is inserted, despite the second transaction rolling back. This is unexpected to me: I'd expect either the rollback to be limited to the innermost transaction (or batch), or the rollback to affect all child transactions of the outermost transaction (or batch).

Of course, in the trivial toy example above, I could just take out the inner batches and manage it from the top level. But the point of putting them into methods is that I might occasionally want to call them individually without the same guarantees from the method that calls both of them, and I would like the business of their transactionality requirements to be unimportant to the consumer of those methods. Is there a design pattern, or some Python Google Cloud Datastore library functionality, that would let me do what I'm trying to do?

EDIT:

The code in the accepted answer is the basis for the below, which I include for the curious. It ultimately produced the behavior I wanted.

from contextlib import contextmanager

@contextmanager
def use_existing_or_new_transaction(client):
    if client.current_transaction:
        yield client.current_transaction
    else:
        with client.transaction() as xact:
            yield xact


@contextmanager
def use_existing_or_new_batch(client):
    if client.current_batch:
        yield client.current_batch
    else:
        with client.batch() as xact:
            yield xact

It's then used like this:

with use_existing_or_new_transaction(c) as xact:
    xact.put(something)
    xact.delete(something_else)
    # etc
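As a rough check of how the transaction helper behaves, here is a sketch run against an in-memory stand-in rather than the real client. `FakeClient` and `FakeTransaction` are made up for illustration; they only assume that a transaction commits on a clean exit of its `with` block and discards its writes on an exception, and that `current_transaction` is `None` when nothing is active.

```python
from contextlib import contextmanager


class FakeTransaction:
    """Hypothetical stand-in: buffers writes until the with-block exits."""

    def __init__(self, client):
        self.client = client
        self.pending = []

    def put(self, name):
        self.pending.append(name)

    def __enter__(self):
        self.client.stack.append(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        self.client.stack.pop()
        if exc_type is None:
            self.client.stored.extend(self.pending)  # commit
        return False  # on error: drop pending writes, re-raise


class FakeClient:
    """Hypothetical stand-in for the datastore client."""

    def __init__(self):
        self.stack = []
        self.stored = []

    @property
    def current_transaction(self):
        return self.stack[-1] if self.stack else None

    def transaction(self):
        return FakeTransaction(self)


@contextmanager
def use_existing_or_new_transaction(client):
    if client.current_transaction:
        yield client.current_transaction
    else:
        with client.transaction() as xact:
            yield xact


def put_parent(c):
    with use_existing_or_new_transaction(c) as xact:
        xact.put("p2")


def put_child(c):
    with use_existing_or_new_transaction(c) as xact:
        xact.put("el2")
        raise ValueError("insert would create duplicate")


c = FakeClient()
try:
    with c.transaction():
        put_parent(c)   # joins the outer transaction
        put_child(c)    # fails, so the outer transaction never commits
except ValueError:
    pass
stored_after_failure = list(c.stored)

put_parent(c)           # called on its own: opens and commits its own transaction
stored_after_standalone = list(c.stored)
```

In this model nothing is stored after the failed combined call, while calling put_parent by itself still commits p2, which is the behavior I was after.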

Upvotes: 0

Views: 192

Answers (1)

Siva

Reputation: 1148

Have you tried c.current_transaction?

https://googleapis.dev/python/datastore/latest/client.html

The idea is, you use the

with c.transaction()

outside all of your calls and within each call, just get the current transaction and use that to do the ops. I think you shouldn't use 'with' within the functions as that will automatically commit/rollback at the end.
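To illustrate why I think the inner 'with' blocks are the problem, here is a toy in-memory model. `FakeClient` and `FakeTransaction` are made-up stand-ins, not the real library classes; the only assumption is that every `with c.transaction()` block gets its own transaction, which is committed when the block exits cleanly and discarded when it exits with an exception.

```python
class FakeTransaction:
    """Hypothetical stand-in: buffers writes until the with-block exits."""

    def __init__(self, client):
        self.client = client
        self.pending = []

    def put(self, name):
        self.pending.append(name)

    def __enter__(self):
        self.client.stack.append(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        self.client.stack.pop()
        if exc_type is None:
            # Commit: only this transaction's own writes are stored.
            self.client.stored.extend(self.pending)
        # On an exception the pending writes are dropped (rollback).
        return False  # never swallow the exception


class FakeClient:
    """Hypothetical stand-in for the datastore client."""

    def __init__(self):
        self.stack = []
        self.stored = []

    def transaction(self):
        return FakeTransaction(self)

    def put(self, name):
        # Writes go to whichever transaction is innermost right now.
        self.stack[-1].put(name)


def put_parent(c):
    with c.transaction():   # inner transaction, commits on exit
        c.put("p2")


def put_child(c):
    with c.transaction():   # second inner transaction, rolled back
        c.put("el2")
        raise ValueError("insert would create duplicate")


c = FakeClient()
try:
    with c.transaction():   # outer transaction never receives any writes
        put_parent(c)
        put_child(c)
except ValueError:
    pass

print(c.stored)
```

In this model p2 is committed by its own inner transaction before put_child ever fails, so the later rollback has nothing to undo for it. That matches what you are seeing.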

So, it would be something like the following.

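import google.cloud.datastore
from google.cloud.datastore import Entity

def put_parent(c):
 txn = c.current_transaction
 # reads go through the client; the transaction is passed explicitly
 p2 = c.get(c.key('Parent', 22), transaction=txn)
 if p2 is None:
  p2 = Entity(c.key('Parent', 22))
 txn.put(p2)

def put_child(c):
 txn = c.current_transaction
 # p2 is local to put_parent, so the child key is built explicitly here
 e2 = Entity(c.key('Parent', 22, 'Child', 'foo'))
 e2['__val'] = 'bar'
 el2 = c.get(c.key('ChildLookup-foo', e2['__val']), transaction=txn)
 if el2 is not None:
  raise ValueError('insert would create duplicate')
 el2 = Entity(c.key('ChildLookup-foo', e2['__val']))
 txn.put(el2)
 txn.put(e2)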

c = google.cloud.datastore.Client()

with c.transaction():
 put_parent(c)
 put_child(c)
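One caveat: as far as I can tell, c.current_transaction is None when no transaction is active, so the functions above would fail with an AttributeError if you call them on their own. A small guard could at least make that failure explicit. This is only a sketch; the name `require_transaction` is made up, and the `SimpleNamespace` objects are stand-ins that expose just the one attribute the helper reads.

```python
from types import SimpleNamespace


def require_transaction(client):
    """Return the active transaction, or fail with a clear message."""
    txn = client.current_transaction
    if txn is None:
        raise RuntimeError("call this inside `with client.transaction():`")
    return txn


# Stand-in objects, not real datastore clients.
active = SimpleNamespace(current_transaction="txn-1")
idle = SimpleNamespace(current_transaction=None)

result = require_transaction(active)

try:
    require_transaction(idle)
    error = None
except RuntimeError as exc:
    error = str(exc)
```

Inside put_parent and put_child you would then write txn = require_transaction(c) instead of reading c.current_transaction directly.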

Upvotes: 2
