Reputation: 595
I want to create user using ndb such as below:
def create_user(self, google_id, ....):
user_keys = UserInformation.query(UserInformation.google_id == google_id ).fetch(keys_only=True)
if user_keys: # check whether user exist.
# already created
...(SNIP)...
else:
# create new user entity.
UserInformation(
# primary key is incompletekey
google_id = google_id,
facebook_id = None,
twitter_id = None,
name =
...(SNIP)...
).put()
If this function is called twice in the sametime, two user is created.("Isolation" is not ensure between get() and put())
So, I added @ndb.transactional to above function. But following error is occured.
BadRequestError: Only ancestor queries are allowed inside transactions.
How to ensure isolation with non-ancestor query?
Upvotes: 2
Views: 96
Reputation: 55724
For your use case, perhaps you could use ndb models' get_or_insert
method, which according to the API docs:
Transactionally retrieves an existing entity or creates a new one.
So you can do:
user = UserInformation.get_or_insert(*args, **kwargs)
without risking the creation of a new user.
The complete docs:
classmethod get_or_insert(*args, **kwds)source Transactionally retrieves an existing entity or creates a new one.
Positional Args: name: Key name to retrieve or create.
Keyword Arguments
namespace – Optional namespace. app – Optional app ID.
parent – Parent entity key, if any.
context_options – ContextOptions object (not keyword args!) or None.
**kwds – Keyword arguments to pass to the constructor of the model class if an instance for the specified key name does not already exist. If an instance with the supplied key_name and parent already exists, these arguments will be discarded. Returns Existing instance of Model class with the specified key name and parent or a new one that has just been created.
Upvotes: 0
Reputation: 39824
The ndb library doesn't allow non-ancestor queries inside transactions. So if you make create_user()
transactional you get the above error because you call UserInformation.query()
inside it (without an ancestor).
If you really want to do that you'd have to place all your UserInformation
entities inside the same entity group by specifying a common ancestor and make your query an ancestor one. But that has performance implications, see Ancestor relation in datastore.
Otherwise, even if you split the function in 2, one non-transactional making the query followed by a transactional one just creating the user - which would avoid the error - you'll still be facing the datastore eventual consistency, which is actually the root cause of your problem: the result of the query may not immediately return a recently added entity because it takes some time for the index corresponding to the query to be updated. Which leads to room for creating duplicate entities for the same user. See Balancing Strong and Eventual Consistency with Google Cloud Datastore.
One possible approach would be to check later/periodically if there are duplicates and remove them (eventually merging the info inside into a single entity). And/or mark the user creation as "in progress", record the newly created entity's key and keep querying until the key appears in the query result, when you finally mark the entity creation as "done" (you might not have time to do that inside the same request).
Another approach would be (if possible) to determine an algorithm to obtain a (unique) key based on the user information and just check if an entity with such key exists instead of making a query. Key lookups are strongly consistent and can be done inside transactions, so that would solve your duplicates problem. For example you could use the google_id
as the key ID. Just an example, as that's not ideal either: you may have users without a google_id
, users may want to change their google_id
without loosing other info, etc. Maybe also track the user creation in progress in the session info to prevent repeated attempts to create the same user in the same session (but that won't help with attempts from different sessions).
Upvotes: 2