Jimmy Kane

Reputation: 16825

Dereference Models from many to many relationship

Given the schema in the test data generation example below, I am looking for a good way to:

Dereference every Favourite whose reference key points to a Picture that has been deleted. In other words, delete any Favourite that links to a deleted Picture.

Why this question? First, I hope it is not out of scope here; second, this really can happen; and third, it is interesting.

How? Let's say that a person can have thousands of favourites, something like Likes on social networks or, to make it worse, orders, accounts or invalid data in a scientific application. In our example, for some reason (and these reasons do happen), a person ends up with a lot of dead favourite links, or I simply know that dead favourites exist.

What would be a good way to do this while reducing ndb.get() operations and without iterating through every Favourite?

Let's not complicate things. Let's assume that only one user suffers from dead favourites. He is a Person entity with a stubbed user_id property of '123'.

The example below provides handlers, with their corresponding functions, that you can use.

import time
import sys
import logging
import random
import cgi
import webapp2

from google.appengine.ext import ndb


class Person(ndb.Expando):
    pass

class Picture(ndb.Expando):
    pass

class Favourite(ndb.Expando):
    user_id = ndb.StringProperty(required=True)
    #picture = ndb.KeyProperty(kind=Picture, required=True)
    pass

class GenerateDataHandler(webapp2.RequestHandler):

    def get(self):
        try:
            number_of_models = abs(int(cgi.escape(self.request.get('n'))))
        except ValueError:
            # A missing or non-numeric ?n= falls back to the default.
            number_of_models = 10
            logging.info("GET ?n=parameter not defined. Using default.")
        user_id = '123' #stub
        person = Person.query().filter(ndb.GenericProperty('user_id') == user_id).get()
        if not person:
            person  = Person()
            person.user_id = user_id #Stub
            person.put()
            logging.info("Created Person instance")
        if not self._gen_data(person, number_of_models):
            return
        self.response.write("Data generated successfully")

    def _gen_data(self, person, number_of_models):
        first, last = Picture.allocate_ids(number_of_models)
        picture_keys = [ndb.Key(Picture, picture_id)
                        for picture_id in range(first, last + 1)]
        pictures = []
        favourites = []
        for picture_key in picture_keys:
            pictures.append(Picture(key=picture_key))
            # Favourite is an Expando, so `picture` is stored as a dynamic property.
            favourites.append(Favourite(parent=person.key,
                                        user_id=person.user_id,
                                        picture=picture_key))
        # Write all pictures and favourites in a single batch.
        ndb.put_multi(pictures + favourites)
        return True

class CorruptDataHandler(webapp2.RequestHandler):

    def get(self):
        if not self._corrupt_data(0.5):#50% corruption
            return
        self.response.write("Data corruption completed successfully")

    def _corrupt_data(self, n):
        picture_keys = Picture.query().fetch(99999, keys_only=True)
        random_picture_keys = random.sample(picture_keys, int(float(len(picture_keys))*n))
        ndb.delete_multi(random_picture_keys)
        return True

class FixDataHandler(webapp2.RequestHandler):

    def get(self):
        user_id = '123' #stub
        person = Person.query().filter(ndb.GenericProperty('user_id') == user_id).get()
        self._dereference(person)

    def _dereference(self, person):
        # This is where you implement your answer.
        pass

The handlers are separate because of eventual consistency in the NDB Datastore. More info: GAE put_multi() entities using backend NDB

Of course I am posting an answer as well to show that I tried something before posting this.

Upvotes: 0

Views: 377

Answers (2)

dragonx

Reputation: 15143

A ReferenceProperty is just a key, so if you have the key of the deleted Picture, you can use that to query the Favourite.

Otherwise, there's no easy way. You'll have to filter through all Favourites and find ones that have an invalid Picture. It's very simple in a mapreduce job, but could be an expensive query if you have a lot of Favourites.
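
The scan described above could be sketched roughly as follows. This is only an illustration under assumptions: `find_dead_favourite_keys` and its `get_multi` parameter are hypothetical names that do not appear in the question, and the batch lookup is passed in as an argument so that the logic can be tried without App Engine. On App Engine one would presumably pass `ndb.get_multi`, which returns None for keys that no longer exist.

```python
def find_dead_favourite_keys(favourites, get_multi):
    """Return the keys of favourites whose picture no longer exists.

    favourites: iterable of objects with `.key` and `.picture` attributes.
    get_multi:  callable that takes a list of keys and returns a list of
                entities in the same order, with None for missing keys
                (assumed to behave like ndb.get_multi).
    """
    favourites = list(favourites)
    # Deduplicate so that each picture key is looked up at most once.
    picture_keys = list({fav.picture for fav in favourites})
    # One batch lookup instead of one get() per favourite.
    pictures = get_multi(picture_keys)
    missing = {key for key, picture in zip(picture_keys, pictures)
               if picture is None}
    # Preserve the original order of the favourites.
    return [fav.key for fav in favourites if fav.picture in missing]
```

Inside `_dereference` this might be used as `ndb.delete_multi(find_dead_favourite_keys(Favourite.query(ancestor=person.key).fetch(), ndb.get_multi))`. That still reads every Favourite of the person, but it needs one query, one batch get and one batch delete rather than one RPC per Favourite.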

Upvotes: 1

nizz

Reputation: 1133

You could use a pre-delete hook (look here for a way to implement it). Of course, this would be easier if you used the NDB API instead of the Datastore API (hooks on NDB), but then you would have to change the way you make the references.
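
A rough sketch of what such a hook could do, assuming the NDB models from the question. The helper `delete_favourites_of` and its two callable parameters are hypothetical names introduced here for illustration; the datastore calls are injected so the cleanup logic can be exercised without App Engine, and the NDB wiring is shown only as a comment and has not been run.

```python
def delete_favourites_of(picture_key, find_favourite_keys, delete_multi):
    """Delete every favourite that references picture_key.

    find_favourite_keys: callable mapping a picture key to the list of
                         favourite keys that point at it.
    delete_multi:        callable that deletes a list of keys
                         (assumed to behave like ndb.delete_multi).
    Returns the number of favourites that were deleted.
    """
    favourite_keys = list(find_favourite_keys(picture_key))
    if favourite_keys:
        delete_multi(favourite_keys)
    return len(favourite_keys)


# Hypothetical wiring on App Engine (an assumption, not executed here):
#
#   class Picture(ndb.Expando):
#       @classmethod
#       def _pre_delete_hook(cls, key):
#           delete_favourites_of(
#               key,
#               lambda k: Favourite.query(
#                   ndb.GenericProperty('picture') == k
#               ).fetch(keys_only=True),
#               ndb.delete_multi)
```

With this in place the Favourites would be cleaned up at the moment a Picture is deleted, so dead links should not accumulate in the first place. Note that the query in the comment is not an ancestor query, so it is subject to eventual consistency and may miss Favourites written very recently.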

Upvotes: 1
