Reputation: 285
After reading about the GAE Datastore API, I am still unsure if I need to duplicate key names and parents as properties for an entity.
Let's say there are two kinds of entities: Employee and Division. Each employee has a division as its parent, and is identified by an account name. I use the account name as the key name for employees. But when modeling Employee, I would still keep these two as properties:
division = db.ReferenceProperty(Division)
account_name = db.StringProperty()
Obviously I have to manually keep division
consistent with its parent, and account_name
with its key name. The reasons I am doing this extra work are:
But this is really unnecessary work, wasting time and storage space. I cannot get rid of the SQL-thinking that - why doesn't Google just let us define a property to be the key? and another to be the parent? Then we could name them and use as normal properties...
What's the best practice here?
Upvotes: 2
Views: 1006
Reputation: 1525
Keep in mind that in the GAE Datastore you can never change the parent or key_name of an entity once it has been created. These values are permanent for the life of the entity.
If there is even a small chance that the account_name of an Employee could change then you can not use it as a key_name. If it never changes then it could be a very good key_name and will allow you to do cheap gets for Employees using Employee.get_by_key_name() instead of expensive queries.
Parent is not meant to be equivalent to a foreign key. A better equivalent to a foreign key is a reference property.
The main reason you use parent is so that the parent and child entities are in the same entity group which allows you to operate on them both in a single transaction. If you just need a reference to the division from the Employee then just use a reference property. I suggest getting familiar with how entity groups work as this is very important on GAE data modeling:
Using parent can also cause write performance issues as there is a limit to how quickly you can write to a single entity group (approximately one write per second). When deciding whether to use parent or a reference property you need to think about which entities need to be modified in the same transaction. In many cases you can use Cross Group (XG) transactions instead. It is all about which trade-offs you want to make.
So my suggestions are:
Upvotes: 5
Reputation: 3570
When you create a new Employee object with a Divison as a parent, it would go something like:
div = Division()
... #Complete the division properties
div.put()
emp = Employee(key_name=<account_name>, parent=div)
... #Complete the employee properties
emp.put()
Then, when you want to get a reference to the Division an Employee is part of:
div = emp.parent()
#Get the Employee account_name (which is the employees's key name):
account_name = emp.key().name()
You don't have to store a RefrenceProperty to the Division an Employee is part of since it's already done in the parent. Additionally, you can get the account_name from the Employee entity's key as needed.
To query on the key:
emp = Employee.get_by_key_name(<account_name>, parent=<division>)
#OR
div = Division.get_by_key_name(<keyname>)
#Get all employees in a division
emps = Employee.all().ancestor(div)
Upvotes: 3