user3848394

Reputation: 67

Google Cloud Datastore: real-world example of a data model in Python

I’m new to Google Datastore and Python, but I have a project that uses them. Even with Google's great documentation, I’m missing a real-world example of data modeling. So here is part of my project, a proposed model, and some questions about it. I’m sure you can help me understand Datastore more clearly, and I think these questions can help other beginners like me see how to model data for a good application!

A soccer feed contains some general information about the match itself, such as the name of the competition it belongs to, the pool name, the season, the matchday, and the winning team.

For each team, the winner and the loser, we have the details of the actions that occurred during the match: cards and goals. For cards, we have the color, the period in which the card was given, a player id, the reason, and the time it occurred. For goals, we have the period, a player id, the time, and the id of the assisting player.

For each team we also have the player details: name, position (center, middle…), and date of birth.

Here is the model I would like to use to ingest data from the soccer feed into the Datastore using Python:

I have these entities: Team, Player, Match, TeamData, Card and Goal. For each match there will be two TeamData entities, one per team, plus the details of the actions (cards and goals). I used a KeyProperty between TeamData and Match and between Card/Goal and TeamData, but I think I could use a parent relationship instead. I don’t know which is best.

from google.appengine.ext import ndb

class Team(ndb.Model):
    name = ndb.StringProperty()

class Player(ndb.Model):
    teamKey = ndb.KeyProperty(kind=Team)
    name = ndb.StringProperty()
    date_of_birth = ndb.DateProperty()
    position = ndb.StringProperty()

class Match(ndb.Model):
    name_compet = ndb.StringProperty()
    round = ndb.StringProperty()
    season = ndb.StringProperty()      # e.g. "2014-2015" (type assumed)
    matchday = ndb.IntegerProperty()   # type assumed
    team1Key = ndb.KeyProperty(kind=Team)
    team2Key = ndb.KeyProperty(kind=Team)
    winning_teamKey = ndb.KeyProperty(kind=Team)

class TeamData(ndb.Model):
    # ndb has no ReferenceProperty; a KeyProperty plays that role
    match = ndb.KeyProperty(kind=Match)
    score = ndb.IntegerProperty()      # type assumed
    side = ndb.StringProperty(choices=['away', 'home'])
    teamKey = ndb.KeyProperty(kind=Team)

class Card(ndb.Model):
    teamdata = ndb.KeyProperty(kind=TeamData)
    playerKey = ndb.KeyProperty(kind=Player)
    color = ndb.StringProperty()
    period = ndb.StringProperty()
    reason = ndb.StringProperty()
    time = ndb.StringProperty()        # type assumed, same as Goal.time
    timestamp = ndb.DateTimeProperty() # type assumed

class Goal(ndb.Model):
    teamdata = ndb.KeyProperty(kind=TeamData)
    period = ndb.StringProperty()
    playerkey = ndb.KeyProperty(kind=Player)
    time = ndb.StringProperty()
    type = ndb.StringProperty()
    assistantplayerKey = ndb.KeyProperty(kind=Player)

Here are my questions:

Is this model “correct”, and does it allow basic queries (which teams played on a certain day; what is the result of a certain match, with the details of cards and goals: player, assistant, reason, time)

and more complex queries (how many goals did a certain player score in a certain season)?

I don’t really see the difference between an SQL database and a NoSQL database such as Datastore, except that the Datastore handles the keys instead of us. Can you explain clearly what advantage I get from this NoSQL model?

Thank you for helping me!

Upvotes: 2

Views: 384

Answers (1)

Patrice

Reputation: 4692

NoSQL makes it WAY faster, and query time does not depend on the size of the data set. In SQL, a query on a 3-terabyte table that can't use an index has to scan the table, so it takes about the same "computation time" server side no matter how little you return. In Datastore, every query is served from an index, so it scans DIRECTLY where it needs to, and the size of the RETURNED result set is what dictates the time it will take.
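To picture why the cost follows the result size, here is a minimal sketch in plain Python. It is not Datastore code: the sorted list `goal_index`, the player ids, and the goal keys are all made up for illustration, and a real index is maintained by the service itself. The point is only that a lookup jumps to the first matching index entry and reads consecutive entries until the value changes.

```python
import bisect

# Hypothetical index on a "player id" property: sorted (player_id, goal_key) pairs.
# This list is only a stand-in for an index the datastore would maintain.
goal_index = sorted([
    ("player_7", "goal_1"),
    ("player_9", "goal_2"),
    ("player_7", "goal_3"),
    ("player_4", "goal_4"),
    ("player_9", "goal_5"),
])

def goals_by_player(index, player_id):
    """Return goal keys for one player by reading only the matching index range."""
    # Binary search to the first entry for this player ...
    i = bisect.bisect_left(index, (player_id, ""))
    keys = []
    # ... then read consecutive entries until the player id changes.
    while i < len(index) and index[i][0] == player_id:
        keys.append(index[i][1])
        i += 1
    return keys

print(goals_by_player(goal_index, "player_7"))
```

However large the index grows, the loop only touches the entries that end up in the result.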

On the other hand, it takes a bit more time to save (since each write also has to update multiple indexes), and ndb queries CANNOT do server-side computations. For instance, you can't SUM or AVERAGE in a query. The datastore ONLY scans and returns, that's why it's so fast. It was never intended to do calculations on your behalf (so the answer to "can it do more complex queries?" is no, not as a single query. But that's not your model, that's the datastore). One thing that helps with these kinds of sums is to keep a counter in a separate entity and update it as needed (for example, another entity "totalGoals" with "keyOfPlayer" and "numberOfGoals").
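The counter idea can be sketched like this. It is plain Python with in-memory stand-ins rather than ndb code, so the names `goals`, `total_goals`, `record_goal` and `goals_for` are hypothetical, and keying the counter by season as well as by player is my assumption, made so that it answers the "goals per player per season" question.

```python
from collections import defaultdict

# In-memory stand-ins for two datastore kinds (hypothetical, for illustration only).
goals = []                      # one record per Goal entity
total_goals = defaultdict(int)  # (player_id, season) -> numberOfGoals

def record_goal(player_id, season, minute):
    """Store the goal and bump the per-player, per-season counter at write time."""
    goals.append({"player_id": player_id, "season": season, "minute": minute})
    # With a real datastore, both writes would ideally happen in one transaction.
    total_goals[(player_id, season)] += 1

def goals_for(player_id, season):
    """Read the precomputed counter instead of summing Goal records at query time."""
    return total_goals.get((player_id, season), 0)

record_goal("player_7", "2014", 12)
record_goal("player_7", "2014", 78)
record_goal("player_9", "2014", 45)
print(goals_for("player_7", "2014"))
```

The trade-off is the usual one: every write does a little more work so that the read becomes a single lookup by key.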

One thing worth mentioning is eventual consistency. In SQL, when you "insert", the data is in the table and can be retrieved immediately. In the Datastore, consistency is not immediate for ordinary queries (because the write still has to be copied to the different indexes, you can't know WHEN the insert is completely visible). There are ways to get strong consistency: ancestor queries are one, as is fetching an entity directly by key. Opening your datastore viewer reportedly has the same effect.

Another thing, even if it probably won't affect you (in the same spirit of "providing a question for other beginners", I try to include as much as I can think of): the strong consistency of ancestor queries comes at a cost. Writes to the entity group they rely on (entity group = parent + children + children of children + etc.) are serialized, so a single entity group only sustains about one write per second.
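To make the entity group idea concrete, here is a small sketch in plain Python, again not ndb code. It assumes the parent relationship the question considers (Match as parent of TeamData, TeamData as parent of Goal), and the ids "m1", "home" and so on are made up. A key is written as a path of (kind, id) pairs, the entity group is determined by the first element of the path, and an ancestor query amounts to a prefix match on that path.

```python
# Keys modelled as paths of (kind, id) pairs; kinds follow the question, ids are made up.
match_key = (("Match", "m1"),)
team_data_key = match_key + (("TeamData", "home"),)
goal_key = team_data_key + (("Goal", 1),)
other_goal_key = (("Match", "m2"), ("TeamData", "away"), ("Goal", 1))

def root_of(key):
    """The root of the path identifies the entity group the key belongs to."""
    return key[:1]

def has_ancestor(key, ancestor):
    """True if `ancestor` is a prefix of `key` (a key counts as its own ancestor)."""
    return key[:len(ancestor)] == ancestor

def ancestor_query(keys, ancestor, kind):
    """Keys of the given kind that live under `ancestor`."""
    return [k for k in keys if has_ancestor(k, ancestor) and k[-1][0] == kind]

all_keys = [team_data_key, goal_key, other_goal_key]
print(ancestor_query(all_keys, match_key, "Goal"))
```

With this layout, everything about one match sits in one entity group, so "all cards and goals for a match" becomes a single ancestor query, at the price of sharing that group's write limit.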

Other questions? Refer to the docs on entities, indexes, queries, and structuring data for strong consistency. Or feel free to ask, and I'll edit my answer accordingly :)

Upvotes: 1
