Efficiently modelling a Feed schema on Google Cloud Datastore?

Question

I'm using GCP/App Engine to build a Feed that returns posts for a given user in descending order of the post's score (a modified timestamp). Posts that are not 'seen' are returned first, followed by posts where 'seen' = true.

When a user creates a post, a Feed entity is created for each of their followers (i.e. a fan-out inbox model).
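
For illustration, here is a minimal Python sketch of the fan-out step, using plain dicts in place of Datastore entities. The function name `fan_out_post` and its parameters are hypothetical; only the field names (uid, authorUid, postId, postType, score, seen) come from the schema described in this question.

```python
import time
from typing import Dict, List


def fan_out_post(author_uid: str, post_id: str, post_type: str,
                 follower_uids: List[str], score: int) -> List[Dict]:
    """Build one Feed row (as a plain dict) per follower of the author."""
    return [
        {
            "uid": follower_uid,      # the user this feed row belongs to
            "authorUid": author_uid,
            "postId": post_id,
            "postType": post_type,
            "score": score,           # modified timestamp used for ordering
            "seen": False,            # new rows start out unseen
        }
        for follower_uid in follower_uids
    ]


# Hypothetical usage: one post fans out to three followers.
rows = fan_out_post("author1", "post42", "text", ["u1", "u2", "u3"], int(time.time()))
print(len(rows))
```

The write cost of this model grows with the follower count, since every post produces one Feed row per follower.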

Will my current index model result in an exploding index and/or contention on the 'score' index if many users load their feed simultaneously?

index.yaml
indexes:
- kind: "Feed"
  properties:
  - name: "seen"  # Boolean
  - name: "uid"  # The user this feed belongs to
  - name: "score"  # Int timestamp
    direction: desc

# Other entity fields include: authorUid, postId, postType

A user's feed is fetched by:

SELECT postId FROM Feed WHERE uid = 'abc123' AND seen = false ORDER BY score DESC
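
The intended read pattern (unseen posts first, then seen posts, each newest first) can be sketched in memory as follows. This is not Datastore client code: `query_feed` is a hypothetical stand-in for the query above, run once per value of 'seen', and `load_feed` and the sample rows are likewise assumptions for illustration.

```python
from typing import Dict, List


def query_feed(rows: List[Dict], uid: str, seen: bool) -> List[str]:
    """In-memory stand-in for the feed query with a given value of 'seen'."""
    matches = [r for r in rows if r["uid"] == uid and r["seen"] == seen]
    matches.sort(key=lambda r: r["score"], reverse=True)  # ORDER BY score DESC
    return [r["postId"] for r in matches]


def load_feed(rows: List[Dict], uid: str, limit: int = 20) -> List[str]:
    """Return up to `limit` postIds: unseen rows first, then seen rows."""
    unseen = query_feed(rows, uid, seen=False)
    if len(unseen) >= limit:
        return unseen[:limit]
    # Only fall back to the seen rows when the unseen ones do not fill the page.
    already_seen = query_feed(rows, uid, seen=True)
    return (unseen + already_seen)[:limit]


# Hypothetical sample data.
sample_rows = [
    {"uid": "u1", "postId": "p1", "score": 10, "seen": True},
    {"uid": "u1", "postId": "p2", "score": 30, "seen": False},
    {"uid": "u1", "postId": "p3", "score": 20, "seen": False},
    {"uid": "u2", "postId": "p4", "score": 40, "seen": False},
    {"uid": "u1", "postId": "p5", "score": 50, "seen": True},
]
feed = load_feed(sample_rows, "u1")
print(feed)
```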

Would I be better off prefixing the 'score' with the user id? Would this improve the performance of the score index? e.g. score="{alphanumeric user id}-{unix timestamp}"

From the docs:

You can improve performance with "sharded queries", that prepend a fixed length string to the expiration timestamp. The index is sorted on the full string, so that entities at the same timestamp will be located throughout the key range of the index. You run multiple queries in parallel to fetch results from each shard.
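
As I understand the quoted passage, the technique could look roughly like the in-memory sketch below. Everything here is an assumption for illustration: the shard count `NUM_SHARDS`, the choice of CRC32 to pick a shard, the `NN-` prefix format, and the function names. A sorted list of strings stands in for the index, and each `query_shard` call stands in for one range query against a single shard prefix.

```python
import heapq
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List

NUM_SHARDS = 4  # assumption: a small, fixed number of shards


def sharded_score(entity_id: str, timestamp: int) -> str:
    """Prefix a fixed-length shard tag to a zero-padded timestamp."""
    shard = zlib.crc32(entity_id.encode("utf-8")) % NUM_SHARDS
    # Zero-padding keeps string order equal to numeric order within a shard.
    return f"{shard:02d}-{timestamp:010d}"


def query_shard(index: List[str], shard: int) -> List[str]:
    """Stand-in for one range query over a single shard prefix, newest first."""
    prefix = f"{shard:02d}-"
    return sorted((s for s in index if s.startswith(prefix)), reverse=True)


def query_all_shards(index: List[str]) -> List[int]:
    """Query every shard in parallel and merge the timestamps, newest first."""
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        per_shard = list(pool.map(lambda s: query_shard(index, s), range(NUM_SHARDS)))
    # Each per-shard result is already in descending order, so a k-way merge suffices.
    streams = [[int(v.split("-", 1)[1]) for v in shard_rows] for shard_rows in per_shard]
    return list(heapq.merge(*streams, reverse=True))


# Hypothetical usage: ten posts written at consecutive timestamps.
index = [sharded_score(f"post{i}", 1700000000 + i) for i in range(10)]
merged = query_all_shards(index)
print(merged[0], merged[-1])
```

The trade-off is that a single ordered read becomes `NUM_SHARDS` queries plus a client-side merge, in exchange for spreading writes with nearby timestamps across the index key range.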

With just 4 entities I'm seeing 44 index entries, which seems excessive.
