Ahmet K

Reputation: 813

Get trending posts in DynamoDB / AWS Ecosystem

I'm trying to build my own social network / forum application, where people can add posts and like each other's posts. I'm using DynamoDB as my database with a single table. For the post-liking functionality I'm using a Lambda function in combination with DynamoDB Streams, which aggregates the likes attribute.

Currently I'm working on a ranking mechanism for these user posts. I want my users to be able to list the posts in a forum that are interesting at that point in time.
For that purpose, I read how Reddit handles its ranking algorithm on this page.
I also read this question on Stack Overflow, which is close to mine but has no good answer, in my opinion.

My question is: how would one solve this problem with the help of the AWS ecosystem (maybe even with DynamoDB and Lambda functions alone)?

EDIT:
My database schema looks something like this:

Partitionkey                                     Sortkey             likes       ...
----------                                       --------            ------
forum#soccer                                     01.08.19 13:15
forum#baseball                                   22.08.19 20:11
post#soccer#Do you think FC Barcelona wins?      05.08.19 10:20       203
post#soccer#Which club is your favorite ?        05.08.19 10:20       2
like#Which club is your favorite ?               John Wick
like#Which club is your favorite ?               Walter White
...

Each insert of an item whose partition key starts with like# triggers a Lambda function, which updates the likes attribute of the corresponding post item.
My aim is to query the trendiest posts at the current time. This should be possible with the available information, namely the creation time and like count of each post. Currently my query just returns the newest posts.

Upvotes: 4

Views: 582

Answers (1)

Pedro Arantes

Reputation: 5379

I'll provide a possible solution considering only DynamoDB and Lambda (and maybe AWS SQS). If it doesn't fit, we can think about other solutions, such as Amazon ElastiCache.


Algorithm:

  1. Your DynamoDB table will have an item with the partition key trending#posts (NOTE 1) that stores only the trending posts (what counts as trending is up to you). The sort key can be a date, a post type, or anything else you want to sort or filter by: a date lets you analyze trending over time, and a post type lets you filter trending posts by type. If you don't need filters, you can use a single fixed value as the sort key.

  2. Each like in a post will trigger a Lambda which will handle trending posts (NOTE 2).

  3. When triggered, the Lambda will receive the liked post and will perform:

    1. Read all N trending posts saved in your table.

    2. Read number of likes and post time of those posts.

    3. Compute the trending score for the current N posts and, if the liked post is not among them, for the liked post too.

    4. Sort the posts again and save the N with the highest scores in your table.
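
The re-ranking in step 3 could look roughly like the sketch below. It is only an illustration under assumptions: the score is a Reddit-style "hot" formula (log of likes plus a bonus for newer posts), the sort key format `05.08.19 10:20` is taken from the question and assumed to be UTC, and the names `trending_score`, `update_trending`, `EPOCH` and `DECAY_SECONDS` are made up for this example. Reading and writing the trending#posts item is left out.

```python
import math
from datetime import datetime, timezone

# Arbitrary reference point for the age bonus (assumption, any fixed date works).
EPOCH = datetime(2019, 1, 1, tzinfo=timezone.utc)
# 12.5 hours: a 10x higher like count offsets this much extra age (assumption).
DECAY_SECONDS = 45000


def parse_created(sort_key):
    """Parse a sort key such as '05.08.19 10:20' (assumed to be UTC)."""
    return datetime.strptime(sort_key, "%d.%m.%y %H:%M").replace(tzinfo=timezone.utc)


def trending_score(likes, created):
    """Reddit-style 'hot' score: log10 of likes plus a bonus for newer posts."""
    age_bonus = (created - EPOCH).total_seconds() / DECAY_SECONDS
    return math.log10(max(likes, 1)) + age_bonus


def update_trending(current, liked_post, n):
    """Merge the liked post into the current trending list and keep the top n.

    Posts are dicts with 'pk', 'created' (sort key string) and 'likes'.
    """
    by_key = {post["pk"]: post for post in current}
    # If the liked post is already trending, this replaces its stale copy.
    by_key[liked_post["pk"]] = liked_post
    ranked = sorted(
        by_key.values(),
        key=lambda post: trending_score(post["likes"], parse_created(post["created"])),
        reverse=True,
    )
    return ranked[:n]
```

Because this score depends only on the like count and the creation time, not on the current time, the relative order of two posts changes only when one of them gets a new like, which is why re-ranking on each like is enough.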


NOTE 1: You don't need the exact score over time, just the ranking. I mean, if you save the trending list at 9 A.M., you don't need to recompute the scores at 1 P.M.; you only need to know which post is 1st, 2nd, and so on. A new score is needed only when a new like occurs.

NOTE 2: I said "and maybe AWS SQS" because users may like posts at the same time, so the Lambda could run concurrently and two invocations might overwrite each other's trending list. With AWS SQS, each like pushes an event to a queue, and the queue triggers the Lambda. To keep invocations from overlapping, use a FIFO queue with a single message group ID; a standard queue on its own can still invoke the Lambda in parallel.
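
On the Lambda side, an SQS-triggered handler could be sketched as below. The `Records` list and the `body` field are how SQS events arrive in Lambda; the JSON body format with a `post_pk` field is a hypothetical choice for this example, and the actual re-ranking and saving is only indicated by a comment.

```python
import json


def extract_liked_posts(event):
    """Return the post partition keys carried by an SQS-triggered Lambda event.

    Assumes each message body is JSON like {"post_pk": "..."} (hypothetical format).
    """
    return [json.loads(record["body"])["post_pk"] for record in event.get("Records", [])]


def handler(event, context):
    """Process the likes of one batch one after another, in arrival order."""
    processed = []
    for post_pk in extract_liked_posts(event):
        # Placeholder: read the trending item, re-rank with the liked post, save it.
        processed.append(post_pk)
    return {"processed": processed}
```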

Upvotes: 4
