Bogey

Reputation: 5734

Tracking WebSocket connections in AWS DynamoDB while minimizing strongly consistent reads

I'm working on an app that requires clients to subscribe to some rows of a "Data" DynamoDB table. Clients should receive an initial snapshot, and streaming updates through a WebSocket connection.

What is the most efficient way to do so? Or, more precisely...

My current plan is to

  1. Listen to the "Data" table's change stream with a Lambda function
  2. Have this Lambda forward the event to an SQS FIFO queue. Another Lambda processes this queue by...
  3. Querying interested WebSocket subscribers from some DynamoDB "Subscribers" table (similar to this example)
  4. Push the update out to those WebSocket connections
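A minimal sketch of that pipeline, assuming boto3, a string partition key named `pk`, and placeholder table names, queue URL, and WebSocket endpoint (all of those are my assumptions, not a fixed design). Since Lambda cannot subscribe to an SNS FIFO topic directly, the sketch writes straight to an SQS FIFO queue:

```python
import json

def keys_from_stream_event(event):
    """Extract the changed partition keys from a DynamoDB stream event.
    Assumes a string partition key named 'pk' (placeholder)."""
    keys = []
    for record in event.get("Records", []):
        keys.append(record["dynamodb"]["Keys"]["pk"]["S"])
    return keys

def stream_handler(event, context):
    """Lambda 1: listens to the Data table's stream, forwards to SQS FIFO.
    Assumes content-based deduplication is enabled on the queue."""
    import boto3  # imported here so the pure helper above is testable offline
    sqs = boto3.client("sqs")
    for pk in keys_from_stream_event(event):
        sqs.send_message(
            QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/data-updates.fifo",  # placeholder
            MessageBody=json.dumps({"pk": pk}),
            MessageGroupId=pk,  # preserve per-key ordering
        )

def fanout_handler(event, context):
    """Lambda 2: queries Subscribers (step 3) and pushes the update to
    each WebSocket connection (step 4)."""
    import boto3
    ddb = boto3.client("dynamodb")
    api = boto3.client(
        "apigatewaymanagementapi",
        endpoint_url="https://example.execute-api.us-east-1.amazonaws.com/prod",  # placeholder
    )
    for record in event["Records"]:
        msg = json.loads(record["body"])
        # Step 3: find subscribers of this key. Assumes Subscribers is keyed
        # by the data pk, with the connection ID as an attribute. ConsistentRead
        # is the crux of the question below.
        resp = ddb.query(
            TableName="Subscribers",
            KeyConditionExpression="pk = :pk",
            ExpressionAttributeValues={":pk": {"S": msg["pk"]}},
            ConsistentRead=True,
        )
        # Step 4: push to each WebSocket connection
        for item in resp["Items"]:
            api.post_to_connection(
                ConnectionId=item["connectionId"]["S"],
                Data=json.dumps(msg).encode(),
            )
```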

When a subscriber comes in, I plan to

  1. Add their WebSocket connection ID to the "Subscribers" table (so they should receive delta updates from that point on), and then afterwards
  2. Query a current snapshot of the data, and push that out to the subscriber
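The ordering here is the whole point: register first, snapshot second. Sketched with the two operations injected as callables (function and parameter names are mine, not from any SDK; in practice they would wrap a DynamoDB PutItem, a Query, and the WebSocket post_to_connection call):

```python
def handle_new_subscriber(connection_id, register, fetch_snapshot, send):
    """Subscriber flow: (1) register the connection so it receives deltas
    from now on, then (2) read and send the current snapshot. In this order
    a client can at worst see a delta *before* its snapshot (resolved via
    data versioning), but can never miss a delta entirely."""
    register(connection_id)        # step 1: start receiving deltas
    snapshot = fetch_snapshot()    # step 2: read current state
    send(connection_id, snapshot)  # push the snapshot out
```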

Now of course a client might thereby receive a delta update before it receives the snapshot, but that's not an issue in my case (data is versioned and those conflicts can be managed by the client).

My concern is that by default, step 3 - querying current subscribers - would need to be a strongly consistent read; otherwise a subscriber might miss an update. E.g.: we send out an initial snapshot, then an update comes in, but due to eventual consistency step 3 doesn't see the new subscriber yet - so they miss out!

That kind of sucks, because we need to query subscribers quite often (every time an update occurs), and strongly consistent reads are both slower and more expensive - they consume twice the read capacity units of eventually consistent ones.
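The billing concern is concrete: a strongly consistent read consumes one RCU per 4 KB (rounded up), an eventually consistent read half of that. A small helper to compare the two (my own illustration, not an AWS API):

```python
import math

def read_capacity_units(item_size_bytes, consistent):
    """RCUs consumed by reading this many bytes in one request:
    1 RCU per 4 KB (rounded up) when strongly consistent,
    half that when eventually consistent."""
    units = math.ceil(item_size_bytes / 4096)
    return units if consistent else units / 2
```

So every strongly consistent subscriber query doubles the read cost of step 3 compared to an eventually consistent one.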

Are there any options to improve this?

Ideally, I'd like to insert a step between subscriber steps 1 and 2 above: "wait until the new subscriber row has propagated to all replicas, so every eventually consistent read after this point will pick them up". But I don't think that is possible - please do correct me if I'm wrong.

Otherwise, I'm considering adding a timestamp to the Subscribers table. Step 3 could then issue two separate queries: an eventually consistent read for subscribers where Timestamp <= now - 10 minutes, and a strongly consistent read for subscribers where Timestamp > now - 10 minutes. The assumption is that a subscriber who registered more than 10 minutes ago has propagated to all replicas by now, so every eventually consistent read "should" see them. Needless to say: this feels VERY dodgy!
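The split itself is simple enough; the dodgy part is the 10-minute assumption. A sketch of partitioning subscribers by registration timestamp (pure logic with names of my choosing; in practice each half would become its own Query, with ConsistentRead set accordingly, which also presumes a timestamp sort key or index on the table):

```python
from datetime import datetime, timedelta

def split_subscribers(subscribers, now, window=timedelta(minutes=10)):
    """Partition subscribers into (old, recent) by registration time.
    `old` (registered more than `window` ago) would be fetched with an
    eventually consistent read; `recent` with a strongly consistent one.
    Each subscriber is a dict with a 'registered_at' datetime."""
    cutoff = now - window
    old = [s for s in subscribers if s["registered_at"] <= cutoff]
    recent = [s for s in subscribers if s["registered_at"] > cutoff]
    return old, recent
```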

I'd be keen to hear better ideas, or thoughts on how bad my dodgy idea really is.

Upvotes: 0

Views: 533

Answers (1)

hunterhacker

Reputation: 7132

Eventual consistency on the base table usually resolves within single-digit milliseconds, maybe up to a couple of seconds in an event like a leader node failure where a new leader must be elected. So wait three seconds before doing your eventually consistent scan, and you can be reasonably confident your client won't miss any change made before the stream listening began.

If missing something would truly be catastrophic and you need to protect against the super rare situation where a short pause isn't a sufficient guarantee, then just do strongly consistent reads. That's what they're there for.

Upvotes: 1
