Junji Zhi

Reputation: 1470

Difference between Kinesis Stream and DynamoDB streams

They seem to be doing the same thing to me. Can anyone explain to me the difference?

Upvotes: 48

Views: 29734

Answers (3)

Ankur Kothari

Reputation: 908

A Kinesis stream is essentially a queue: producers put data in and consumers read it back out. It is a queue with advanced features such as shards (partitions).
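To make the shard idea concrete: Kinesis hashes each record's partition key with MD5 into a 128-bit integer, and each shard owns a range of that hash space. Here is a minimal sketch of that mapping, assuming the shards split the hash space evenly (real streams can have uneven ranges after resharding). The function name `shard_for_key` is made up for illustration and is not part of any AWS SDK.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit value


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard whose hash range contains this key.

    Assumption: the stream's shards divide the hash space into equal ranges.
    """
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    range_size = HASH_SPACE // shard_count
    # Clamp so a hash in the leftover tail of the space lands in the last shard.
    return min(hash_value // range_size, shard_count - 1)
```

The practical consequence is that all records with the same partition key go to the same shard, which is what gives you ordering per key.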

A DynamoDB stream is an internal queue, managed by AWS, that captures every data change made to the table you enable it on. The use case is to react to changes at the record level, for example "customer updated their preferences/address" or "customer cancelled their order". You get the idea.
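As a rough illustration of the "customer updated their address" case, here is a sketch of a Lambda-style handler that reads DynamoDB stream records and reports what changed. The record fields used (`eventName`, `dynamodb.Keys`, `OldImage`, `NewImage`) follow the stream record format, but the `customer_id` key name is a hypothetical table schema, and the old/new images are only present if the stream is configured to include them.

```python
def changed_attributes(record: dict) -> list:
    """Names of attributes whose value differs between the old and new image."""
    old = record["dynamodb"].get("OldImage", {})
    new = record["dynamodb"].get("NewImage", {})
    return sorted(name for name in old.keys() | new.keys()
                  if old.get(name) != new.get(name))


def handler(event: dict, context=None) -> list:
    """Turn each stream record into a short human-readable message."""
    messages = []
    for record in event["Records"]:
        # "customer_id" is an assumed partition key name, stored as a string ("S").
        customer_id = record["dynamodb"]["Keys"]["customer_id"]["S"]
        event_name = record["eventName"]  # INSERT, MODIFY, or REMOVE
        if event_name == "INSERT":
            messages.append(f"customer {customer_id} created")
        elif event_name == "MODIFY":
            changed = ", ".join(changed_attributes(record))
            messages.append(f"customer {customer_id} updated: {changed}")
        elif event_name == "REMOVE":
            messages.append(f"customer {customer_id} deleted")
    return messages
```

In a real function you would replace the message strings with whatever side effect you need (send an email, update a search index, and so on).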

You just have to decide which is easier to use and maintain. Do you want to handle the queue yourself, or do you want to just dump your key/value data into DynamoDB and have your consumer do something only when an item is inserted, changed, or deleted?

Upvotes: -1

Leeroy Hannigan

Reputation: 19783

Below are the differences between Kinesis Data Streams for DynamoDB and DynamoDB Streams.

| Properties | Kinesis Data Streams for DynamoDB | DynamoDB Streams |
| --- | --- | --- |
| Data retention | Up to 1 year. | 24 hours. |
| Kinesis Client Library (KCL) support | Supports KCL versions 1.X and 2.X. | Supports KCL version 1.X. |
| Number of consumers | Up to 5 simultaneous consumers per shard, or up to 20 simultaneous consumers per shard with enhanced fan-out. | Up to 2 simultaneous consumers per shard. |
| Throughput quotas | Unlimited. | Subject to throughput quotas by DynamoDB table and AWS Region. |
| Record delivery model | Pull model over HTTP using GetRecords; with enhanced fan-out, Kinesis Data Streams pushes the records over HTTP/2 by using SubscribeToShard. | Pull model over HTTP using GetRecords. |
| Ordering of records | The timestamp attribute on each stream record can be used to identify the actual order in which changes occurred in the DynamoDB table. | For each item that is modified in a DynamoDB table, the stream records appear in the same sequence as the actual modifications to the item. |
| Duplicate records | Duplicate records might occasionally appear in the stream. | No duplicate records appear in the stream. |
| Stream processing options | Process stream records using AWS Lambda, Kinesis Data Analytics, Kinesis Data Firehose, or AWS Glue streaming ETL. | Process stream records using AWS Lambda or DynamoDB Streams Kinesis adapter. |
| Durability level | Availability zones to provide automatic failover without interruption. | Availability zones to provide automatic failover without interruption. |
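The ordering and duplicate rows are the ones that most affect consumer code on the Kinesis side. As a hedged sketch, a consumer could drop duplicates and restore order within a batch like this. It assumes each record has already been JSON-decoded into a dict carrying an `eventID` and a `dynamodb.ApproximateCreationDateTime` timestamp; the function name is made up for illustration.

```python
def dedupe_and_order(records: list) -> list:
    """Drop repeated records, then sort the rest by creation timestamp.

    Assumes each record is a dict with "eventID" and
    record["dynamodb"]["ApproximateCreationDateTime"].
    """
    seen_ids = set()
    unique = []
    for record in records:
        if record["eventID"] in seen_ids:
            continue  # duplicate delivery of a change we already have
        seen_ids.add(record["eventID"])
        unique.append(record)
    # sorted() is stable, so records with equal timestamps keep arrival order.
    return sorted(unique, key=lambda r: r["dynamodb"]["ApproximateCreationDateTime"])
```

This only deduplicates within the batch it is given. Across batches you would need to persist the seen IDs somewhere, or make the downstream write idempotent.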

Upvotes: 18

Taterhead

Reputation: 5951

High level difference between the two:

Kinesis Streams allows you to produce and consume large volumes of data (logs, web data, etc.), whereas DynamoDB Streams is a feature local to DynamoDB that allows you to track granular changes to your DynamoDB table items.

More details:

Amazon Kinesis Streams

Amazon Kinesis Streams is part of the Big Data suite of services at AWS. From the developer documentation:

You can use Streams for rapid and continuous data intake and aggregation. The type of data used includes IT infrastructure log data, application logs, social media, market data feeds, and web clickstream data. The following are typical scenarios for using Streams:

Accelerated log and data feed intake and processing ...

Real-time metrics and reporting ...

Real-time data analytics ...

Complex stream processing ...

DynamoDB Streams

DynamoDB is the NoSQL option at AWS. Its delivery unit is a Table that stores Items. DynamoDB Streams is a DynamoDB feature you can turn on at Table level to record all changes to all Items (in the exact order they happened). This can then be streamed in real time as the changes happen, without perf impact. When you turn on the feature, you choose what is written to the stream:

  • Keys only: only the key attributes of the modified item (including LSIs), with no way to add anything else.
  • New image: the entire item, as it became after it was modified.
  • Old image: the entire item, as it was just before it was modified.
  • New and old images: both the new and the old images of the item.
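These four options correspond to the `StreamViewType` values `KEYS_ONLY`, `NEW_IMAGE`, `OLD_IMAGE`, and `NEW_AND_OLD_IMAGES`. A small sketch of building the parameters that turn a stream on for an existing table follows; the helper name `stream_update_params` and the table name in the usage note are hypothetical.

```python
VALID_VIEW_TYPES = {"KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES"}


def stream_update_params(table_name: str, view_type: str) -> dict:
    """Build keyword arguments for an UpdateTable call that enables a stream."""
    if view_type not in VALID_VIEW_TYPES:
        raise ValueError(f"unknown stream view type: {view_type}")
    return {
        "TableName": table_name,
        "StreamSpecification": {
            "StreamEnabled": True,
            "StreamViewType": view_type,
        },
    }
```

With boto3 installed and credentials configured, this could be passed along as `boto3.client("dynamodb").update_table(**stream_update_params("Orders", "NEW_AND_OLD_IMAGES"))`, where `Orders` is just an example table name.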

DynamoDB streams are commonly used for replication or table audits. More information can be found at the developer guide on DynamoDB streams.

The primary restrictions imposed by DynamoDB Streams are:

  • only 1 or 2 consumers (you need to use fanout patterns beyond that)
  • only 24h retention. While absolutely all changes are recorded, and in strict order, there is a hard limit on the retention - you need to grab them and do something, quick.
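One way to live with the consumer limit is a fan-out pattern: a single consumer reads the stream and hands each record to several downstream handlers. The sketch below shows only the dispatch step, in-process. In practice the handlers would more likely publish to SNS, SQS, or a Kinesis stream, and the names here are illustrative.

```python
from typing import Callable, Iterable, Sequence

Handler = Callable[[dict], None]


def fan_out(records: Iterable[dict], handlers: Sequence[Handler]) -> int:
    """Deliver every record to every handler, in stream order.

    Returns the number of (record, handler) deliveries made.
    """
    deliveries = 0
    for record in records:
        for handle in handlers:
            handle(record)
            deliveries += 1
    return deliveries
```

For example, `fan_out(batch, [audit_log.append, replica_queue.append])` would feed both an audit list and a replication list from one read of the stream, where `batch`, `audit_log`, and `replica_queue` are placeholders for your own objects.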

I can see where you might have gotten confused if you stumbled across this article first, which says that they are similar. They are different services which share similar API calls. The consumption experience is hence very similar.

Upvotes: 61
