Reputation: 2076
I'm comparing how AWS Kinesis Analytics and PipelineDB handle "reference" data in streaming SQL.
http://docs.aws.amazon.com/kinesisanalytics/latest/dev/limits.html http://docs.pipelinedb.com/joins.html#joins
Question 1: JOIN on multiple reference tables
AWS Kinesis Analytics - only lets you join to reference data from one source. That seems really restrictive, unless I'm misunderstanding it. I'd want to be able to JOIN on, say, both USERS and ADDRESS reference data. Is that not possible?
PipelineDB - says it supports JOINs, but the docs don't have examples of JOINs to multiple reference tables. Does PipelineDB support joining multiple reference tables in its STREAMs and/or CONTINUOUS VIEWs?
Question 2: Refreshing reference data
AWS Kinesis Analytics - says you have to jump through some hoops (e.g. calling AWS APIs) to refresh the reference data stored in the S3 bucket for the stream.
PipelineDB - Can streams simply get the latest reference data as it is updated using standard SQL updates to the reference tables?
Can PipelineDB also JOIN to regular SQL VIEWs, so that, in essence, the joined data is updated automatically each time the underlying data changes?
Upvotes: 0
Views: 338
Reputation: 178
PipelineDB allows you to JOIN on as many tables as you'd like, including with other continuous views or regular views. The only thing you can't JOIN with a stream is another stream (no stream-stream JOINs).
Whatever "reference data" exists at JOIN time is what will be used to update the continuous view. In other words, updating reference data after the fact will not automatically change historic data in the continuous view, but new incoming rows will reflect the updated reference data.
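To make the JOIN-time behaviour concrete, here is a minimal sketch. The names (users, page_views, views_by_city) are made up for illustration, and the DDL assumes the CREATE STREAM / CREATE CONTINUOUS VIEW syntax of the 0.9.x-era PipelineDB releases; other versions may need different statements.

```sql
-- Hypothetical reference table: an ordinary PostgreSQL table.
CREATE TABLE users (id integer PRIMARY KEY, city text);
INSERT INTO users (id, city) VALUES (1, 'Berlin');

-- Hypothetical stream (assumes 0.9.x-style CREATE STREAM syntax).
CREATE STREAM page_views (user_id integer);

-- Continuous view that joins each incoming event to the reference table.
CREATE CONTINUOUS VIEW views_by_city AS
  SELECT u.city, count(*) AS views
  FROM page_views pv
  JOIN users u ON pv.user_id = u.id
  GROUP BY u.city;

-- First event is joined while user 1 is still in Berlin.
INSERT INTO page_views (user_id) VALUES (1);

-- Plain SQL update of the reference data.
UPDATE users SET city = 'Paris' WHERE id = 1;

-- Second event is joined against the updated row.
INSERT INTO page_views (user_id) VALUES (1);

SELECT city, views FROM views_by_city;
```

If the behaviour is as described above, the first event should stay counted under Berlin and only the second should land under Paris, since the UPDATE does not rewrite what the continuous view has already aggregated.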
Here's an example of a continuous view definition which contains multiple JOINs:
https://github.com/pipelinedb/pipelinedb/blob/master/src/test/regress/sql/stream_table_join.sql#L61
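For the USERS plus ADDRESS case from the question, a rough sketch of a stream joined to two reference tables might look like the following. The table, stream and view names are hypothetical, and the syntax again assumes a 0.9.x-era PipelineDB release.

```sql
-- Hypothetical reference tables: ordinary PostgreSQL tables.
CREATE TABLE addresses (id integer PRIMARY KEY, country text);
CREATE TABLE customers (
  id         integer PRIMARY KEY,
  name       text,
  address_id integer REFERENCES addresses (id)
);

-- Hypothetical stream of incoming order events.
CREATE STREAM order_events (customer_id integer, amount numeric);

-- One stream joined to two tables; only the stream side must be unique,
-- because stream-stream JOINs are not supported.
CREATE CONTINUOUS VIEW revenue_by_country AS
  SELECT a.country, sum(e.amount) AS revenue
  FROM order_events e
  JOIN customers c ON e.customer_id = c.id
  JOIN addresses a ON c.address_id = a.id
  GROUP BY a.country;
```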
Upvotes: 0