Reputation: 1022

Query against two separate Redshift clusters in a single query?

I am designing my DB structure and wondering whether it is possible to run a single query against two separate Redshift clusters.

If it is possible, are there any limitations regarding regions, availability zones, VPC groups, etc.?

Upvotes: 0

Answers (3)

user433342

Reputation: 1048

This can now be done with Redshift datashares.
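
As a rough sketch of what that involves: the cluster that owns the data (the producer) creates a datashare and grants it to the other cluster's namespace, and the reading cluster (the consumer) mounts the share as a database. This assumes both clusters support data sharing (RA3 nodes or Serverless, as far as I know). The share name, schema, database name and namespace GUIDs are placeholders, and the Python helpers are hypothetical: they only build the SQL strings, which you would then run with whatever Redshift client you use.

```python
def producer_statements(share, schema, consumer_namespace):
    """SQL to run on the cluster that owns the data (the producer)."""
    return [
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
        f"ALTER DATASHARE {share} ADD ALL TABLES IN SCHEMA {schema};",
        f"GRANT USAGE ON DATASHARE {share} TO NAMESPACE '{consumer_namespace}';",
    ]


def consumer_statements(share, local_db, producer_namespace):
    """SQL to run on the cluster that reads the data (the consumer)."""
    return [
        f"CREATE DATABASE {local_db} FROM DATASHARE {share} "
        f"OF NAMESPACE '{producer_namespace}';",
    ]


if __name__ == "__main__":
    # Placeholder names and namespace GUIDs; substitute your own.
    for sql in producer_statements("sales_share", "public", "<consumer-namespace-guid>"):
        print(sql)
    for sql in consumer_statements("sales_share", "sales_remote", "<producer-namespace-guid>"):
        print(sql)
```

Once the consumer database exists, shared tables can be joined with local ones using three-part names, for example `sales_remote.public.orders` (`orders` being a made-up table name).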

Upvotes: 0

Joe Harris

Reputation: 14035

No, it's not possible in Redshift directly. ~~Additionally, you cannot query across multiple databases on the same cluster.~~

UPDATE: Redshift announced a preview for cross database queries on 2020-10-15 - https://docs.aws.amazon.com/redshift/latest/dg/cross-database-overview.html

You could use an external tool such as Amazon Athena or Presto running on an EMR cluster to do this. You would define each Redshift cluster as an external data source. Be careful, though: you will lose most of Redshift's performance optimizations, and a lot of data will have to be pulled back into Athena / Presto to answer your queries.

As an alternative to cross-cluster queries, consider placing your data on S3 in well-partitioned Parquet or ORC files and using Redshift Spectrum (or Amazon Athena) to query them. This approach allows multiple clusters to query a common data set while maintaining good query performance. See https://aws.amazon.com/blogs/big-data/10-best-practices-for-amazon-redshift-spectrum/
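
For the Spectrum route, each cluster would run roughly the same two statements: one to create an external schema backed by the data catalog, and one to declare a partitioned Parquet table over the S3 location. The sketch below only renders those statements as strings. The helper name, schema, catalog database, IAM role ARN, columns and bucket path are all made up for illustration.

```python
def spectrum_statements(schema, catalog_db, iam_role, table, s3_location):
    """Render SQL that exposes partitioned Parquet files on S3 to a cluster."""
    return [
        # External schema pointing at a data catalog database.
        f"CREATE EXTERNAL SCHEMA {schema} FROM DATA CATALOG "
        f"DATABASE '{catalog_db}' IAM_ROLE '{iam_role}' "
        "CREATE EXTERNAL DATABASE IF NOT EXISTS;",
        # External table; the column list here is purely illustrative.
        f"CREATE EXTERNAL TABLE {schema}.{table} (order_id BIGINT, amount DECIMAL(12,2)) "
        "PARTITIONED BY (sale_date DATE) STORED AS PARQUET "
        f"LOCATION '{s3_location}';",
    ]


if __name__ == "__main__":
    for sql in spectrum_statements(
        "spectrum",
        "lake",
        "arn:aws:iam::123456789012:role/demo-spectrum-role",  # placeholder ARN
        "sales",
        "s3://my-bucket/sales/",  # placeholder bucket
    ):
        print(sql)
```

Partitions still have to be registered before they are visible, e.g. with `ALTER TABLE spectrum.sales ADD PARTITION (sale_date='2020-01-01') LOCATION 's3://my-bucket/sales/sale_date=2020-01-01/';`.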

Upvotes: 2

Eralper

Reputation: 6612

With federated queries in Amazon Redshift, a second cluster's tables can be accessed through an external schema.

You can refer to the documentation: https://docs.aws.amazon.com/redshift/latest/dg/federated_query_example.html

Upvotes: 0
