Does Redshift use the same key distribution when two tables have the same distribution keys?

Question

I have several tables that contain the field customer_id.

There are not many customer_ids, but the underlying data is big (hundreds of GB per customer_id).

All my queries use this customer_id in one way or another: to join, aggregate, or filter.

Consequently, this field seems to be the best candidate for the distribution key.

Question: If I set the same DISTKEY(customer_id) on all my tables, will Redshift know that I want the data for a given customer on the same node across all of these tables? If so, how does it decide this? Simply because the column name is the same in all these tables? That seems odd to me, but I couldn't find anything on the topic.
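To make the setup concrete, here is a minimal sketch of what I have in mind. The table names (orders, events) and the columns other than customer_id are hypothetical and only for illustration; the point is that both tables declare customer_id as their distribution key.

```sql
-- Hypothetical tables; only customer_id comes from my real schema.
CREATE TABLE orders (
    order_id    BIGINT,
    customer_id INTEGER NOT NULL,
    amount      DECIMAL(12,2)
)
DISTSTYLE KEY
DISTKEY (customer_id);

CREATE TABLE events (
    event_id    BIGINT,
    customer_id INTEGER NOT NULL,
    event_time  TIMESTAMP
)
DISTSTYLE KEY
DISTKEY (customer_id);

-- A typical query: join on the shared distribution key.
-- In the EXPLAIN output, a DS_DIST_NONE label on the join step
-- indicates that no rows had to be redistributed for the join.
EXPLAIN
SELECT o.customer_id, COUNT(*) AS n
FROM orders o
JOIN events e ON e.customer_id = o.customer_id
GROUP BY o.customer_id;
```

What I would like to understand is whether rows with the same customer_id in these two tables are guaranteed to end up together, and what Redshift bases that on.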

Answers (1)