Reputation:
Is there a particular reason Redshift doesn't enforce key constraints? Check out the statement below:
Uniqueness, primary key, and foreign key constraints are informational only; they are not enforced by Amazon Redshift. Nonetheless, primary keys and foreign keys are used as planning hints, and they should be declared if your ETL process or some other process in your application enforces their integrity.
Is this due to speed or something? There must be a reason here!
Upvotes: 2
Views: 2974
Reputation: 2757
I think the main reason is that checking uniqueness at load time is not realistic from a data loading performance standpoint. Amazon Redshift's architecture is designed to process data in parallel so it can scale out, which means loaded data is distributed across multiple instances. To enforce those constraints, it would have to check uniqueness across instances for every row, and that would likely be very slow because of the I/O involved.
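To make the cost concrete, here is a rough toy model in Python, not a description of Redshift's internals. It assumes rows are spread round-robin over a handful of in-memory "instances" (similar in spirit to an even distribution), and the function name `load_with_uniqueness_check` is made up for illustration. It counts how many per-instance lookups a load would need if every incoming key had to be checked against every instance:

```python
from typing import Hashable, Iterable, Tuple


def load_with_uniqueness_check(keys: Iterable[Hashable], num_instances: int) -> Tuple[int, int]:
    """Toy model (hypothetical, not Redshift code): enforce uniqueness on every insert.

    Returns (lookups, rejected): the number of per-instance membership checks
    performed and the number of duplicate keys that were refused.
    """
    # Each set stands in for the keys already stored on one instance.
    instances = [set() for _ in range(num_instances)]
    lookups = 0
    rejected = 0
    for i, key in enumerate(keys):
        duplicate = False
        # Round-robin placement says nothing about where an equal key might
        # already live, so instances are probed one by one until a match is found.
        for stored in instances:
            lookups += 1
            if key in stored:
                duplicate = True
                break
        if duplicate:
            rejected += 1
        else:
            instances[i % num_instances].add(key)
    return lookups, rejected


if __name__ == "__main__":
    # 1000 distinct keys on 4 instances: every row probes all 4 instances.
    print(load_with_uniqueness_check(range(1000), 4))
```

In this sketch a load of N distinct rows over K instances costs N × K lookups, and in a real cluster each of those would be a network or disk round trip rather than an in-memory set check. Skipping the check lets each instance ingest its share independently.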
Upvotes: 2