Reputation: 373
I was searching for best practices to create partitions by date, using amazon-redshift-spectrum
, but the examples shows the problem being solved by partitioning the table by one date only. What to do if I have more than one date field?
Eg: Mobile events with user_install_date
and event_date
How performative is to partition your s3
like:
installdate=2015-01-01/eventdate=2017-01-01
installdate=2015-01-01/eventdate=2017-01-02
installdate=2015-01-01/eventdate=2017-01-03
Will It kill my select
performance ? What is the best strategy in this case?
Upvotes: 1
Views: 1170
Reputation: 269151
If you data was partitioned in the above manner, then a query that merely had eventdate
in the WHERE clause (without installdate
) would be less efficient.
It would still need to look through every installdate
directory, but it could skip over eventdate
directories that do not match the predicate.
Put the less-used parameter second.
Upvotes: 1