Reputation: 5471
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
order_date DATE,
customer VARCHAR(255)
);
INSERT INTO customers
(order_date, customer)
VALUES
('2020-04-10', 'user_01'),
('2020-04-15', 'user_02'),
('2020-05-18', 'user_03'),
('2020-05-26', 'user_04'),
('2020-06-03', 'user_05'),
('2020-06-05', 'user_06'),
('2020-06-24', 'user_07');
I am extracting the users per month with this query:
SELECT
c.customer
FROM customers c
WHERE c.order_date BETWEEN '2020-06-01' AND '2020-06-30'
This query works without any issue.
Now I wanted to use the dates as variables and therefore switched to:
WITH time_variables AS
(SELECT
'2020-06-01'::date AS start_date,
'2020-06-30'::date AS end_date)
SELECT
c.customer
FROM customers c
WHERE c.order_date BETWEEN (SELECT start_date FROM time_variables) AND (SELECT end_date FROM time_variables)
However, when I run this query I get error:
Invalid operation: This type of correlated subquery pattern is not supported due to internal error;
Do you have any idea why this error happens in amazon-redshift
?
Is there any other way to use the dates as variable?
Upvotes: 0
Views: 121
Reputation: 11032
This is happening because the query you have written uses a correlated subquery which is when a SELECT in a clause of another SELECT (like the SELECTs in your WHERE clause) can (possibly) resolve to different values for each row of the outer SELECT. In your case your CTE has only one row so can only produce the same values but the query structure doesn't require this. Correlated subquery structures are not supported on Redshift b/c resolving these correlations on massive data become prohibitively expensive (and therefore slow).
To resolve change the query to use JOIN. Like this:
WITH time_variables AS
(SELECT
'2020-06-01'::date AS start_date,
'2020-06-30'::date AS end_date)
SELECT
c.customer
FROM customers c
CROSS JOIN time_variables v
WHERE c.order_date BETWEEN v.start_date AND v.end_date
This just joins the one row of the CTE to every row in customer. Now it is clear how start and end dates are combined with the data in customer. However if you are looking to put variables in your SQL there are better ways that don't include creating new tables structures.
Upvotes: 0
Reputation: 1269783
Does it work using join
?
WITH time_variables AS (
SELECT '2020-06-01'::date AS start_date, '2020-06-30'::date AS end_date
)
SELECT c.customer
FROM customers c JOIN
time_variables tv
ON c.order_date BETWEEN tv.start_date AND tv.end_date;
Upvotes: 1
Reputation: 121
To use Variable you could use DECLARE
DECLARE @StartDate DATE = Datefromparts(2020,06,01)
DECLARE @EndDate DATE= Datefromparts(2020,06,30)
SELECT
c.customer
FROM customers c
WHERE c.order_date BETWEEN @StartDate AND @EndDate
Upvotes: 0