Reputation: 102
I am trying to convert Teradata BTEQ SQL scripts to Redshift SQL. My Redshift cluster reports PostgreSQL version 8.0.2 and Redshift version 1.0.1499. This version of Redshift does not support the ROLLUP() and GROUPING() functions. How can I work around this? What are the equivalent Redshift constructs? Could anyone explain how to do it, with some examples?
Sample Teradata SQL:
select
PRODUCT_ID,CUST_ID,
GROUPING (PRODUCT_ID),
GROUPING (CUST_ID),
row_number() over (order by PRODUCT_ID,CUST_ID) AS "ROW_OUTPUT_NUM"
from products
group by rollup(PRODUCT_ID,CUST_ID);
I need to convert the above SQL query to Redshift.
Upvotes: 3
Views: 8994
Reputation: 13437
Since Redshift does not currently support the ROLLUP clause, you have to implement this grouping technique the hard way.
With ROLLUP on one column (e.g. in PostgreSQL)
SELECT column1, aggregate_function(*)
FROM some_table
GROUP BY ROLLUP(column1);
The equivalent implementation
-- First, the same GROUP BY without the ROLLUP
-- For efficiency, we will reuse this table
DROP TABLE IF EXISTS tmp_totals;
CREATE TEMP TABLE tmp_totals AS
SELECT column1, aggregate_function(*) AS total1
FROM some_table
GROUP BY column1;
-- Show the table 'tmp_totals'
SELECT * FROM tmp_totals
UNION ALL
-- The aggregation of 'tmp_totals'
SELECT null, aggregate_function(total1) FROM tmp_totals
ORDER BY 1;
Example output
Country | Sales
-------- | -----
Poland | 2
Portugal | 4
Ukraine | 3
null | 9
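If you would rather not manage a temp table, the same idea can be sketched as a single statement with a common table expression. This is only a sketch: COUNT(*) stands in for aggregate_function(*), the name totals is made up for illustration, and whether the engine computes the CTE once or re-evaluates it per reference depends on the planner, so check the query plan on your cluster.

```sql
-- Assumption: COUNT(*) replaces the generic aggregate_function(*);
-- 'totals' is a hypothetical CTE name.
WITH totals AS (
    SELECT column1, COUNT(*) AS total1
    FROM some_table
    GROUP BY column1
)
-- Detail rows, one per distinct column1
SELECT column1, total1 FROM totals
UNION ALL
-- Grand total: counts are re-aggregated with SUM
SELECT NULL, SUM(total1) FROM totals
ORDER BY 1;
```

The temp table version above remains the safer choice when the aggregated result is reused by several later statements.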
With ROLLUP on two columns (e.g. in PostgreSQL)
SELECT column1, column2, aggregate_function(*)
FROM some_table
GROUP BY ROLLUP(column1, column2);
The equivalent implementation
-- First, the same GROUP BY without the ROLLUP
-- For efficiency, we will reuse this table
DROP TABLE IF EXISTS tmp_totals;
CREATE TEMP TABLE tmp_totals AS
SELECT column1, column2, aggregate_function(*) AS total1
FROM some_table
GROUP BY column1, column2;
-- Show the table 'tmp_totals'
SELECT * FROM tmp_totals
UNION ALL
-- The sub-totals of the first category
-- (sum re-aggregates correctly when total1 is a count or a sum)
SELECT column1, null, sum(total1) FROM tmp_totals GROUP BY column1
UNION ALL
-- The full aggregation of 'tmp_totals'
SELECT null, null, sum(total1) FROM tmp_totals
ORDER BY 1, 2;
Example output
Country | Segment | Sales
-------- | -------- | -----
Poland | Premium | 0
Poland | Base | 2
Poland | null | 2 <- sub total
Portugal | Premium | 1
Portugal | Base | 3
Portugal | null | 4 <- sub total
Ukraine | Premium | 1
Ukraine | Base | 2
Ukraine | null | 3 <- sub total
null | null | 9 <- grand total
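The question also asks about GROUPING(). Because each branch of the UNION ALL knows which columns it has rolled up, the flags can be emitted as literal 0/1 columns. Below is a rough sketch applied to the query from the question. The assumptions here are mine, not from the question: COUNT(*) is used as an illustrative aggregate, both ID columns are assumed to be INT, and the names GRP_PRODUCT_ID, GRP_CUST_ID and rolled_up are hypothetical.

```sql
-- Assumptions: COUNT(*) is an illustrative aggregate; PRODUCT_ID and CUST_ID are INT.
DROP TABLE IF EXISTS tmp_totals;
CREATE TEMP TABLE tmp_totals AS
SELECT PRODUCT_ID, CUST_ID, COUNT(*) AS total1
FROM products
GROUP BY PRODUCT_ID, CUST_ID;

SELECT PRODUCT_ID, CUST_ID,
       GRP_PRODUCT_ID,   -- plays the role of GROUPING(PRODUCT_ID)
       GRP_CUST_ID,      -- plays the role of GROUPING(CUST_ID)
       total1,
       ROW_NUMBER() OVER (ORDER BY PRODUCT_ID, CUST_ID) AS "ROW_OUTPUT_NUM"
FROM (
    -- Detail rows: nothing is rolled up, so both flags are 0
    SELECT PRODUCT_ID, CUST_ID, 0 AS GRP_PRODUCT_ID, 0 AS GRP_CUST_ID, total1
    FROM tmp_totals
    UNION ALL
    -- Sub-totals per product: CUST_ID is rolled up
    SELECT PRODUCT_ID, CAST(NULL AS INT), 0, 1, SUM(total1)
    FROM tmp_totals
    GROUP BY PRODUCT_ID
    UNION ALL
    -- Grand total: both columns are rolled up
    SELECT CAST(NULL AS INT), CAST(NULL AS INT), 1, 1, SUM(total1)
    FROM tmp_totals
) AS rolled_up
ORDER BY "ROW_OUTPUT_NUM";
```

The flag columns also let you tell a rolled-up NULL apart from a NULL that is present in the data. NULL sort order can differ between engines, so verify that the row numbers match what the Teradata script produced.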
Upvotes: 2
Reputation: 5192
If you use the UNION technique that others have pointed to, you'll be scanning the underlying table multiple times.
If the fine-level GROUP BY results in a significant reduction in the data size, a better solution may be to aggregate once and build the higher levels from that result:
create temp table summ1
as
select PRODUCT_ID,CUST_ID, ...
from products
group by PRODUCT_ID,CUST_ID;
create temp table summ2
as
select PRODUCT_ID,cast(NULL as INT) AS CUST_ID, ...
from summ1 -- re-aggregate the already-reduced summ1 instead of rescanning products
group by PRODUCT_ID;
select * from summ1
union all
select * from summ2
union all
select cast(NULL as INT) AS PRODUCT_ID, cast(NULL as INT) AS CUST_ID, ...
from summ2;
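To make the elided columns concrete, here is one possible filled-in version. It is a sketch under my own assumptions: a plain COUNT(*) stands in for the "..." columns, row_cnt is a made-up column name, and summ2 reads from summ1 so that products is scanned only once, which is my reading of the intent.

```sql
-- Assumption: COUNT(*) stands in for the elided "..." columns; row_cnt is hypothetical.
create temp table summ1
as
select PRODUCT_ID, CUST_ID, count(*) as row_cnt
from products
group by PRODUCT_ID, CUST_ID;

-- Built from summ1, so the large products table is not scanned again
create temp table summ2
as
select PRODUCT_ID, cast(NULL as INT) as CUST_ID, sum(row_cnt) as row_cnt
from summ1
group by PRODUCT_ID;

select * from summ1
union all
select * from summ2
union all
-- Grand total, computed from the smallest intermediate table
select cast(NULL as INT) as PRODUCT_ID, cast(NULL as INT) as CUST_ID, sum(row_cnt)
from summ2;
```

This chaining only works for aggregates that can be re-aggregated: SUM, MIN and MAX carry over directly and COUNT becomes a SUM of counts. AVG cannot be averaged again, so carry a SUM and a COUNT through each level and divide at the end.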
Upvotes: 1