Redshift does not support rollup(), grouping() functions

I am trying to convert Teradata BTEQ SQL scripts to Redshift SQL. My cluster reports PostgreSQL version 8.0.2 and Redshift version 1.0.1499. This version of Redshift does not support the rollup() and grouping() functions. How can I work around this? What are the equivalent Redshift constructs? Could anyone explain how to do it, with some examples?

Sample Teradata SQL:

select 
PRODUCT_ID,CUST_ID, 
GROUPING (PRODUCT_ID), 
GROUPING (CUST_ID), 
row_number() over (order by PRODUCT_ID,CUST_ID) AS "ROW_OUTPUT_NUM"
from products 
group by rollup(PRODUCT_ID,CUST_ID);

I need to convert the above SQL query to Redshift.

Upvotes: 3

Answers (2)

ePi272314

Reputation: 13447

Implement the ROLLUP by hand

Since Redshift does not currently recognize the ROLLUP clause, you must implement this grouping technique the hard way.

ROLLUP with 1 argument

With ROLLUP (e.g. in PostgreSQL)

SELECT column1, aggregate_function(*)
FROM some_table
GROUP BY ROLLUP(column1)

The equivalent implementation

-- First, the same GROUP BY without the ROLLUP
-- For efficiency, we will reuse this table
DROP TABLE IF EXISTS tmp_totals;
CREATE TEMP TABLE tmp_totals AS
  SELECT column1, aggregate_function(*) AS total1
  FROM some_table
  GROUP BY column1;

-- Show the table 'tmp_totals'
SELECT * FROM tmp_totals

UNION ALL

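-- The aggregation of 'tmp_totals'
-- (pick a function that composes with the first pass: e.g. sum() over
-- count or sum totals, max() over max totals)
SELECT null, aggregate_function(total1) FROM tmp_totals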

ORDER BY 1;

Example output

Country  | Sales
-------- | -----
Poland   | 2
Portugal | 4
Ukraine  | 3
null     | 9
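
To try the one-column pattern without a Redshift cluster, here is a minimal sketch that runs the same SQL against an in-memory SQLite database from Python. The table name `sales`, the column `country`, and the sample rows are assumptions made up to match the example output; only the SQL inside the strings is the part you would port. The outer SELECT exists because SQLite sorts NULL first and restricts ORDER BY expressions on a compound query, whereas Redshift already places NULL last in ascending order.

```python
import sqlite3

# Hypothetical sample data: one row per sale, chosen to match the example output.
rows = [("Poland",)] * 2 + [("Portugal",)] * 4 + [("Ukraine",)] * 3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (country TEXT)")
conn.executemany("INSERT INTO sales VALUES (?)", rows)

# Fine-level aggregate, computed once and reused for the grand total.
conn.execute("""
    CREATE TEMP TABLE tmp_totals AS
    SELECT country, COUNT(*) AS total1
    FROM sales
    GROUP BY country
""")

# Detail rows plus one grand-total row (counts are re-aggregated with SUM).
# The wrapper query forces NULL (the total row) to sort last in SQLite.
result = conn.execute("""
    SELECT country, total1
    FROM (
        SELECT country, total1 FROM tmp_totals
        UNION ALL
        SELECT NULL, SUM(total1) FROM tmp_totals
    ) AS r
    ORDER BY country IS NULL, country
""").fetchall()

for row in result:
    print(row)
```

Note that the grand total uses SUM over the per-country counts; counting the rows of `tmp_totals` instead would return the number of countries, not the number of sales.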

ROLLUP with 2 arguments

With ROLLUP (e.g. in PostgreSQL)

SELECT column1, column2, aggregate_function(*)
FROM some_table
GROUP BY ROLLUP(column1, column2);

The equivalent implementation

-- First, the same GROUP BY without the ROLLUP
-- For efficiency, we will reuse this table
DROP TABLE IF EXISTS tmp_totals;
CREATE TEMP TABLE tmp_totals AS
  SELECT column1, column2, aggregate_function(*) AS total1
  FROM some_table
  GROUP BY column1, column2;

-- Show the table 'tmp_totals'
SELECT * FROM tmp_totals

UNION ALL

-- The sub-totals of the first category
SELECT column1, null, sum(total1) FROM tmp_totals GROUP BY column1

UNION ALL

-- The full aggregation of 'tmp_totals'
SELECT null, null, sum(total1) FROM tmp_totals

ORDER BY 1, 2;

Example output

Country  | Segment  | Sales
-------- | -------- | -----
Poland   | Base     | 2
Poland   | Premium  | 0
Poland   | null     | 2     <- sub total
Portugal | Base     | 3
Portugal | Premium  | 1
Portugal | null     | 4     <- sub total
Ukraine  | Base     | 2
Ukraine  | Premium  | 1
Ukraine  | null     | 3     <- sub total
null     | null     | 9     <- grand total
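
The question also asks about GROUPING(). In a hand-written rollup each UNION ALL branch already knows which columns it has rolled up, so one possible substitute is a literal 0 or 1 per branch. The sketch below assumes that approach and runs it on SQLite from Python so it can be checked locally; the column names `grp_product` and `grp_cust` and the sample rows are hypothetical, while `products`, `product_id` and `cust_id` come from the question. The flags also let you tell a rolled-up NULL apart from a NULL that is genuinely present in the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id INTEGER, cust_id INTEGER)")
# Hypothetical rows, for illustration only.
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(1, 10), (1, 10), (1, 20), (2, 10)],
)

# Fine-level aggregate, reused by every rollup level.
conn.execute("""
    CREATE TEMP TABLE tmp_totals AS
    SELECT product_id, cust_id, COUNT(*) AS total1
    FROM products
    GROUP BY product_id, cust_id
""")

result = conn.execute("""
    SELECT product_id, cust_id, grp_product, grp_cust, total1
    FROM (
        -- Detail level: nothing rolled up
        SELECT product_id, cust_id, 0 AS grp_product, 0 AS grp_cust, total1
        FROM tmp_totals
        UNION ALL
        -- Sub-total per product: cust_id is rolled up
        SELECT product_id, NULL, 0, 1, SUM(total1)
        FROM tmp_totals
        GROUP BY product_id
        UNION ALL
        -- Grand total: both columns are rolled up
        SELECT NULL, NULL, 1, 1, SUM(total1)
        FROM tmp_totals
    ) AS r
    ORDER BY grp_product, product_id, grp_cust, cust_id
""").fetchall()

for row in result:
    print(row)
```

Ordering by the flag columns rather than by the NULLs keeps sub-totals after their detail rows regardless of how the engine sorts NULL. Redshift's ROW_NUMBER() window function could then be applied over this unioned result to reproduce the ROW_OUTPUT_NUM column from the question.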

Upvotes: 2

dsz

Reputation: 5212

If you use the UNION technique that others have pointed to, you'll be scanning the underlying table multiple times.

If the fine-level GROUPing actually results in a significant reduction in the data size, a better solution may be:

create temp table summ1 
as
select PRODUCT_ID,CUST_ID, ...
from products 
group by PRODUCT_ID,CUST_ID;

-- built from summ1, so that products is scanned only once
create temp table summ2
as
select PRODUCT_ID,cast(NULL as INT) AS CUST_ID, ...
from summ1 
group by PRODUCT_ID;

select * from summ1
union all
select * from summ2
union all
select cast(NULL as INT) AS PRODUCT_ID, cast(NULL as INT) AS CUST_ID, ...
from summ2;
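
A runnable sketch of this chained idea, using SQLite from Python as a stand-in for Redshift. It assumes each level is aggregated from the previous, smaller table (that is what keeps the base table to a single scan), and that the elided measure is something that can be re-summed. The `qty` column, the `total_qty` alias and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id INTEGER, cust_id INTEGER, qty INTEGER)"
)
# Hypothetical rows; qty is an invented measure column.
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, 10, 5), (1, 10, 1), (1, 20, 2), (2, 10, 4)],
)

conn.executescript("""
    -- Level 1: the only scan of the base table
    CREATE TEMP TABLE summ1 AS
    SELECT product_id, cust_id, SUM(qty) AS total_qty
    FROM products
    GROUP BY product_id, cust_id;

    -- Level 2: aggregated from summ1, not from products
    CREATE TEMP TABLE summ2 AS
    SELECT product_id, CAST(NULL AS INTEGER) AS cust_id,
           SUM(total_qty) AS total_qty
    FROM summ1
    GROUP BY product_id;
""")

# Level 3 (grand total) is aggregated from summ2 in the final branch.
result = conn.execute("""
    SELECT * FROM summ1
    UNION ALL
    SELECT * FROM summ2
    UNION ALL
    SELECT CAST(NULL AS INTEGER), CAST(NULL AS INTEGER), SUM(total_qty)
    FROM summ2
""").fetchall()

for row in result:
    print(row)
```

This only works for measures that compose across levels, such as sums and counts; something like a distinct count would still need to go back to the base table. No ORDER BY is applied here, so the row order is not guaranteed.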

Upvotes: 1
