user1848018
user1848018

Reputation: 1106

How to pass a list of column names as variable in BigQuery without using "excute imemdiate"

I am using CTE query and I am coming up with many queries that do essentially the same function but only the column names are different. This is a simplfied example

WITH org_sales AS (
SELECT    'team_a' AS teamId, 'CA' AS state, 12.1 AS sales, 'code1' AS jobCode
UNION ALL    SELECT        'team_a','IL',13.1, 'code1'
UNION ALL    SELECT    'team_a','CT',5.1,'code1'
UNION ALL    SELECT        'team_a','FL',14.1,'code11'
UNION ALL    SELECT        'team_a','MA',10.3,'code11'
UNION ALL    SELECT        'team_b','AZ',8.1,'code2'
UNION ALL    SELECT    'team_b','MO',6.2,'code2'
UNION ALL    SELECT        'team_b','CA',7.3,'code2'
),
team_state_sales AS(
    SELECT 
        teamId,state,AVG(sales) sales_avg from org_sales
    GROUP BY
        teamId,state
),
team_jobCode_sales AS(
    SELECT 
        teamId,jobCode,AVG(sales) sales_avg from org_sales 
    GROUP BY
        teamId,jobCode
)
SELECT teamId,state,jobCode from org_sales

teamId  state   jobCode sales
team_a    CA    code1   12.1
team_a    IL    code1   13.1
team_a    CT    code1   5.1
team_a    FL    code11  14.1
team_a    MA    code11  10.3
team_b    AZ    code2   8.1
team_b    MO    code2   6.2
team_b    CA    code2   7.3

I was hoping to replace ['team_jobCode_sales','team_state_sales',...] with a general query that takes the column list for GROUP BY and SELECT as variabe (string) using CREATE OR REPLACE TABLE FUNCTION together with execute immediate however it does not work with CTE.

The psuedocode is something like that

CREATE OR REPLACE TABLE FUNCTION sales_avg(select_cols STRING,groupby_cols STRING)
AS
  SELECT select_cols,
          AVG(sales) sales_avg 
  FROM  
           org_sales
  GROUP BY
           group_cols

The only way I know that could be done is using execute immediate, is there any other way to do this?

Upvotes: 0

Views: 1418

Answers (1)

Pratik Patil
Pratik Patil

Reputation: 832

I feel what you are looking for is a stored procedure: https://cloud.google.com/bigquery/docs/procedures

It allows you to create programmatic way to take in variables and allow to plug into execute immediate statements easily.

Something like:

CREATE OR REPLACE PROCEDURE mydataset.your_proc_name(select_cols STRING, group_by_cols STRING)

BEGIN 
....
           EXECUTE IMMEDIATE CONCAT("your select query ",select_cols," from .... where ... group by ",group_by_cols );
...
END

You can do pretty advance stuff with stored procedures in bigquery so definitely worth looking at.

Upvotes: 0

Related Questions