Reputation: 3
I have a column from a Google Bigquery Table which has certain results (na, pass, warn, and error) based on criteria (c1, c2, c3, ...), for example.
data c1 c2 c3 ...
------- ------- ------- ------- -------
data1 pass na warn ...
data2 pass na pass ...
data3 error pass error ...
Now I want to get a histogram for each column of the input table, and I want to create the summarized table as follows.
criteria na pass warn error
---------- ------- ------- ------- -------
c1 0 2 0 1
c2 2 1 0 0
c3 0 1 1 1
... ... ... ... ...
Actually, there are many criteria, so I want to create the summarized table without enumerating the criteria. How can I get the output table?
Upvotes: 0
Views: 311
Reputation: 194
You can unpivot then pivot again to get the table you want as such:
input as (
select "data1" as data, "pass" as c1,"warn" as c2,"na" as c3, "pass" as c4 union all
select "data2" as data, "na" as c1,"pass" as c2,"pass" as c3, "error" as c4 union all
select "data3" as data, "warn" as c1,"pass" as c2,"na" as c3, "warn" as c4 union all
select "data4" as data, "pass" as c1,"error" as c2,"pass" as c3, "error" as c4 union all
select "data5" as data, "pass" as c1,"na" as c2,"na" as c3, "error" as c4 union all
select "data6" as data, "pass" as c1,"warn" as c2,"na" as c3, "na" as c4
),
step_unpivot as (
select * from input unpivot(results for criterias in (c1,c2,c3,c4))
)
select * from step_unpivot pivot(count(data) for results in ('pass', 'warn','na','error'))
And since you want dynamic column selection for the unpivot(), you can probably use the answer to this question for that part: Dynamic UnPivot in Bigquery
Edit: Below a version with the dynamic unpivot
DECLARE myunpivot STRING;
SET myunpivot = (
SELECT CONCAT('(', STRING_AGG( column_name, ','), ')'),
From(
SELECT column_name FROM dataset_name.INFORMATION_SCHEMA.COLUMNS
where table_name ="unpivot_pivot"
and column_name not in("data")));
EXECUTE IMMEDIATE format("""
with
step_unpivot as (
select * from dataset_name.table_name
unpivot(results for criterias in %s)
)
select * from step_unpivot pivot(count(data) for results in ('pass', 'warn','na','error'))
""", myunpivot);
Upvotes: 1
Reputation: 12274
You may consider below,
CREATE TEMP FUNCTION json_keys(input STRING) RETURNS Array<String>
LANGUAGE js AS """
return Object.keys(JSON.parse(input));
""";
CREATE TEMP FUNCTION json_values(input STRING) RETURNS Array<String>
LANGUAGE js AS """
return Object.values(JSON.parse(input));
""";
SELECT * FROM (
SELECT criteria, result
FROM (SELECT * EXCEPT(data) FROM sample_table) t,
UNNEST (json_keys(TO_JSON_STRING(t))) criteria WITH OFFSET JOIN
UNNEST (json_values(TO_JSON_STRING(t))) result WITH OFFSET USING (offset)
) PIVOT (COUNT(result) FOR result IN ('na', 'pass', 'warn', 'error'));
+----------+----+------+------+-------+
| criteria | na | pass | warn | error |
+----------+----+------+------+-------+
| c1 | 0 | 2 | 0 | 1 |
| c2 | 2 | 1 | 0 | 0 |
| c3 | 0 | 1 | 1 | 1 |
+----------+----+------+------+-------+
Upvotes: 2