Reputation: 326
I have a table with many columns, and I want to count the unique values of each column. I know that I can do
SELECT sho_01, COUNT(*) from sho GROUP BY sho_01
UNION ALL
SELECT sho_02, COUNT(*) from sho GROUP BY sho_02
UNION ALL
....
Here sho
is the table and sho_01
,.... are the individual columns. This is BigQuery by the way, so they use UNION ALL
.
Next, I want to do the same thing, but for a subset of sho
, say SELECT * FROM sho WHERE id in (1,2,3)
. Is there a way where I can create a subtable first, and then query the subtable? Something like this
SELECT * FROM (SELECT * FROM sho WHERE id IN (1,2,3)) AS t1;
SELECT sho_01, COUNT(*) from t1 GROUP BY sho_01
UNION ALL
SELECT sho_02, COUNT(*) from t1 GROUP BY sho_02
UNION ALL
....
Thanks
Upvotes: 0
Views: 776
Reputation: 172993
Below is for BigQuery Standard SQL and allows you to avoid manual typing of column names or even knowing them in advance
#standardSQL
SELECT
TRIM(SPLIT(kv, ':')[OFFSET(0)], '"') column,
SPLIT(kv, ':')[OFFSET(1)] value,
COUNT(1) cnt
FROM `project.dataset.table` t,
UNNEST(SPLIT(TRIM(TO_JSON_STRING(t), '{}'))) kv
GROUP BY column, value
-- ORDER BY column, value
Upvotes: 1
Reputation: 1269823
Presumably, the columns are all of the same type. If so, you can simplify this using arrays:
select el.which, el.val, count(*)
from (select t1.*,
array[struct('sho_01' as which, sho_01 as val),
struct('sho_2', show_02),
. . .
] as ar
from t
) t cross join
unnest(ar) el
group by el.which, el.val;
You can then easily filter however you want by adding a where
clause before the group by
.
Upvotes: 2