Reputation: 579
I want to get the cases per million where a table contains "Province Population" and another table "Interim" contains a column that has unique Epid_ID values. I want to divide the count of these "Epid_ID" values with the population where the data type of population is big numeric and I am using Big Query. Following is the sql query I have written.
SELECT
Admin_Level_1_province,
count(Epid_ID) as cumulative_numbers,
(count(Epid_ID)/ppopn.Population) as case_per_million
FROM `interim-data.casedata.Interim` as Inte join `interim-data.casedata.Province Population`
as ppopn
ON Inte.Admin_Level_1_province = ppopn.Province
GROUP BY Admin_Level_1_province , case_per_million
The error I am getting is "Column case_per_million contains an aggregation function, which is not allowed in GROUP BY at [7:35]"
Following is my sample data for Interim:
Admin_Level_1_province Week_number YEAR Epid_ID
Province 1 21 2020 COV-NEP-PR5-RUP-20-00022
Province 2 21 2021 COV-NEP-PR5-RUP-20-00023
Following is my sample data for Province Population Table
Province Population
Province 1 23456778
Province 2 12345566
Upvotes: 0
Views: 198
Reputation: 172944
Try below instead
SELECT Admin_Level_1_province,
count(Epid_ID) as cumulative_numbers,
1000000 * (count(Epid_ID)/any_value(ppopn.Population)) as case_per_million
FROM `interim-data.casedata.Interim` as Inte
join `interim-data.casedata.Province_Population` as ppopn
ON Inte.Admin_Level_1_province = ppopn.Province
GROUP BY Admin_Level_1_province
Upvotes: 1
Reputation: 1269445
You don't aggregate by aggregation functions such as COUNT()
. I suspect you intend:
SELECT Admin_Level_1_province,
count(Epid_ID) as cumulative_numbers,
(count(Epid_ID)/ppopn.Population) as case_per_million
FROM `interim-data.casedata.Interim` Inte join
`interim-data.casedata.Province Population` ppopn
ON Inte.Admin_Level_1_province = ppopn.Province
GROUP BY Admin_Level_1_province, ppopn.Population ;
Upvotes: 1