Reputation: 315
I have a legacy SQL query that uses RATIO_TO_REPORT() -- it doesn't use open-access tables, but this is what it looks like:
SELECT
Mutation_AA,
Gene_name,
CaseCount,
RATIO_TO_REPORT(CaseCount) OVER (PARTITION BY Gene_name) AS ratio
FROM (
SELECT
COUNT(DISTINCT ID_tumour, 50000) AS CaseCount,
Mutation_AA,
Gene_name
FROM
[isb-cgc:COSMIC.grch38_v79]
GROUP BY
Mutation_AA,
Gene_name )
I'm trying to migrate from legacy SQL to standard SQL (never having used SQL prior to using BigQuery), so tips would be much appreciated! thx
Upvotes: 6
Views: 10893
Reputation: 1239
Or for a simpler example using one of the BigQuery public datasets:
select state, (state_count / total) as ratio
from (
SELECT state, count(*) AS state_count, sum(count(*)) OVER() AS total
FROM `bigquery-public-data.samples.natality`
GROUP by state
) s
Upvotes: 0
Reputation: 1269953
Just directly calculate the ratio:
SELECT Mutation_AA,
Gene_name,
CaseCount,
(CaseCount / SUM(CaseCount) OVER (PARTITION BY Gene_name)) AS ratio
. . .
You don't need the subquery:
SELECT Mutation_AA, Gene_name,
COUNT(DISTINCT ID_tumour, 50000) AS CaseCount,
COUNT(DISTINCT ID_tumour, 50000) / SUM(COUNT(DISTINCT ID_tumour, 50000)) OVER (PARTITION BY Gene_Name) as ratio
FROM [isb-cgc:COSMIC.grch38_v79]
GROUP BY Mutation_AA, Gene_name ;
Upvotes: 7