SheRey
SheRey

Reputation: 315

how to implement RATIO_TO_REPORT() in standard SQL in BigQuery?

I have a legacy SQL query that uses RATIO_TO_REPORT() -- it doesn't use open-access tables, but this is what it looks like:

SELECT
  Mutation_AA,
  Gene_name,
  CaseCount,
  RATIO_TO_REPORT(CaseCount) OVER (PARTITION BY Gene_name) AS ratio
FROM (
  SELECT
    COUNT(DISTINCT ID_tumour, 50000) AS CaseCount,
    Mutation_AA,
    Gene_name
  FROM
    [isb-cgc:COSMIC.grch38_v79]
  GROUP BY
    Mutation_AA,
    Gene_name )

I'm trying to migrate from legacy SQL to standard SQL (never having used SQL prior to using BigQuery), so tips would be much appreciated! thx

Upvotes: 6

Views: 10893

Answers (2)

evan_b
evan_b

Reputation: 1239

Or for a simpler example using one of the BigQuery public datasets:

select state, (state_count / total) as ratio
from (
  SELECT state, count(*) AS state_count, sum(count(*)) OVER() AS total
  FROM `bigquery-public-data.samples.natality` 
  GROUP by state
) s

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269953

Just directly calculate the ratio:

SELECT Mutation_AA,
       Gene_name,
       CaseCount,
       (CaseCount / SUM(CaseCount) OVER (PARTITION BY Gene_name)) AS ratio
. . .

You don't need the subquery:

SELECT Mutation_AA, Gene_name,
       COUNT(DISTINCT ID_tumour, 50000) AS CaseCount,
       COUNT(DISTINCT ID_tumour, 50000) / SUM(COUNT(DISTINCT ID_tumour, 50000)) OVER (PARTITION BY Gene_Name) as ratio
FROM [isb-cgc:COSMIC.grch38_v79]
GROUP BY Mutation_AA, Gene_name ;

Upvotes: 7

Related Questions