Vadim Katsemba
Vadim Katsemba

Reputation: 127

Subtracting columns from separate BigQuery queries

I have two separate BigQuery queries where I took the sum of confirmed cases for a certain date and grouped them by region and ordered in descending order by cases.

SELECT region, SUM(confirmed_cases) AS total_cases FROM provincedata WHERE date BETWEEN '2020-08-01' AND '2020-08-02' GROUP BY region ORDER BY total_cases DESC
SELECT region, SUM(confirmed_cases) AS total_cases FROM provincedata WHERE date BETWEEN '2020-08-31' AND '2020-09-01' GROUP BY region ORDER BY total_cases DESC

I want to calculate the difference between total_cases in the first and second queries and group and order by region and descending order by the difference in descending order.

Upvotes: 2

Views: 1485

Answers (2)

GMB
GMB

Reputation: 222582

This might be more efficiently expressed with conditional aggregation:

select
    region,
    sum(case when date between '2020-08-01' and '2020-08-02' then confirmed_cases else 0 end) total_cases_1,
    sum(case when date between '2020-08-31' and '2020-09-02' then confirmed_cases else 0 end) total_cases_2,
    sum(case when date between '2020-08-01' and '2020-08-02' then confirmed_cases else - confirmed_cases end) diff
from provincedata
where 
    date between '2020-08-01' and '2020-08-02'
    or date between '2020-08-31' and '2020-09-01' 
group by region
order by diff desc

Upvotes: 0

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173046

Below is for BigQuery Standard SQL

The simplest way is just to reuse your queries that you already comfortable with (instead of rewriting stuff)

#standardSQL
WITH `project.dataset.query1` AS (
  SELECT region, SUM(confirmed_cases) AS total_cases 
  FROM provincedata 
  WHERE DATE BETWEEN '2020-08-01' AND '2020-08-02' 
  GROUP BY region 
), `project.dataset.query2` AS (
  SELECT region, SUM(confirmed_cases) AS total_cases 
  FROM provincedata 
  WHERE DATE BETWEEN '2020-08-31' AND '2020-09-01' 
  GROUP BY region 
)
SELECT region, q1.total_cases - q2.total_cases AS total_cases_difference
FROM `project.dataset.query1` q1 
JOIN `project.dataset.query2` q2
USING(region)
ORDER BY total_cases_difference DESC

Upvotes: 1

Related Questions