Kev
Kev

Reputation: 1

Grouping COVID cases by month and year

I am trying to group COVID case numbers into monthly and yearly groupings. Ex Jan 2020 had x amount of cases, Feb 2020 had x amount of cases, March 2020 had x amount of cases and etc,.

I was able to create a column that groups by month and year but I’m not getting correct case count totals. I’m using BigQuery and I’ve tried the following.

SELECT
 Continent,
 Location,
 Format_Date( ‘%b-%Y’, date) Mon_Year,
 SUM(new_cases) AS ‘case_count’
FROM
 covid_cases
Where
 Continent = ‘United States’
GROUP BY
 Continent,
 Location,
 Mon_year

Dates in the original dataset are formatted ‘01-01-2020’ and are chronological through 2022 for various countries.

Columns are: continent, location, date, new_cases, total_cases, total_deaths, new_deaths.

Upvotes: 0

Views: 99

Answers (1)

Jaytiger
Jaytiger

Reputation: 12264

Dates in the original dataset are formatted ‘01-01-2020’

Since date column is not a DATE type but a string, you can't use format_date function directly. cosider to convert it to DATE type first.

SUM(new_cases) AS ‘case_count’

When you specify a column alias which need to be quoted, use backtick(`) instead of a single quote(') though it's not necessary in this case.

SELECT
 Continent,
 Location,
 Format_Date('%b-%Y', PARSE_DATE('%d-%m-%Y', date)) Mon_Year,
 SUM(new_cases) AS `case_count`
FROM
 covid_cases
Where
 Continent = 'United States'
GROUP BY
 Continent,
 Location,
 Mon_year

ps. for Mon_Year, just using RIGHT(date, 7) seems enough to me if it's format is %d-%m-%Y.

Upvotes: 0

Related Questions