Reputation: 1
I am trying to group COVID case numbers into monthly and yearly groupings. Ex Jan 2020 had x amount of cases, Feb 2020 had x amount of cases, March 2020 had x amount of cases and etc,.
I was able to create a column that groups by month and year but I’m not getting correct case count totals. I’m using BigQuery and I’ve tried the following.
SELECT
Continent,
Location,
Format_Date( ‘%b-%Y’, date) Mon_Year,
SUM(new_cases) AS ‘case_count’
FROM
covid_cases
Where
Continent = ‘United States’
GROUP BY
Continent,
Location,
Mon_year
Dates in the original dataset are formatted ‘01-01-2020’ and are chronological through 2022 for various countries.
Columns are: continent, location, date, new_cases, total_cases, total_deaths, new_deaths.
Upvotes: 0
Views: 99
Reputation: 12264
Dates in the original dataset are formatted ‘01-01-2020’
Since date
column is not a DATE type but a string, you can't use format_date function directly. cosider to convert it to DATE type first.
SUM(new_cases) AS ‘case_count’
When you specify a column alias which need to be quoted, use backtick(`) instead of a single quote(') though it's not necessary in this case.
SELECT
Continent,
Location,
Format_Date('%b-%Y', PARSE_DATE('%d-%m-%Y', date)) Mon_Year,
SUM(new_cases) AS `case_count`
FROM
covid_cases
Where
Continent = 'United States'
GROUP BY
Continent,
Location,
Mon_year
ps. for Mon_Year
, just using RIGHT(date, 7) seems enough to me if it's format is %d-%m-%Y
.
Upvotes: 0