Reputation: 109
In bigquery using legacy sql I have created a monstrous query that returns the following display of visits per day for a site that I released 2018-02-26:
Row date name release_date visits_count
1 20180226 a_name 20180226 2179
2 20180227 a_name 20180226 9522
3 20180228 a_name 20180226 1593
4 20180301 a_name 20180226 300
...
What I really want is
Row name release count_release count_release+1 count_release_rest
1 a_name 20180226 2179 9522 1893
Thus, I want the actual visit count for release date, the day after the release date and all counts after that should just be summed. I'm new to bigquery (and kind of new to sql...). Is there a way to define my first display as a "subtable" or something like that so that I can do this or what approach would you recommend?
Upvotes: 0
Views: 777
Reputation: 173028
Below is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.table` AS (
SELECT '20180226' date, 'a_name' name, '20180226' release_date, 2179 visits_count UNION ALL
SELECT '20180227', 'a_name', '20180226', 9522 UNION ALL
SELECT '20180228', 'a_name', '20180226', 1593 UNION ALL
SELECT '20180301', 'a_name', '20180226', 300
)
SELECT name, release_date,
SUM(CASE WHEN date = release_date THEN visits_count END) count_release,
SUM(CASE WHEN PARSE_DATE('%Y%m%d', date) = DATE_ADD(PARSE_DATE('%Y%m%d', release_date), INTERVAL 1 DAY) THEN visits_count END) count_release_next_day,
SUM(CASE WHEN PARSE_DATE('%Y%m%d', date) > DATE_ADD(PARSE_DATE('%Y%m%d', release_date), INTERVAL 1 DAY) THEN visits_count END) count_release_rest
FROM `project.dataset.table`
GROUP BY name, release_date
or above can be "refactored" to avoid repeating PARSE_DATE, so query looks more compact and easier to manage
#standardSQL
WITH `project.dataset.table` AS (
SELECT '20180226' date, 'a_name' name, '20180226' release_date, 2179 visits_count UNION ALL
SELECT '20180227', 'a_name', '20180226', 9522 UNION ALL
SELECT '20180228', 'a_name', '20180226', 1593 UNION ALL
SELECT '20180301', 'a_name', '20180226', 300
)
SELECT name, release_date,
SUM(CASE WHEN date = release_date THEN visits_count END) count_release,
SUM(CASE WHEN visit = release_next_day THEN visits_count END) count_release_next_day,
SUM(CASE WHEN visit > release_next_day THEN visits_count END) count_release_rest
FROM `project.dataset.table`,
UNNEST([STRUCT<visit DATE, release_next_day DATE>(
PARSE_DATE('%Y%m%d', date),
DATE_ADD(PARSE_DATE('%Y%m%d', release_date), INTERVAL 1 DAY))]) x
GROUP BY name, release_date
in both cases result is
Row name release_date count_release count_release_next_day count_release_rest
1 a_name 20180226 2179 9522 1893
Upvotes: 0
Reputation: 91
There are lot of ways you can achieve this functionality. The simplest way to do it is compare date with case statement.
select name, sum(case when date = relese_date then 1 else 0) as release_count,
sum(case when date = DATE_ADD(relese_date,1,"DAY") then 1 else 0) as release_count1
sum(case when date > DATE_ADD(relese_date,1,"DAY") then 1 else 0) as release_count_other
Upvotes: 1