RohitM
RohitM

Reputation: 137

Count of User On and Before date

I have below data table in SQL.

Campaign_id User_id Campaign_date
C_1         1    2020-01-01    
C_1         2    2020-01-01
C_1         3    2020-01-01
C_1         4    2020-01-01
C_1         5    2020-01-01
C_2         2    2020-02-01
C_2         7    2020-02-01
C_2         8    2020-02-01
C_2         3    2020-02-01
C_2         5    2020-02-01
C_2         11   2020-02-01 
C_3         7    2020-03-01
C_3         2    2020-03-01
C_3         12   2020-03-01
C_3         4    2020-03-01
C_4         1    2020-04-01
C_4         2    2020-04-01
C_4         4    2020-04-01
C_4         15   2020-04-01  
C_4         7    2020-04-01

I want the results like below table.

Campaign_id sent_2_times sent_3_times sent_4_times
C_1             0           0           0
C_2             3           0           0
C_3             4           1           0
C_4             4           2           1

Basically I want to count number of Unique users who got campaign 2,3,4 times in past. (before that particular campaign).

  1. For C_1 since that is the first campaign there will not be any repeated user_IDs, Therefore 0 in each column of output table.
  2. For C_2 As we can see user_id(2,3 & 5) are repeating 2 times on or before C_2's Campaign_date.
  3. For C_3 user_id(3,4,5,7) are repeating 2 times on or before C_3's Campaign_date. And user_id(2) is repeating 3 times on or before C_3's campaign_date.

Can anyone help me out finding the solution in BigQuery SQL.

TIA, Rohit

Upvotes: 0

Views: 41

Answers (1)

Samuel
Samuel

Reputation: 3528

I cannot replicate your example. Please check your example data as well.

If an user comes back 4 times, you want it to count to all sent_._times columns.

with tbl as (
Select "C_1" as Campaign_id , date "2020-01-01" as Campaign_date, * from unnest([1,2,3,4,5]) as user_id
union all   Select "C_2" as Campaign_id ,date "2020-02-01", * from unnest([2,7,8,3,5,11]) as user_id
union all   Select "C_3" as Campaign_id ,date "2020-03-01", * from unnest([7,2,12,4]) as user_id
union all   Select "C_4" as Campaign_id ,date "2020-04-01", * from unnest([1,2,4,15,7]) as user_id
),
tbl2 as  (
Select * , 
  count(1) over prev as taken_part_prev
 from tbl
 window prev as (partition by user_id order by unix_date(Campaign_date) range between unbounded preceding and 0 preceding)
),tbl3 as (
Select Campaign_id,Campaign_date, taken_part_prev, count(1) as counts,
  from tbl2
group by 1,2,3
union all select Campaign_id, Campaign_date  ,taken_part_prev , 0 from tbl2,unnest(generate_array(0,(Select count(distinct Campaign_id) from tbl2))) taken_part_prev  group by 1,2,3,4
), tbl4 as (
select Campaign_id,
    taken_part_prev,
  sum(counts) over summing as  counts_sum
 from tbl3
window summing as (partition by Campaign_id order by taken_part_prev  range between  0 preceding and unbounded following)
order by  taken_part_prev
)
Select 
*
 from tbl4
pivot(max(counts_sum) for 'sent_' ||taken_part_prev || '_times'  in ("sent_2_times","sent_3_times","sent_4_times") )
order by 1

Upvotes: 0

Related Questions