Reputation: 1
I am trying to calculate the revenue/dollar impact of removing certain customer from my table, any help would be appreciated!
This is the cleaned up table that I have -
Received_at | Customer | Click | Match_Bundle_ID | Revenue | Rank |
---|---|---|---|---|---|
1/1 | John | 1 | 12301 | 10.5 | 1 |
1/1 | Mike | 0 | 12301 | 9.2 | 2 |
1/1 | Mary | 0 | 12301 | 6.3 | 3 |
1/1 | Bob | 0 | 12301 | 6.1 | 4 |
1/2 | Lisa | 0 | 12302 | 7.6 | 1 |
1/2 | John2 | 0 | 12302 | 7.1 | 2 |
1/2 | Mary | 0 | 12302 | 5.9 | 3 |
1/2 | Mike | 0 | 12302 | 5.8 | 4 |
1/3 | John | 0 | 12303 | 6.9 | 1 |
1/3 | Mike | 1 | 12303 | 5.3 | 2 |
1/3 | Mary | 0 | 12303 | 5.1 | 3 |
1/3 | Bob | 0 | 12303 | 3.1 | 4 |
1/4 | Mike | 1 | 12304 | 8.9 | 1 |
1/4 | John | 0 | 12304 | 8.4 | 2 |
1/4 | John2 | 1 | 12304 | 7.8 | 3 |
1/4 | Bob | 0 | 12304 | 6.5 | 4 |
1/4 | Lisa | 0 | 12304 | 5.5 | 5 |
1/4 | Mary | 0 | 12304 | 3.1 | 6 |
1/5 | Bob | 0 | 12305 | 3.5 | 1 |
1/5 | John2 | 1 | 12305 | 2.3 | 2 |
1/5 | John | 0 | 12305 | 2.1 | 3 |
I would like to calculate the revenue impact of removing customer John and John2.
The assumptions are as follows -
Based on the above assumptions, after removing customer John and John2, the new table would look like this -
So the result should be:
Revenue before -
Revenue after removing John and John2 -
Therefore the revenue dollar impact would be $34.8 - $28.7 = $6.1
I am having trouble realizing this using SQL query, I'm hoping if anyone here can help with what is the best command to use here with this.
Thank you!
Upvotes: 0
Views: 134
Reputation: 2746
The query uses data from CTE (Common Table Expression), which can be replaced with actual table.
Query creates filtered data as per provided assumptions of pushing 'click' value up within a group of match_bundle_id after a customer is removed from that group.
First, it generates data set with valid 'click' i.e. click that should replace the originals after data is pushed up. Then, data set is generated with filtered customers without click column.
Above two data-sets are joined based on a generated key column.
Once we have filtered data-set and original data, they are used to get relevant SUMs.
with data_cte (received_at, customer, click, match_bundle_id,revenue,rank)
as
(select * from values // Replace with (select * from actual table)
('1-Jan','John',1,12301,10.5,1),
.
.
... Truncated...
('5-Jan','John',0,12305,2.1,3)
), cte_click_1 as ( // CTE including only relevant clicks
select match_bundle_id,click,rank,
count_if(customer not in ('John','John2'))
over (partition by match_bundle_id) cnt
from data_cte, table(generator(rowcount=>1))
qualify row_number()
over (partition by match_bundle_id order by rank) <= cnt
order by match_bundle_id
), cte_click_rank as ( // CTE with click and row_number to use for joining
select row_number() over(order by seq4())rn,
cte_click_1.* from cte_click_1, table(generator(rowcount=>1))
), cte_excluded_customer as ( // CTE with data without excluded customer
select row_number() over(order by seq4())rn,
received_at, customer, match_bundle_id,revenue
from data_cte, table(generator(rowcount=>1))
where customer not in ('John','John2')
), cte_filtered_customers as ( // CTE with final filtered customers
select a.received_at, a.customer, b.click, a.match_bundle_id,a.revenue
from cte_excluded_customer a, cte_click_rank b
where a.rn = b.rn
order by match_bundle_id
), cte_sum_orig as ( // CTE with SUM of original revenue
select match_bundle_id, sum(revenue) sum_o from data_cte
where click>0
group by match_bundle_id order by match_bundle_id
), cte_sum_revised as ( // CTE with SUM of revised revenue
select match_bundle_id, sum(revenue) sum_r from cte_filtered_customers
where click>0
group by match_bundle_id order by match_bundle_id
)
select sum(sum_o) - sum(sum_r) revenue_diff
from cte_sum_orig a left outer join cte_sum_revised b
on a.match_bundle_id = b.match_bundle_id
;
+--------------+
| REVENUE_DIFF |
|--------------|
| 6.1 |
+--------------+
Upvotes: 1