Hamza
Hamza

Reputation: 6025

Sum of one column grouped by 2nd column with groups made based on 3rd column

Data

So my data looks like:

product user_id value   id
pizza   1   50  1
burger  1   30  2
pizza   2   50  3
fries   1   10  4
pizza   3   50  5
burger  1   30  6
burger  2   30  7

Problem Statement

And I wanted to compute Lifetime values of customers of each product as a metric to know which product is doing great in terms of user retention.

Desired Output

My desired output is:

product value_by_customers_of_these_products total_customers ltv
pizza 250 3 250/3 = 83.33
burger 200 2 200/2 = 100
fries 120 1 120/1 = 120

Columns Description:

Current Workaround

Currently I am doing this:

SELECT "pizza" AS product, SUM(value) value_by_customers_of_these_products, COUNT(DISTINCT user_id) users FROM orders WHERE user_id in (SELECT user_id FROM orders WHERE product = "pizza")
UNION ALL 
SELECT "burger" AS product, SUM(value) value_by_customers_of_these_products, COUNT(DISTINCT user_id) users FROM orders WHERE user_id in (SELECT user_id FROM orders WHERE product = "burger")
UNION ALL 
SELECT "fries" AS product, SUM(value) value_by_customers_of_these_products, COUNT(DISTINCT user_id) users FROM orders WHERE user_id in (SELECT user_id FROM orders WHERE product = "fries")

I have a python script obtaining DISTINCT product names from my table and then repeating the query string for each product and updating query from time to time. This is really a pain as I have to do every time a new product is launched and sky-rocketing length of query is another issue. How can I achieve this via built-in BigQuery functions or minimal headache?

Code to generate Sample Data

WITH orders as (SELECT "pizza" AS product,
                1 AS user_id,
                50 AS value, 1 AS id,
                UNION ALL SELECT "burger", 1, 30,2
                UNION ALL SELECT "pizza", 2, 50,3
                UNION ALL SELECT "fries", 1, 10,4
                UNION ALL SELECT "pizza", 3, 50,5
                UNION ALL SELECT "burger", 1, 30, 6
                UNION ALL SELECT "burger", 3, 30, 7) 

Upvotes: 0

Views: 549

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172994

Use below

with user_value as (
  select user_id, sum(value) values
  from `project.dataset.table`
  group by user_id
), product_user as (
  select distinct product, user_id
  from `project.dataset.table`
)
select product, 
  sum(values) as value_by_customers_of_these_products,
  count(user_id) as total_customers,
  round(sum(values) / count(user_id), 2) as ltv
from product_user
join user_value
using(user_id)
group by product      

if applied to sample data in your question - output is

enter image description here

Upvotes: 1

Related Questions