Kenkuts
Kenkuts

Reputation: 69

Getting a percentage of products purchased from certain groups of customers

The problem is simple say we have a table with customers, purchases and products. Customers(cust_id, name, state) contain all the user data, purchases contain the purchase data(quantity, price, prod_id, cust_id) and product contain the product info(prod_id, description).

Let's say I have 10 customers, 6 out of ten of them bought shoes 2 out of 6 of them bought laces.

My goal is to get those 6 customers who bought the shoes then get the percentage of those 6 clients who bought laces. So my percentage should be around 33.33%.

I tried to do this multiple times with INNER JOINS but it doesn't seem that I am doing it right. My query for this simple problem is really messy and I have tried to fix this for the past couple of days. PS I am new to SQL queries. I never did these complex queries.

WITH state_product(customers_id, products_id) AS (
    SELECT DISTINCT customers.id, products.id 
    FROM customers 
    INNER JOIN purchases ON purchases.customer_id = customers.id 
    INNER JOIN products ON purchases.product_id = products.id 
    WHERE products.id = 7
), WITH specific_product(customers_id, products_id) AS (
    SELECT DISTINCT customers.id, products.id from customers 
    INNER JOIN purchases ON purchases.customer_id = customers.id 
    INNER JOIN products ON purchases.product_id = products.id 
    INNER JOIN state_product ON customers.id = 
    state_product.customers_id WHERE products.id = 8),
SELECT SUM(*)/COUNT(state_product.customer_id)*100 
AS Percentage 
FROM specific_product;

My logic when I made this code was get all the unique customers.id who bought the shoes in a table with their products.id PK for shoes is 7 and call that table state_product.

Then get another table with the customers from within that state_product and just get the customers who bought shoelaces products.id = 8 as specific_product. Which should give me two customers.id.

Now comes getting the percentage I just get the sum of the specific_product records which is two then divide that by the sum of the state_product then multiply by 100 then put that in a table called percentage. Which would be (2/6)*100 = 33.33%

I am open to an easier way of solving this problem my issue is that I need more time learning SQL queries and my logic when it comes to using which steps to solve a problem using queries is not as clear as if I were to use ruby or JS to solve this. Constructive criticism is encouraged.

Upvotes: 1

Views: 795

Answers (4)

Gordon Linoff
Gordon Linoff

Reputation: 1270643

One method is:

select avg( (cnt_8 > 0)::int ) as ratio_8_to_7
from (select pc.customer_id,
             count(*) filter (where p.product_id = 7) as cnt_7,
             count(*) filter (where p.product_id = 8) as cnt_8
      from purchases p
      where p.product_id in (7, 8)
      group by p.customer_id
     ) pc
where cnt_7 > 0;

What is this doing? The inner subquery summarizes for each customer the number of purchases that that customers has for each product. Because you only care about these two products, the where clause filters everything else out.

The outer query then calculates the ratio of product "8" to product "7" among the customers who purchases "7".

Very, very important: this counts customers and not purchases. So, customers who have multiple purchases of either laces or shoes are counted only once. That is my interpretation of your question.

Upvotes: 0

Vijiy
Vijiy

Reputation: 1197

Below would be the query

Data -

Customers
c1      cust1   KA
c2      cust3   KA
c3      cust3   KA
c4      cust4   KA
c5      cust5   KA
c6      cust6   KA

Purchases

cust_id prod_id quantity price
c1      P1      1       10
c1      P2      1       2
c2      P1      1       10
c2      P2      1       2
c3      P1      1       10
c3      P2      2       2
c4      P1      1       10
c4      P2      1       2
c5      P1      2       10

Product Details

prod_id prod_desc
P1      shoes
P2      laces

Your query will be -

select
 --c.cust_id, p.prod_id, pd.prod_desc
 (sum(case when pd.prod_desc='laces' then 1 else 0 end)/
 sum(case when pd.prod_desc='shoes' then 1 else 0 end)) * 100
from customers c
inner join purchases p on c.cust_id=p.cust_id
inner join product pd on p.prod_id=pd.prod_id
where pd.prod_desc in ('shoes', 'laces');

If you want to calculate based on quantity, what if customer brought 2 shoes and 1 laces

select
 --c.cust_id, p.prod_id, pd.prod_desc
 (sum(case when pd.prod_desc='laces' then quantity else 0 end)/
 sum(case when pd.prod_desc='shoes' then quantity else 0 end)) * 100
from customers c
inner join purchases p on c.cust_id=p.cust_id
inner join product pd on p.prod_id=pd.prod_id
where pd.prod_desc in ('shoes', 'laces');

Upvotes: 0

Radim Bača
Radim Bača

Reputation: 10711

Use conditional aggregation.

select 100 * count(*) /
       count(
         case when prod_id = 2 -- lace prod_id
              then 1
         end
       ) percent
from purchases
where prod_id = 1 -- shoe prod_id

Upvotes: 0

Jorge Lopez
Jorge Lopez

Reputation: 467

Problem

  • Let's say I have 10 customers, 6 out of ten of them bought shoes 2 out of 6 of them bought laces.

  • My goal is to get those 6 customers who bought the shoes then get the percentage of those 6 clients who bought laces. So my percentage should be around 33.33%.

User Input

  • -- 3 tables

  • --Customers(cust_id, name, state)

  • -- purchase (quantity, price, prod_id, cust_id)
  • --product info(prod_id, description)

List of stuff they bought.

  select 
    b.cust_id,c.description as product
    into #temp
    from purchase a join customers b on a.cust_id = b.cust_id
    join product_info c on a.prod_id = c.prod_id
    where c.description in ('shoes','laces')

Now the query logic

select 
    t1.cust_id,
    sum(case 
       when t2.cust_id is null then 0
       else 1
    end) totalCustomersWithLaces
into #t2
    from      
         (
            --List of customers who bought shoes
            select distinct cust_id from #temp
            where product = "shoes"
         )t1 left join 
         (
        --List of customers who bought laces
            select distinct cust_id from #temp
            where product = "laces"
         )t2 on t1.cust_id = t2.cust_id

Finally just get your result

select sum(totalCustomersWithLaces)/cast(count(1) as float) from #t2

Upvotes: 2

Related Questions