gbegley
gbegley

Reputation: 2679

Mysql query to produce 'bin'ed sums along with other aggregates in a group by query

I want to product a result set like

 select 
  customer
  bin_sum(sales,margin_pct,0.5), 
  count(1) txns, 
  sum(sales) total_sales, 
  sum(margin) total margin
 from sales_txns
 group by customer

Such that "bin_sum" will produce either an array or csv_list containing the customers aggregate according to 'bins' of 5 pct increments. The output of this column will contain the aggregate of sales havent 0-5% margin, 5-10% margin, 10-15% margin, etc, such that the output of this field is

customer1|"0,0,234.24,2632.45,4567.50,0,0,0,0,0,0,0"|37|783736.429|22443.23
customer2|"0,54.50,284.24,5632.45,2567.50,460,0,0,0,0,0,0"|37|783736.429|22443.23

The goal here to to be able to produce a graph of profit distribution by profit percentage from the second field. We've been iterating the sales records, and keeping a running tally of the fields in application logic, but I am wondering if a sql guru knows how to do this entirely in SQL.

Upvotes: 1

Views: 176

Answers (1)

Alex Wittig
Alex Wittig

Reputation: 2900

I'm assuming data that looks something like this:

| CUSTOMER |  SALE | MARGIN_PCT |
|----------|-------|------------|
|        1 | 10.12 |          2 |
|        1 | 12.99 |         39 |
|        1 | 20.95 |         16 |
|        1 | 80.00 |         18 |
|        2 | 94.99 |         17 |
|        2 | 92.98 |         70 |

First, we find which "bin" each transaction falls into:

select customer, sale, floor(margin_pct/5) as bin from sales

Then take those results and roll them up into bins per customer:

select customer, sum(sale) total_sales, bin
from
 (select customer, sale, floor(margin_pct/5) as bin from sales) customer_bins
group by customer, bin

Finally, take those results and roll them up per customer:

select customer, group_concat(total_sales separator ', ')
from
  (
   select customer, sum(sale) total_sales, pct_group
   from
    (select customer, sale, floor(margin_pct/5) as bin from sales) customer_bins
   group by customer, pct_group
   ) binned
group by customer

This gives

| CUSTOMER |       BINNED_MARGINS |
|----------|----------------------|
|        1 | 100.95, 12.99, 10.12 |
|        2 |         94.99, 92.98 |

We are almost there, but clearly we have a big problem. Unless you can guarantee that you will always have records that fall into every bin, your results won't make much sense.

Unfortunately, MySQL doesn't offer an elegant solution to this. Your best option is to make a range table to join against. Simply:

| BIN |
|-----|
|   0 |
|   1 |
|   2 |
|   3 |
|   4 |
|   5 |
|   6 |
|   7 |
|   8 |
|   9 |

Finally, by joining against the bin and customer tables we can force values for all bins:

select customer, group_concat(total_sales separator ', ') as binned_margins
from
  (
   select customer, sum(sale) total_sales, bin
   from
    (
      select customer, ifnull(sale, 0) sale, bin
      from bin
      inner join customer
      left join (select customer, sale, floor(margin_pct/5) as bin from sales) customer_sale_bins
      using(customer, bin)
      order by customer, bin 
    ) customer_bins
   group by customer, bin
   ) binned
group by customer

Produces:

| CUSTOMER |                                                 BINNED_MARGINS |
|----------|----------------------------------------------------------------|
|        1 | 10.12, 0.00, 0.00, 100.95, 0.00, 0.00, 0.00, 12.99, 0.00, 0.00 |
|        2 |    0.00, 0.00, 0.00, 94.99, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00 |

SQL Fiddle

Upvotes: 1

Related Questions