fasth
fasth

Reputation: 2342

Aggregate after join without duplicates

Consider this query:

select
   count(p.id),
   count(s.id),
   sum(s.price)
from 
   (select * from orders where <condition>)   as s,
   (select * from products where <condition>) as p
where 
   s.id = p.order;

There are, for example, 200 records in products and 100 in orders (one order can contain one or more products).

I need to join then and then:

  1. count products (should return 200)
  2. count orders (should return 100)
  3. sum by one of orders field (should return sum by 100 prices)

The problem is after join p and s has same length and for 2) I can write count(distinct s.id), but for 3) I'm getting duplicates (for example, if sale has 2 products it sums price twice) so sum works on entire 200 records set, but should query only 100.

Any thoughts how to sum only distinct records from joined table but also not ruin another selects?

Example, joined table has

id sale price
0  0    4
0  0    4
1  1    3
2  2    4
2  2    4
2  2    4

So the sum(s.price) will return:

4+4+3+4+4+4=23

but I need:

4+3+4=11

Upvotes: 1

Views: 3521

Answers (3)

Abercrombieande
Abercrombieande

Reputation: 699

Your main problem is with table design. You currently have no way of knowing the price of a product if there were no sales on it. Price should be in the product table. A product cost a certain price. Then you can count all the products of a sale and also get the total price of the sale.

Also why are you using subqueries. When you do this no indexes will be used when joining the two subqueries. If your joins are that complicated use views. In most databases they can indexed

Upvotes: 0

Thorsten Kettner
Thorsten Kettner

Reputation: 94859

You are not interested in single product records, but only in their number. So join the aggregate (one record per order) instead of the single rows:

select
  count(*) as count_orders,
  sum(p.cnt) as count_products,
  sum(s.price)
from orders as s
join 
(
  select order, count(*) as cnt 
  from products 
  where <condition> 
  group by order
) as p on p.order = s.id
where <condition>;

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269443

If the products table is really more of an "order lines" table, then the query would make sense. You can do what you want by in several ways. Here I'm going to suggest conditional aggregation:

select count(distinct p.id), count(distinct s.id),
       sum(case when seqnum = 1 then s.price end)
from (select o.* from orders o where <condition>) s join
     (select p.*, row_number() over (partition by p.order order by p.order) as seqnum
      from products p
     where <condition>
     ) p
     on s.id = p.order;

Normally, a table called "products" would have one row per product, with things like a description and name. A table called something like "OrderLines" or "OrderProducts" or "OrderDetails" would have the products within a given order.

Upvotes: 1

Related Questions