tember
tember

Reputation: 1496

Group by with left outer join, exclude nulls

I have an order table and a trip table that includes payment information. It is a many to many relationship - order can split across many trips and one trip could have payment info for a couple of orders and sometimes an order. There is no "Zero" record in the Trips table - so a left join with that value as the key will return a NULL record. I am using SQL 2012

 Order table
+----+----------+--------------+
| order_id | trip_nbr | veh_id |
+----+----------+--------------+
|  1       | 12       | 3   |
|  2       | 22       | 6   |
|  2       | 0        | 8   |
|  4       | 25       | 7   |
|  7       | 0        | 11  |
+----+----------+--------------+

Trips table
+----+------------+--------------+
| trip_nbr | payment     | veh_id |
+----+------------+--------------+
|  12      | 20.00       | 3   |
|  22      | 123.00      | 6   |
|  22      | 12.50       | 6   |
|  25      | 133.33      | 7   |
+----+------------+--------------+

Here is my query:

 select o.order_id, 
   t.trip_nbr,    
   sum(t.payment_amt)
 from orders o
 left outer join trips t
    on o.trip_nbr = t.trip_nbr
 group by  o.order_id, 
    t.trip_nbr

Results:

+----+----------+--------------+
| order_id | trip_nbr | sum     |
+----+----------+--------------+
|  1       | 12       | 20.00   |
|  2       | 22       | 135.50  |
|  2       | NULL     | NULL    |
|  4       | 25       | 133.33  |
|  7       | NULL     | NULL    |
+----+----------+--------------+

The problem is, I get lots of information from the orders table and just the payment info from the Trips table. So I don't want to exclude any order records (which would happen if I add the clause "WHERE t.trip_nbr is NOT NULL") - but I don't want to get 2 records in my grouping - one for t.trip_nbr is NULL and one where it finds a match.

Desired results:

+----+----------+--------------+
| order_id | trip_nbr | sum     |
+----+----------+--------------+
|  1       | 12       | 20.00   |
|  2       | 22       | 135.50  |
|  4       | 25       | 133.33  |
|  7       | NULL     | NULL    |
+----+----------+--------------+

I want the unmatched record order_id = 2 to be "summarized away" - but keep the lone record for order_id = 7. The reason is that this table is later join with another table and the extra NULL records are creating duplicates.

Upvotes: 2

Views: 1996

Answers (3)

M T Head
M T Head

Reputation: 1290

If you convert your nulls to zero, then sum "trip_nbr" and "sum" for a given order_id. Would this not solve your challenge?

create table #Order (Order_Id int , Trip_nbr int , Veh_id int ) 

Create Table #Trips (trip_nbr int , Payment Numeric(13,2), Veh_id int )


insert into #Order (Order_id, Trip_nbr, Veh_id)  values (1,12,3)
insert into #Order (Order_id, Trip_nbr, Veh_id)  values (2,22,6)
insert into #Order (Order_id, Trip_nbr, Veh_id)  values (2,0 ,8)
insert into #Order (Order_id, Trip_nbr, Veh_id)  values (4,25,7)
insert into #Order (Order_id, Trip_nbr, Veh_id)  values (7,0,11)

insert into #Trips (trip_nbr, Payment, Veh_id)   values (12, 20.00 , 3 )
insert into #Trips (trip_nbr, Payment, Veh_id)   values (22, 123.00,6 )
insert into #Trips (trip_nbr, Payment, Veh_id)   values (22, 12.50 , 6 )
insert into #Trips (trip_nbr, Payment, Veh_id)   values (25, 133.33 , 7 )

select Order_id, trip_nbr = sum(trip_nbr), Payment = sum(payment)
from
(
select o.order_id, 
        t.trip_nbr,    
    Payment = sum(t.Payment)
from #order o 
      left outer join #trips t on t.trip_nbr = o.trip_nbr
      -- left outer join #order o on t.trip_nbr = o.trip_nbr
group by  o.order_id,     t.trip_nbr
) x 
group by Order_id 
order by Order_id

Upvotes: -1

Giorgos Betsos
Giorgos Betsos

Reputation: 72165

You can use a window function like RANK to identify superfluous NULL- valued records and filter them out in an outer query:

select order_id, 
       trip_nbr,    
       total_payment
from (
  select o.order_id, 
         t.trip_nbr,    
         sum(t.payment) as total_payment,
         rank() over (partition by order_id 
                      order by case 
                                  when t.trip_nbr IS NULL then 2
                                  else 1
                               end) as rnk
  from orders o
  left outer join trips t
      on o.trip_nbr = t.trip_nbr
  group by  o.order_id, t.trip_nbr) as t
where t.rnk = 1

Upvotes: 1

Lamak
Lamak

Reputation: 70638

This should work:

WITH orders2 AS
(
    SELECT  *,
            N = SUM(CASE WHEN trip_nbr <> 0 THEN 1 ELSE 0 END) OVER(PARTITION BY order_id)
    FROM orders
)
SELECT  o.order_id,  
        t.trip_nbr,
        SUM(t.payment_amt)
FROM orders2 o
LEFT OUTER JOIN trips t
    ON o.trip_nbr = t.trip_nbr
WHERE N = 0 OR (N > 1 AND o.trp_nbr <> 0)
GROUP BY o.order_id,
         t.trip_nbr;

Upvotes: 1

Related Questions