Peter Johnson
Peter Johnson

Reputation: 23

SQL non-duplicate join to table which uses nulls as wildcards

I'm querying tables which use a null value as wildcard, this can be overridden if another row contains a non-wildcard value. E.g.:

Table 1: Customers - containing customers and associated products:

create table #cust (
    id int not null,
    product varchar(3) not null
)

insert into #cust
values (1,'ABC'),(1,'DEF'),(1,'GHI')

============
id | product
------------
1  | ABC
1  | DEF
1  | GHI
============

Table 2: Rebates - contains the rebate customers receive for each product. A null product field specifies a default rebate, to be applied to all products except any which are overtly specified:

create table #rebate (
    cust_id int not null,
    product varchar(3) NULL,
    rebate numeric(5,2) not null
)

insert into #rebate
values (1,null,0.25),(1,'ABC',0.05)

==========================
cust_id | product | rebate
--------------------------
1       | null    | 0.25
1       | ABC     | 0.05
==========================

So this customer a receives a 25% rebate for all products except 'ABC' for which they receive 5%.

I was trying to code it in a simple way like this:

select * 
from #cust c
left join #rebate r
    on c.id = r.cust_id
        and c.product = isnull(r.product, c.product)

However, the result of this is duplication on the ABC product (matching the r.product is null, and the r.product = c.product parts of the join):

======================================
id  product cust_id  product    rebate
--------------------------------------
1   ABC     1        NULL       0.25    -- duplication
1   DEF     1        NULL       0.25
1   GHI     1        NULL       0.25
1   ABC     1        ABC        0.05    -- duplication. This is the row needed
=======================================

Any suggestions?

Upvotes: 2

Views: 92

Answers (2)

Thorsten Kettner
Thorsten Kettner

Reputation: 94884

You want ranking. There are general rebates and particular rebates, and you prefer the particular over the general one, if it exists. There are several ways to approach this and they mostly boil down to selecting the candidates (which is one or two rows per customer) and then pick the better one. Here is a solution with a lateral join (called OUTER APPLY in SQL Server):

select * 
from #cust c
outer apply
(
  select top(1) * -- TOP(1) picks the row brought up first by ORDER BY
  from #rebate r
  where r.cust_id = c.id
  and (r.product = c.product or r.product is null)
  order by r.product desc -- DESC orders NULLs last in SQL Server
) rr;

SQL Server lacks a NULLS LAST clause for ORDER BY, but you get nulls last by sorting decending.

Upvotes: 0

Marc Guillot
Marc Guillot

Reputation: 6455

You can join join the rebates conditionally. You first join the specific rebates, and then join the default rebates for the products where no specific rebate has been found.

select c.*, coalesce(specific_r.rebate, default_r.rebate) as rebate
from #cust c
     left join #rebate specific_r on 
               c.id = specific_r.cust_id and c.product = specific_r.product
     left join #rebate default_r on 
               c.id = default_r.cust_id and default_r.product is null and 
               specific_r.cust_id is null  -- No specific rebate found

This gives you the desired result :

==================
id  product rebate
------------------
1   ABC     0.05
1   DEF     0.25
1   GHI     0.25
==================

Upvotes: 1

Related Questions