Matthew McNey
Matthew McNey

Reputation: 23

Can you nest a number of conditions in a partition by statement successfully?

I built this sample table in Fiddle: https://www.db-fiddle.com/f/pu1PkjzKCM9VRciznkJc4L/7

I am working on relabeling territories for account numbers and serial numbers. You can see the table schema and the query I have tried to change the data and add a new column that would have the correct territories. The problem is my result that you'll see at the end. I want to find a way to first go through serial_number, I have determined serial_numbers are the best way to track territory, then acct_num when serial_numbers are null to add proper territory names.

However, while my program largely works for this purpose. When there is an acct_num AND no serial_number it won't just throw in the last "proper" territory it found. I need the program to do that. I was thinking maybe a nested case statement? I have tried doing this in the preexisting CASE and PARTITION BY statements but I have been unsuccessful it keeps throwing a syntax error.

I was thinking... (PARTITION BY WHEN serial_number='NULL' THEN acct_num ELSE serial_number), but this always throws an error.

Any help is appreciated!

SQL SCHEMA

'''

CREATE TABLE t (
"row" INTEGER,
"acct_num" TEXT,
"serial_number" TEXT,
"site_address" TEXT,
"city" TEXT,
"territory_name" TEXT
);

INSERT INTO t
("row", "acct_num", "serial_number", "site_address", "city", "territory_name")
VALUES
('1', '049403', 'TH-530', '344 Clinton Str', 'Metropolis', 'Central'),
('2', '049403', 'TH-530', '344 Clinton Str', 'Metropolis', 'Central'),
('3', '049206', 'GO-650', '1007 Mountain Dr', 'Gotham City', 'Gotham City'),
('4', '049206', 'GO-650', '1007 Mountain Dr', 'Gotham City', 'Gotham City'),
('5', '049206', 'GO-650', 'NULL', 'NULL', 'N/A'),
('6', '049206', 'TJO-560', 'NULL', 'NULL', 'NULL'),
('7', '049291', 'SM-708', '1938 Sullivan Pl', 'Smallville', 'No Territory'),
('8', '049293', 'TH-533', '700 Hamilton Str', 'Central City', 'Central'),
('9', '049396', 'AL-670', '800 Ellsmore Way', 'Alberty Township', 'South'),
('10', '049396', 'KILO-680', '600 Nukem Drive', 'Duke City', 'Southeast'),
('11', '049396', 'JALO-605', '5806 London Way', 'Hampshire', 'Northeast'),
('12', '049396', 'NULL', 'NULL', 'NULL', 'No Territory');

'''

SQL QUERY

'''

 select row, 
   acct_num, 
   serial_number,
   site_address,
 MAX(
 CASE
 WHEN territory_name <> 'N/A' AND territory_name <> 'No Territory' 
 THEN territory_name
 WHEN serial_number = 'NULL' 
 THEN territory_name
 END) 
 over (PARTITION BY serial_number) as imputed_place
 from t
 order by row

'''

SQL Query Result

| row | acct_num | serial_number  | site_address    | imputed_place
| ----|----------|----------------|-----------------|---------------
| 1   | 049403   | TH-530         | 344 Clinton Str | Central
| 2   | 049403   | TH-530         | 344 Clinton Str | Central
| 3   | 049206   | GO-650         | 1007 Mountain Dr| Gotham City
| 4   | 049206   | GO-650         | 1007 Mountain Dr| Gotham City
|  5  | 049206   | GO-650         |  NULL           | Gotham City
|  6  | 049206   |  TJO-560       | NULL            |   NULL
|  7  | 049291   |  SM-708        | 1938 Sullivan Pl|   null
|  8  | 049293   |  TH-533        | 700 Hamilton Str| Central
|  9  | 049396   |  AL-670        | 800 Ellsmore Way|   South
| 10  | 049396   |  KILO-680      | 600 Nukem Drive |   Southeast
| 11  | 049396   |  JALO-605      | 5806 London Way |   Northeast
| 12  | 049396   |  NULL          | NULL            |   No Territory <- should be Northeast!!!

'''

DB Fiddle – Crafted with ♥ by Status200 in the United Kingdom. Terms of Use • Privacy / Cookie Policy • Status200 Ltd © 2018

Upvotes: 0

Views: 286

Answers (1)

patrick_at_snowflake
patrick_at_snowflake

Reputation: 453

I tried to recreate your result (row 12 included), is the column row something that is part of the solution / data?

The approach I took is to create a mapping table of valid values by serial number, therefore key+value must be unique or we explode the results from 12 rows to... a lot more! That works fine for Serial number, when I look at account number that is not the case. There is no unique mapping, what is the rule that determines that line 12 should be 'Northeast'?

If it is based on row then the outcome is easy and I'll provide the solution below,

with map_by_serial_no as (
select distinct "serial_number"
  , "territory_name"
  from t
  where "serial_number" <> 'NULL'
    and "territory_name" <> 'N/A'
  and "territory_name" <> 'NULL'
) 
select t."row"
, t."acct_num"
, t."serial_number"
, t."site_address"
, t."city"
, t."territory_name"
, case when t."serial_number" = 'NULL' then lag(t."territory_name") over (order by "row")
   when t."territory_name" in ('NULL', 'No Territory', 'N/A') then u."territory_name"
   else t."territory_name" end as imputed_territory
from t
left join map_by_serial_no u
on t."serial_number" = u."serial_number"
order by "row";

Note however that NULL should be the absence of a value, whereas in this sample it is a quoted string.

Upvotes: 1

Related Questions