Chuck

Reputation: 1293

In SQL, how to remove duplicate values across columns (not rows, as is usually done)

I have a production case, for a supply chain. We have devices that are moved around in warehouses, and I need to find the previous warehouse locations.

I have a table like this:

+--------+------------+--------+--------+--------+
| device | current_WH | prev_1 | prev_2 | prev_3 |
+--------+------------+--------+--------+--------+
|      1 | AB         | KK     | KK     | KK     |
|      2 | DE         | DE     | DE     | NQ     |
|      3 | FF         | MM     | ST     | ST     |
+--------+------------+--------+--------+--------+

I need to find the distinct values of current_WH and the "prev" columns. So I'm not flattening rows, but narrowing columns. I need to get this:

+--------+------------+--------+--------+--------+
| device | current_WH | prev_1 | prev_2 | prev_3 |
+--------+------------+--------+--------+--------+
|      1 | AB         | KK     | blank  | blank  |
|      2 | DE         | NQ     | blank  | blank  |
|      3 | FF         | MM     | ST     | blank  |
+--------+------------+--------+--------+--------+

I'll figure out nulls or blanks later. But for now I need one row for each device that shows the current WH and its previous locations. There could be any number of previous locations, not always the same.

If I do "distinct" that flattens rows. Doing a distinct and group by doesn't achieve the requirement.

Any help is appreciated. Thanks!

Upvotes: 1

Views: 95

Answers (1)

D-Shih

Reputation: 46219

You need to unpivot the columns into rows, because that makes it easier to compare each value with the values that come before it, and then pivot the rows back to restore the original schema.

  1. Unpivot the columns into rows and add a new grp column that records which column each value came from; it is what lets you rebuild the expected result.
  2. Use the LAG function to get the previous value so it can be compared with the current_WH value.
  3. Use SUM with CASE WHEN as a window function to keep a running count of the rows where the previous value equals the current_WH value.
  4. If the running count is greater than 0, the name was repeated.

It looks like this:

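-- Step 1: unpivot each location column into its own row; grp records the column order.
with cteUnion as (
    SELECT device, current_WH, 0 grp
    FROM T
    UNION ALL
    SELECT device, prev_1, 1 grp
    FROM T
    UNION ALL
    SELECT device, prev_2, 2 grp
    FROM T
    UNION ALL
    SELECT device, prev_3, 3 grp
    FROM T
), cte1 as (
    -- Step 2: previousVal is non-NULL only when the same device already had this value in an earlier column.
    -- The partition includes device so that values are never compared across devices.
    SELECT *,
           LAG(current_WH) over(partition by device, current_WH order by grp) previousVal
    FROM cteUnion
), cteResult as (
    -- Steps 3 and 4: a running count above 0 means the value is a repeat for this device.
    -- Counting per device and value keeps one repeat from blanking a later, different value.
    SELECT *,
           (CASE WHEN sum(CASE WHEN previousVal = current_WH then 1 else 0 end) over(partition by device, current_WH order by grp) > 0 THEN 'blank' else current_WH end) val
    FROM cte1
)
-- Pivot the rows back into the original column layout.
select device,
       MAX(CASE WHEN grp = 0 then val end) current_WH,
       MAX(CASE WHEN grp = 1 then val end) prev_1,
       MAX(CASE WHEN grp = 2 then val end) prev_2,
       MAX(CASE WHEN grp = 3 then val end) prev_3
from cteResult
GROUP BY device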


NOTE

The grp numbers depend on the order of your columns.
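
The query above flags repeats where they sit and does not move values into earlier columns, whereas the expected output in the question has device 2's NQ in prev_1. As a rough sketch of one way to close those gaps (it is not part of the original answer), the same unpivot and pivot idea can be combined with ROW_NUMBER: keep only the first occurrence of each warehouse per device, renumber the survivors, then pivot on the new position. It is wrapped in Python's sqlite3 module only so the example is self-contained and runnable. This assumes SQLite 3.25 or later for window functions, and the CTE names (unpivoted, firsts, packed) and the dedupe_locations helper are made up for illustration. Repeats come back as NULL rather than 'blank'.

```python
import sqlite3

# Hypothetical sketch: table T and its columns come from the question;
# the CTE names and the helper function are illustrative only.
QUERY = """
WITH unpivoted AS (
    -- one row per (device, column); grp records the original column order
    SELECT device, current_WH AS wh, 0 AS grp FROM T
    UNION ALL SELECT device, prev_1, 1 FROM T
    UNION ALL SELECT device, prev_2, 2 FROM T
    UNION ALL SELECT device, prev_3, 3 FROM T
), firsts AS (
    -- seen = 1 marks the first time a device shows a given warehouse
    SELECT device, wh, grp,
           ROW_NUMBER() OVER (PARTITION BY device, wh ORDER BY grp) AS seen
    FROM unpivoted
), packed AS (
    -- renumber the surviving values 0, 1, 2, ... so they close up to the left
    SELECT device, wh,
           ROW_NUMBER() OVER (PARTITION BY device ORDER BY grp) - 1 AS pos
    FROM firsts
    WHERE seen = 1
)
SELECT device,
       MAX(CASE WHEN pos = 0 THEN wh END) AS current_WH,
       MAX(CASE WHEN pos = 1 THEN wh END) AS prev_1,
       MAX(CASE WHEN pos = 2 THEN wh END) AS prev_2,
       MAX(CASE WHEN pos = 3 THEN wh END) AS prev_3
FROM packed
GROUP BY device
ORDER BY device
"""


def dedupe_locations(rows):
    """Load rows into an in-memory table T and return the de-duplicated rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE T (device INTEGER, current_WH TEXT, "
        "prev_1 TEXT, prev_2 TEXT, prev_3 TEXT)"
    )
    conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


# Sample data copied from the question.
sample = [
    (1, "AB", "KK", "KK", "KK"),
    (2, "DE", "DE", "DE", "NQ"),
    (3, "FF", "MM", "ST", "ST"),
]

for row in dedupe_locations(sample):
    print(row)
```

For the sample table this prints (1, 'AB', 'KK', None, None), (2, 'DE', 'NQ', None, None) and (3, 'FF', 'MM', 'ST', None). If you want a literal marker instead of NULL, wrapping each pivot column in COALESCE(..., 'blank') should do it.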

Upvotes: 1
