Gdaimon
Gdaimon

Reputation: 271

How to eliminate duplicates without losing the order

I have a table which dynamically, I associate taxes with their concept, but the query can return several concepts and repeated taxes, but in the end I must delete the records respected

WITH test_data AS
(
  --PRODUCT | VALUE
  --:------ | ----:
  SELECT 125  AS ord, 'Ties' AS product, 'P' AS concept FROM dual UNION ALL
  SELECT 127  AS ord, 'tax',     'P'    FROM dual UNION ALL
  SELECT 345  AS ord, 'Stocks',  'T'    FROM dual UNION ALL
  SELECT 346  AS ord, 'tax',     'P'    FROM dual UNION ALL
  SELECT 58   AS ord, 'Shirts',  'P'    FROM dual UNION ALL
  SELECT 59   AS ord, 'tax',     'P'    FROM dual UNION ALL  
  SELECT 723  AS ord, 'Shirts',  'P'    FROM dual UNION ALL
  SELECT 724  AS ord, 'tax',     'P'    FROM dual UNION ALL
  SELECT 95   AS ord, 'Shirts',  'P'    FROM dual UNION ALL
  SELECT 96   AS ord, 'tax',     'P'    FROM dual UNION ALL
  SELECT 1102 AS ord, 'Stocks',  'T'    FROM dual UNION ALL
  SELECT 1103 AS ord, 'tax',     'T'    FROM dual UNION ALL
  SELECT 366  AS ord, 'Stocks',  'T'    FROM dual UNION ALL
  SELECT 367  AS ord, 'tax',     'T'    FROM dual UNION ALL
  SELECT 1555 AS ord, 'Pants',   'T'    FROM dual UNION ALL
  SELECT 1556 AS ord, 'tax',     'T'    FROM dual UNION ALL
  SELECT 1787 AS ord, 'Stocks',  'T'    FROM dual UNION ALL
  SELECT 1788 AS ord, 'tax',     'T'    FROM dual UNION ALL
  SELECT 197  AS ord, 'Shirts',  'P'    FROM dual UNION ALL
  SELECT 198  AS ord, 'tax',     'P'    FROM dual
), 
test_data_extended AS
(
  SELECT product, 
         concept, 
         LAG(product, 1) OVER (ORDER BY ord) AS pre_product,ord
  FROM test_data
), 
test_data_new AS
(
  SELECT product AS old_product, 
         concept,
         ord, 
         CASE WHEN product = 'tax' THEN 
              'tax (' || pre_product || ')' 
         ELSE product 
         END AS new_product
  FROM test_data_extended
), 
new_data AS
(
  SELECT UNIQUE ord, 
         new_product, 
         concept
  FROM test_data_new
order by concept
)SELECT * FROM   new_data

I have found a possible solution, where I can eliminate the duplicates but lose the order, the products should be left with their tax in the next row:

test_data_new AS
(
  SELECT product AS old_product, 
         concept,
         ord, 
         CASE WHEN product = 'tax' THEN 
              'tax (' || pre_product || ')' 
         ELSE product 
         END AS new_product,
         CASE
         CASE WHEN product = 'tax' THEN 
              1
         ELSE 0 
         END AS id_d
  FROM test_data_extended
), 
new_data AS
(
  SELECT UNIQUE ord, 
         new_product, 
         concept,
         id_d
  FROM test_data_new
  order by concept,id_d
)SELECT * FROM   new_data

The expected result would be something similar to this:

 NEW_PRODUCT 
 -----------:
 Shirts      
 tax (Shirts)
 Ties        
 tax (Ties)  
 Pants       
 tax (Pants) 
 Stocks      
 tax (Stocks)

dbfiddle

Upvotes: 0

Views: 48

Answers (2)

Gdaimon
Gdaimon

Reputation: 271

This is a possible solution, I tried it in several ways and I could achieve it, I share it in case someone could need it

WITH test_data AS
(
...
), 
test_data_new AS
(
  SELECT product AS old_product, 
         concept,
         ord, 
         CASE WHEN product = 'tax' THEN 
              'tax (' || pre_product || ')' 
         ELSE product 
         END AS new_product
  FROM test_data_extended
), 
new_data AS
(
  SELECT UNIQUE new_product, 
         concept,
         ord
  FROM test_data_new
  ORDER BY concept,ord
),
order_data AS 
(
  SELECT new_product, 
         ord,
         concept
    FROM new_data 
ORDER BY 2
),
filter_data AS 
(
  SELECT new_product, 
         MIN(ord) ord,
         concept
    FROM order_data 
GROUP BY new_product, concept
ORDER BY 2
)
SELECT new_product
  FROM filter_data 

dbfiddle

Upvotes: 0

Alex Poole
Alex Poole

Reputation: 191235

Based on your data and some assumptions - particularly that the ord values can't be interleaved, even where you have gaps between an item and its (hopefully) related tax - you could do:

WITH test_data AS
(
...
), 
test_data_new AS
(
  SELECT product AS old_product, 
         concept,
         ord, 
         CASE WHEN product = 'tax' THEN 
              'tax (' || LAG(product, 1) OVER (ORDER BY ord) || ')' 
         ELSE product 
         END AS new_product,
         CASE WHEN product = 'tax' THEN 
              LAG(ord, 1) OVER (ORDER BY ord) 
         ELSE ord
         END AS new_ord,
         CASE WHEN product = 'tax' THEN 
              1
         ELSE 0 
         END AS id_d
  FROM test_data
)
SELECT new_product
FROM test_data_new
GROUP BY new_product, id_d
ORDER BY min(ord), id_d;

NEW_PRODUCT 
------------
Shirts
tax (Shirts)
Ties
tax (Ties)
Stocks
tax (Stocks)
Pants
tax (Pants)

8 rows selected. 

I've taken out two levels of CTE you didn't really need, but the main change is to add another lag() to tie a tax item to the same ord as the non-tax item that precedes it.

This all seems a bit fragile still, but works with your data at least.

db<>fiddle

Upvotes: 2

Related Questions