JustBeingHelpful
JustBeingHelpful

Reputation: 18980

Query to filter out transactions with exact offset amount

The query below is a rough draft containing the relevant columns in the query I'd like to write, so don't look at it as a solution. Use it as a guide for the table and column names. I'm trying to remove any transactions that offset each other for the same ORDER_ID and ACCOUNT_ID. I don't think I can do an aggregate using SUM, since that would add together all TX_AMOUNT values for the grouping. See TX_ID 6 and 7. Those both need to show in the result set. How can I output the TX_ID from the table below, and filter out anything that doesn't say "SHOW THIS"?

SELECT 
T1.ACCOUNT_ID
T1.ORDER_ID,
T1.TX_ID
FROM TRANSACTION AS T1
WHERE
T1.ACCOUNT_ID IN (
SELECT T2.ACCOUNT_ID
FROM TRANSACTION AS T2
GROUP BY T2.ACCOUNT_ID, T2.ORDER_ID
HAVING SUM(T2.TX_AMOUNT) != 0 AND T2.ORDER_ID IS NOT NULL 
)
AND T1.ORDER_ID IN (
SELECT T3.ORDER_ID
FROM TRANSACTION AS T3
GROUP BY T3.ACCOUNT_ID, T3.ORDER_ID
HAVING SUM(T3.TX_AMOUNT) != 0 AND T3.ORDER_ID IS NOT NULL 
)

TX_ID   ORDER_ID ACCOUNT_ID  TX_AMOUNT
------------------------------------
1       A1       200         -3.00  <--------- DON'T SHOW THIS; OFFSET BY #2
2       A1       200         3.00   <--------- DON'T SHOW THIS; OFFSET BY #1
3       A1       200         3.00   <--------- SHOW THIS
4       A2       999         -10.01 <--------- DON'T SHOW THIS; OFFSET BY #5
5       A2       999         10.01  <--------- DON'T SHOW THIS; OFFSET BY #4
6       A2       999         10.01  <--------- SHOW THIS
7       A2       999         5.02   <--------- SHOW THIS

Upvotes: 1

Views: 324

Answers (1)

xQbert
xQbert

Reputation: 35333

VERSION 2: MUCH cleaner... Working DEMO with comments (you may need to click Run it!) to see desired results (or maybe I have a caching problem)

  • The CTE (Common Table Expression) is just your data setup that you show
  • The CTE2 simply adds a row number partitioned by the tx_amount, order_Id, account_Id. Key here is the fact that we get a row_number for each order_ID, Account_ID and tax_Amount restarting when those 3 values change but incrementing when they stay the same. This later allows us to exclude like matches on opposite tx_amounts without eliminating those when one side has more than the other (your $3.00 example)
  • The Select simply pulls in records from the base set where the opposite value with the same row number order_id and account exist. if one doesn't then we know it's a value w/o a matching opposite tx_Amount and thus one we want to keep.
  • Ask if you have questions! Happy to help if something's not clear
  • Lastly if we change CTE2 so the rowNumber() is ordered by tx_ID asc instead of tx_Amount desc (which really serves no purpose other than I needed an order by on the row_nubmer), then we'd get rid of the lowest number matches first consistently following the FIFO approach)

.

With CTE (TX_ID,   ORDER_ID, ACCOUNT_ID,  TX_AMOUNT) as (

SELECT 1,       'A1',       200,         -3.00  UNION ALL
SELECT 2,       'A1',       200,         3.00   UNION ALL
SELECT 3,       'A1',       200,         3.00   UNION ALL
SELECT 4,       'A2',       999,         -10.01 UNION ALL
SELECT 5,       'A2',       999,         10.01  UNION ALL
SELECT 6,       'A2',       999,         10.01  UNION ALL
SELECT 7,       'A2',       999,         5.02 ),

cte2 as (
SELECT A.*, row_number() over (partition by order_ID, Account_ID, Tx_Amount order by tx_Amount desc) RN
FROM cte A)

SELECT * 
FROM cte2 A
WHERE NOT exists (SELECT * 
                  FROM cte2 B
                  WHERE A.Order_ID = B.Order_ID
                    and A.Account_ID = B.Account_Id
                    and A.tx_Amount*-1 = B.tx_Amount
                    and A.RN = B.RN)

Giving us: (note we should eliminate the RN by changing * to desired fields but i'm too lazy at this point)

+----+-------+----------+------------+-----------+----+
|    | TX_ID | ORDER_ID | ACCOUNT_ID | TX_AMOUNT | RN |
+----+-------+----------+------------+-----------+----+
|  1 |     2 | A1       |        200 |      3,00 |  2 |
|  2 |     7 | A2       |        999 |      5,02 |  1 |
|  3 |     5 | A2       |        999 |     10,01 |  2 |
+----+-------+----------+------------+-----------+----+

VERSION 1: (Scratch this ugly; I mean seriously; who thinks like this?) I do...

  1. Do something. (version 1)
  2. Laugh(an important and overlooked step)
  3. Then do it right (See version 2 above)
  4. Now make it better. (indexes, tweak joins spelling, layout comments, use the right order by on the row_number logic in CTE2)

DEMO

With CTE (TX_ID,   ORDER_ID, ACCOUNT_ID,  TX_AMOUNT) as (

SELECT 1,       'A1',       200,         -3.00  UNION ALL
SELECT 2,       'A1',       200,         3.00   UNION ALL
SELECT 3,       'A1',       200,         3.00   UNION ALL
SELECT 4,       'A2',       999,         -10.01 UNION ALL
SELECT 5,       'A2',       999,         10.01  UNION ALL
SELECT 6,       'A2',       999,         10.01  UNION ALL
SELECT 7,       'A2',       999,         5.02 ),
cte2 as (
SELECT * 
FROM (Select A.Tx_Id aTx_ID
           , A.order_ID as AOrderID
           , A.Account_ID as AAccount_ID
           , A.tx_Amount as ATx_Amount
           , Row_number() over (partition by Order_ID, Account_ID, tx_Amount order by tx_Amount asc) ARN

      from cte a 
      WHERE tx_Amount <=0) A
FULL OUTER JOIN (SELECT b.tx_Id
                      , b.order_Id
                      , b.Account_Id
                      , b.tx_Amount
                      ,  Row_number() over (partition by Order_ID, Account_ID, tx_Amount order by tx_Amount desc) BRN 
                 FROM  CTE B 
                 WHERE  tx_Amount>0) B
  on A.AOrderID = B.Order_ID
 and A.AAccount_ID = B.Account_ID
 and A.ATx_Amount*-1 = B.tx_Amount
 and A.ARN=B.BRN
Where a.Atx_ID is null
  or B.tx_ID is null)

  Select ATX_ID, AORDERID, AAccount_ID, ATX_AMOUNT from cte2 where ATX_ID is not null
  UNION ALL
  Select TX_ID, ORDER_ID, Account_ID, TX_AMOUNT from cte2 where TX_ID is not null

Upvotes: 2

Related Questions