Reputation: 18898

How do I select rows within a group of rows whose Amounts do not cancel each other out?

Given a number of rows similar to the ones in the following temp table:

TransactionId | AccountsDocumentLineId | Amount
-------------   ----------------------   ------
52345           12345                    -15.79
52346           12345                    15.79
52347           12345                    -15.79
52348           22222                    -6.34
52349           22222                    6.34
52350           22222                    6.34
52351           22222                    -6.34
52352           22222                    -8.76
52353           22222                    10.49

how do I ensure any rows that cancel each other out (sum to zero) with the same AccountsDocumentLineId are removed and only ones that have no more matching rows that they can be paired with are left?

So, in the above example, the row with TransactionId = 52345 (or 52347) would be left (since the -15.79 and 15.79 would cancel each other out).

Similarly, the rows with TransactionId = 52352 and 52353 would remain (since the two pairs of -6.34 and 6.34 would cancel each other out).

So, we would get the following end result:

TransactionId | AccountsDocumentLineId | Amount
-------------   ----------------------   ------
52347           12345                    -15.79
52352           22222                    -8.76
52353           22222                    10.49

Note: I've stripped out all the unnecessary detail (more columns and plenty more rows) to simply show the issue at hand.

Some options I've played with are:-

Summing up all the Amounts within a group of AccountDocumentLineIds and seeing which row matches the balance but that would only deal with the first example (where there's only one row remaining) and not the second one where there are two rows that need to remain (so no easy of splitting one value to give the two rows)
Going through each entry within a group of AccountDocumentLineIds and if a match is found delete both counterparts; I think this one would work although not sure how to go about doing it in SQL?

SQLFiddle Demo

UPDATE: Added complete answer based on Bulat's answer.

Upvotes: 2

Answers (3)

Appulus

Reputation: 18898

In response to Bulat's request for my wrapped code, I'm submitting it here (and for the benefit of others):

DECLARE 
      @SequenceStart INT = 0,
      @SequenceEnd INT = 10 -- Just to be on the safe side, though unlikely for there to be 10 matching pairs for the same AccountsDocumentLineId

STARTLOOP:
    SET
        @SequenceStart = @SequenceStart+1   
    ;
    WITH Matches AS
    (
        SELECT
            L1 = L1.TransactionId, L2 = L2.TransactionId,
            A1 = L1.Amount, A2 = L2.Amount, L1.AccountsDocumentLineId
        FROM
            #RemainingLedgerEntries L1
            INNER JOIN #RemainingLedgerEntries L2
                ON L1.AccountsDocumentLineId = L2.AccountsDocumentLineId
                AND L1.Amount*-1 = L2.Amount
                AND L1.TransactionId < L2.TransactionId
    )
    DELETE FROM
        #RemainingLedgerEntries
    WHERE
        EXISTS
        (
            SELECT
                *
            FROM
                Matches M1
            WHERE
                #RemainingLedgerEntries.TransactionId IN (M1.L1, M1.L2)
            AND NOT EXISTS
            (
                SELECT
                    *
                FROM
                    Matches M2
                WHERE
                    ABS(M1.A1) = ABS(M2.A1) 
                AND M1.AccountsDocumentLineId = M2.AccountsDocumentLineId
                AND
                (
                    (M2.L1 > M1.L1 AND M2.L2 <= M1.L2)  
                    OR (M2.L2 < M1.L2 AND (M2.L1 >= M1.L1))
                    OR M2.L1 = M1.L2
                ) 
            )
        )

    IF @SequenceStart >= @SequenceEnd
        GOTO ENDTASK

    GOTO STARTLOOP

ENDTASK:

Upvotes: 1

Bulat

Reputation: 6979

You can run this code many times (for example while it affects more then 0 records):

WITH Matches AS
(
SELECT t1 = t1. TransactionId, t2 = t2.TransactionId, 
  a1 = t1.Amount, a2 = t2.Amount, t1.AccountsDocumentLineId
FROM  Transactions t1 
    INNER JOIN Transactions t2 
        ON t1.AccountsDocumentLineId = t2.AccountsDocumentLineId
           AND t1.Amount = -t2.Amount 
           AND t1.TransactionId < t2.TransactionId 
)
DELETE FROM Transactions
WHERE EXISTS (
SELECT * FROM Matches m1
WHERE Transactions.TransactionId IN(m1.t1,m1.t2)
  AND NOT EXISTS
 (SELECT * FROM Matches m2
  WHERE  abs(m1.a1) = abs(m2.a1) 
     AND m1.AccountsDocumentLineId = m2.AccountsDocumentLineId
     AND (
         (m2.t1 > m1.t1 AND m2.t2 <= m1.t2)  
      OR (m2.t2 < m1.t2 AND (m2.t1 >= m1.t1))
      OR m2.t1 = m1.t2
     ) 
 )
);

Upvotes: 2

Hugh Jones

Reputation: 2694

The first step is straightforward

DELETE FROM MYTABLE WHERE AccountsDocumentLineId IN
  (SELECT AccountsDocumentLineId from MYTABLE 
   GROUP BY AccountsDocumentLineId 
   HAVING SUM(Amount) <> 0)

But you may find you have some fettling to do to cope with precision errors.

That should leave you with only unbalanced accounts. From there you will have to (I think) create a stored procedure to 'allocate' debits to credits. From your data sample it seems likely that the transaction id is nicely ordered, so that should help.

Upvotes: 1

How do I select rows within a group of rows whose Amounts do not cancel each other out?

Answers (3)

Related Questions