Sven
Sven

Reputation: 7635

How to find out which rows match without having multiple results per row?

In SQLite I have a table that is created like this (simplified):

CREATE TABLE [entries] (
    [id]     INTEGER NOT NULL PRIMARY KEY,
    [local]  VARCHAR,
    [remote] VARCHAR,
    [value]  INTEGER
);

INSERT INTO entries (local, remote, value) VALUES ("a", "b", 1); 
INSERT INTO entries (local, remote, value) VALUES ("b", "a", -1);
INSERT INTO entries (local, remote, value) VALUES ("b", "a", -1);

INSERT INTO entries (local, remote, value) VALUES ("a", "d", 2); 
INSERT INTO entries (local, remote, value) VALUES ("a", "d", 2); 
INSERT INTO entries (local, remote, value) VALUES ("d", "a", -2);

Now I want to list combinations that match each other. Consider column local as local bank account and remote as remote bank account. Whenever a transaction is made from local to remote with value x and there exists a matching transaction which received the value x but as negative value (in relation to the first found), I want to have an output from SQLite.

My current approach is like this:

sqlite3 -header demo.db \
    "SELECT * FROM
        (SELECT * FROM entries) AS q1,
        (SELECT * FROM entries) AS q2

     WHERE q1.local = q2.remote AND
           q1.remote = q2.local AND
           q1.value = (q2.value * -1)"

But this returns:

id|local|remote|value|id|local|remote|value
1|a|b|1|2|b|a|-1
1|a|b|1|3|b|a|-1
2|b|a|-1|1|a|b|1
3|b|a|-1|1|a|b|1
4|a|d|2|6|d|a|-2
5|a|d|2|6|d|a|-2
6|d|a|-2|4|a|d|2
6|d|a|-2|5|a|d|2

What I wanted as result would be this:

id|local|remote|value|id|local|remote|value
1|a|b|1|2|b|a|-1
4|a|d|2|6|d|a|-2

Lines which don't have a matching partner should not be displayed - every line could only match to at most one other transaction. I tried with GROUP BY, but this didn't work with q1.id and q2.id as parameter.

Upvotes: 1

Views: 142

Answers (2)

Terje D.
Terje D.

Reputation: 6315

As there might be several rows with the same values for local/remote/value, the rank of the rows within each such set has to be calculated in order to pair the rows correctly.

This is easier using other databases than sqlite (using row numbering primitives, CTEs etc), but in sqlite the correct result may be obtained like this:

SELECT t1.id, t1.local, t1.remote, t1.value,
t2.id, t2.local, t2.remote, t2.value FROM
(SELECT q1.*, count(q1b.id) AS rank
FROM entries q1 LEFT JOIN entries q1b
ON q1.local = q1b.local AND q1.remote = q1b.remote
AND q1.value = q1b.value AND q1.id >= q1b.id
GROUP BY q1.id) AS t1,
(SELECT q2.*, count(q2b.id) AS rank
FROM entries q2 LEFT JOIN entries q2b
ON q2.local = q2b.local AND q2.remote = q2b.remote
AND q2.value = q2b.value AND q2.id >= q2b.id
GROUP BY q2.id) AS t2
WHERE t1.local = t2.remote AND t1.remote = t2.local
AND t1.value = - t2.value
AND t1.id < t2.id AND t1.rank = t2.rank

See http://sqlfiddle.com/#!5/c684e/66/0

Upvotes: 2

CL.
CL.

Reputation: 180070

SELECT *
FROM (SELECT MIN(id) AS id, local, remote, value
      FROM entries
      GROUP BY local, remote, value
     ) AS e1,
     (SELECT MIN(id) AS id, local, remote, value
      FROM entries
      GROUP BY local, remote, value
     ) AS e2
WHERE e1.local = e2.remote
  AND e1.remote = e2.local
  AND e1.value = -e2.value
  AND e1.id < e2.id

Upvotes: 0

Related Questions