st2 tas
st2 tas

Reputation: 91

How to join two tables with multiple matching records?

I have two tables that I want to join when the ids and dates align. I was thinking something like:

SELECT * from t1 inner join t2 on t1.id = t2.id and t1.date >= t2.date;

But it's possible for multiple entries for each id to exist in both tables and I want to make sure the best records match up -- so if table 2 has entries for 4/10, 4/15, and 4/20 and table 1 has records on 3/15, 4/11, 4/16, and 4/21, then the records would match as follows:

id | t1.date | t2.date
1  |   3/15  | -----  (No match, wouldn't be returned in the results)
1  |   4/11  | 4/10
1  |   4/16  | 4/15
1  |   4/21  | 4/20

Upvotes: 1

Views: 2368

Answers (1)

Leo C
Leo C

Reputation: 22439

One approach would be to left join the tables on id and t1.date >= t2.date, use Window function row_number() over suitable partitions of ordered (t1.date - t2.date), and pick the least date difference which is the first partitioned row:

CREATE TABLE t1 ("id" integer, "date" date, "value" integer);

CREATE TABLE t2 ("id" integer, "date" date);

INSERT INTO t1 VALUES
  (1, '2018-03-15', 10),
  (1, '2018-04-11', 20),
  (1, '2018-04-11', 30),
  (1, '2018-04-16', 30),
  (1, '2018-04-21', 20);

INSERT INTO t2 VALUES
  (1, '2018-04-10'),
  (1, '2018-04-15'),
  (1, '2018-04-20');

WITH q AS (
  SELECT
    t1."id", t1."date" t1_date, t2."date" t2_date, t1."value", row_number() OVER
      (PARTITION BY t1."id", t1."date", t1."value" ORDER BY (t1."date" - t2."date")) row_num
  FROM
    t1 LEFT JOIN t2 ON t1."id" = t2."id" AND t1."date" >= t2."date"
)
SELECT "id", "t1_date", "t2_date", "value" FROM q
WHERE row_num = 1;

--  id |  t1_date   |  t2_date   | value 
-- ----+------------+------------+-------
--   1 | 2018-03-15 |            |    10
--   1 | 2018-04-11 | 2018-04-10 |    20
--   1 | 2018-04-11 | 2018-04-10 |    30
--   1 | 2018-04-16 | 2018-04-15 |    30
--   1 | 2018-04-21 | 2018-04-20 |    20
-- (5 rows)

Note that if you want the result dataset to exclude rows in which t2.date has no match, simply replace LEFT JOIN with INNER JOIN.

Upvotes: 2

Related Questions