Reputation: 91
I have two tables that I want to join when the ids and dates align. I was thinking something like:
SELECT * from t1 inner join t2 on t1.id = t2.id and t1.date >= t2.date;
But it's possible for multiple entries for each id to exist in both tables and I want to make sure the best records match up -- so if table 2 has entries for 4/10, 4/15, and 4/20 and table 1 has records on 3/15, 4/11, 4/16, and 4/21, then the records would match as follows:
id | t1.date | t2.date
1 | 3/15 | ----- (No match, wouldn't be returned in the results)
1 | 4/11 | 4/10
1 | 4/16 | 4/15
1 | 4/21 | 4/20
Upvotes: 1
Views: 2368
Reputation: 22439
One approach would be to left join
the tables on id
and t1.date >= t2.date
, use Window function row_number()
over suitable partitions of ordered (t1.date - t2.date), and pick the least date difference which is the first partitioned row:
CREATE TABLE t1 ("id" integer, "date" date, "value" integer);
CREATE TABLE t2 ("id" integer, "date" date);
INSERT INTO t1 VALUES
(1, '2018-03-15', 10),
(1, '2018-04-11', 20),
(1, '2018-04-11', 30),
(1, '2018-04-16', 30),
(1, '2018-04-21', 20);
INSERT INTO t2 VALUES
(1, '2018-04-10'),
(1, '2018-04-15'),
(1, '2018-04-20');
WITH q AS (
SELECT
t1."id", t1."date" t1_date, t2."date" t2_date, t1."value", row_number() OVER
(PARTITION BY t1."id", t1."date", t1."value" ORDER BY (t1."date" - t2."date")) row_num
FROM
t1 LEFT JOIN t2 ON t1."id" = t2."id" AND t1."date" >= t2."date"
)
SELECT "id", "t1_date", "t2_date", "value" FROM q
WHERE row_num = 1;
-- id | t1_date | t2_date | value
-- ----+------------+------------+-------
-- 1 | 2018-03-15 | | 10
-- 1 | 2018-04-11 | 2018-04-10 | 20
-- 1 | 2018-04-11 | 2018-04-10 | 30
-- 1 | 2018-04-16 | 2018-04-15 | 30
-- 1 | 2018-04-21 | 2018-04-20 | 20
-- (5 rows)
Note that if you want the result dataset to exclude rows in which t2.date
has no match, simply replace LEFT JOIN
with INNER JOIN
.
Upvotes: 2