Reputation: 1

To find records from Table A having effective dates closest to records in table B

I have two tables as below:

Table A:

ID	Age	Dept	Salary	Start_date
1	30	A	$1000	01-01-2000
1	31	B	$1200	01-01-2022
2	25	C	$1200	01-06-2021
2	26	A	$1300	01-01-2022
3	34	D	$1400	01-01-2021
3	35	C	$1800	01-01-2022

Table B:

ID	Salary	Start_date
1	$1500	01-06-2022
2	$1800	01-01-2022
3	$1600	01-06-2021

I want to insert synthetic new records in Table A, when there is a mismatch in ID and start_date combination between Table A and Table B. The new record should get the Age & Dept from table A for the record which has the closest start date from the missing record in Table B, and salary and start date from table B.

Output:

ID	Age	Dept	Salary	Start_date
1	30	A	$1000	01-01-2000
1	31	B	$1200	01-01-2022
1	31	B	$1500	01-06-2022
2	25	C	$1200	01-06-2021
2	26	A	$1300	01-01-2022
3	34	D	$1400	01-01-2021
3	34	D	$1600	01-06-2021
3	35	C	$1500	01-01-2022

Please help to write SQL queries which will achieve this.

I am facing problems in identifying the rows from Table A which has the closest start_date from the records in Table B.

Upvotes: 0

Answers (2)

d r

Reputation: 7801

If I got it right - just the rows without matching ID, START_DATE combination (in both tables) should be added. That means the row with ID = 2 in table B will not affect the matching row in table A. If that is the case than you need a combination of union all from the tables and Row_Number() and Last_Value() analytic functions along with some Case expressions.

WITH        --  Sample Data
    tbl_a (ID, AGE, DEPT, SALARY, START_DATE) AS
        (   Select 1, 30, 'A', 1000, DATE '2000-01-01' From Dual Union All
            Select 1, 31, 'B', 1200, DATE '2022-01-01' From Dual Union All
            Select 2, 25, 'C', 1200, DATE '2021-06-01' From Dual Union All
            Select 2, 26, 'A', 1300, DATE '2022-01-01' From Dual Union All
            Select 3, 34, 'D', 1400, DATE '2021-01-01' From Dual Union All
            Select 3, 35, 'C', 1800, DATE '2022-01-01' From Dual
        ),
    tbl_b (ID, SALARY, START_DATE) AS
        (   Select 1, 1500, DATE '2022-06-01' From Dual Union All
            Select 2, 1800, DATE '2022-01-01' From Dual Union All
            Select 3, 1600, DATE '2021-06-01' From Dual
        )
--  
--  M a i n    S Q L :
SELECT    ID, 
          Case  When AGE Is Not Null Then AGE
          Else LAST_VALUE(AGE) OVER(Partition By ID Order By ID, START_DATE, TBL Rows Between Unbounded Preceding And 1 Preceding) 
          End "AGE",
          --
          Case  When DEPT Is Not Null Then DEPT 
          Else LAST_VALUE(DEPT) OVER(Partition By ID Order By ID, START_DATE, TBL Rows Between Unbounded Preceding And 1 Preceding)
          End "DEPT",
          --
          SALARY "SALARY", START_DATE "START_DATE"
FROM      ( Select    * 
            From      ( Select    TBL, ID, AGE, DEPT, SALARY, START_DATE, 
                                  Row_Number() OVER(Partition By ID, START_DATE Order By ID, START_DATE, TBL) "RN"
                        From    ( Select        'A' "TBL", ID, AGE, DEPT, SALARY, START_DATE From tbl_a Union All
                                  Select        'B' "TBL", ID, Null, Null, SALARY, START_DATE From tbl_b
                                )
                      ) 
            Where     RN = 1 )
ORDER BY  ID, START_DATE, DEPT

... the resulting dataset should be:

--      ID        AGE DEPT     SALARY START_DATE
--  ------ ---------- ---- ---------- ----------
--       1         30 A          1000 01-JAN-00
--       1         31 B          1200 01-JAN-22
--       1         31 B          1500 01-JUN-22   <-- added row
--       2         25 C          1200 01-JUN-21
--       2         26 A          1300 01-JAN-22
--       3         34 D          1400 01-JAN-21
--       3         34 D          1600 01-JUN-21   <-- added row
--       3         35 C          1800 01-JAN-22

... this could be merged into table A to insert the missing rows:

Merge Into tbl_a a
Using ( ... main sql ... ) x ON(x.ID = a.ID And 
                                x.START_DATE = a.START_DATE) 
When Not Matched
Then  INSERT (a.ID, a.AGE, a.DEPT, a.SALARY, a.START_DATE) 
      Values (x.ID, x.AGE, x.DEPT, x.SALARY, x.START_DATE)

Note: ON clause should be adjusted (probably) to ensure unique row addressing in your table A.

Upvotes: 0

MT0

Reputation: 168096

Use UNION ALL to combine the two tables and then the LAST_VALUE analytic function to get the most recent age and dept values when they are not present:

SELECT id,
       LAST_VALUE(age) IGNORE NULLS
         OVER (PARTITION BY id ORDER BY start_date) AS age,
       LAST_VALUE(dept) IGNORE NULLS
         OVER (PARTITION BY id ORDER BY start_date) AS dept,
       salary,
       start_date
FROM   (
  SELECT id, age, dept, salary, start_date
  FROM   table_a
UNION ALL
  SELECT id, NULL, NULL, salary, start_date
  FROM   table_b
)

Which, for the sample data:

CREATE TABLE table_a (ID, Age, Dept, Salary, Start_date) AS
SELECT 1, 30, 'A', 1000, DATE '2000-01-01' FROM DUAL UNION ALL
SELECT 1, 31, 'B', 1200, DATE '2022-01-01' FROM DUAL UNION ALL
SELECT 2, 25, 'C', 1200, DATE '2021-06-01' FROM DUAL UNION ALL
SELECT 2, 26, 'A', 1300, DATE '2022-01-01' FROM DUAL UNION ALL
SELECT 3, 34, 'D', 1400, DATE '2021-01-01' FROM DUAL UNION ALL
SELECT 3, 35, 'C', 1800, DATE '2022-01-01' FROM DUAL;

CREATE TABLE Table_B (ID, Salary, Start_date) AS
SELECT 1, 1500, DATE '2022-06-01' FROM DUAL UNION ALL
SELECT 2, 1800, DATE '2022-01-01' FROM DUAL UNION ALL
SELECT 3, 1600, DATE '2021-06-01' FROM DUAL;

Outputs:

ID	AGE	DEPT	SALARY	START_DATE
1	30	A	1000	2000-01-01 00:00:00
1	31	B	1200	2022-01-01 00:00:00
1	31	B	1500	2022-06-01 00:00:00
2	25	C	1200	2021-06-01 00:00:00
2	26	A	1300	2022-01-01 00:00:00
2	26	A	1800	2022-01-01 00:00:00
3	34	D	1400	2021-01-01 00:00:00
3	34	D	1600	2021-06-01 00:00:00
3	35	C	1800	2022-01-01 00:00:00

If you want to INSERT the missing rows then, from Oracle 12, you can use a LATERAL join and FETCH FIRST ROW ONLY to find the most recent age and dept values for each row of table_b:

INSERT INTO table_a (id, age, dept, salary, start_date)
SELECT b.id, a.age, a.dept, b.salary, b.start_date
FROM   table_b b
       LEFT OUTER JOIN LATERAL (
         SELECT a.age, a.dept
         FROM   table_a a
         WHERE  a.id = b.id
         AND    a.start_date <= b.start_date
         ORDER BY start_date DESC
         FETCH FIRST ROW ONLY
       ) a
       ON 1 = 1;

fiddle

Upvotes: 1

To find records from Table A having effective dates closest to records in table B

Answers (2)

Related Questions