Reputation: 3
I have a table stored in Snowflake consisting of employee information. Each employee has a unique ID, and there are 2 rows for each unique ID. Row1 for a unique ID represents new/current personal details, row2 represents details from an old system. There are around 70 columns (e.g. title, name, date of birth, address, postcode, email etc...) and there are lots of employees.
I need to compare all columns between the two rows for each employee, and output a single row with all column names stating false/0 if the rows in that column were the same, and outputting true/1 if they were different. Except for ID column (leaving as a primary key).
For example:
MyTable:
ID | FIRSTNAME | Salary |
---|---|---|
123 | Jack | 20000 |
123 | Jack | 25000 |
Output:
ID | FIRSTNAME | Salary |
---|---|---|
123 | 0 | 1 |
I have come up with a solution but it requires me to manually enter every column name, I'm sure there is a better way. (If the SQL gets overly complex I can also use Python)
My Solution:
SELECT
CURRENT.ID
, CASE WHEN CURRENT.FIRSTNAME = PREVIOUS.FIRSTNAME THEN FALSE ELSE TRUE END AS FIRSTNAME
-- ETC... FOR ALL COLUMNS
FROM MYTABLE CURRENT
INNER JOIN MYTABLE PREVIOUS ON (CURRENT.ID = PREVIOUS.ID)
WHERE CURRENT.VERSION = 'CURRENT' AND PREVIOUS.VERSION = 'PREVIOUS'
Upvotes: 0
Views: 1474
Reputation: 26
You can generate the select statement in your solution using metadata tables.
Upvotes: 1