AVal
AVal

Reputation: 617

snowflake, get a list of mismatching columns between two tables (SQL)

I have been doing some reseach but didn't find much. I need to compare two tables to get a list of which columns are in table 1, but not in table 2. I am using Snowflake. Now, I've found this answer: postgresql - get a list of columns difference between 2 tables

The problem is that when I run the code I get this error:

SQL compilation error: invalid identifier TRANSIENT_STAGE_TABLE

The code works fine if I run it separately, so if I run:

SELECT column_name
FROM information_schema.columns 
WHERE table_schema = 'your_schema' AND table_name = 'table2'

I actually get a list of column names, but when I chain it to the second expression, the above error is returned. Any hint on what's going on? Thank you

Upvotes: 1

Views: 16843

Answers (1)

Marcin Zukowski
Marcin Zukowski

Reputation: 4739

The query from the original post should work, maybe you're missing single quotes somewhere? See this example

create or replace table xxx1(i int, j int);
create or replace table xxx2(i int, k int);

-- Query from the original post
SELECT column_name
FROM information_schema.columns 
WHERE table_name = 'XXX1'
    AND column_name NOT IN
    (
        SELECT column_name
        FROM information_schema.columns 
        WHERE table_name = 'XXX2'
    );
-------------+
 COLUMN_NAME |
-------------+
 J           |
-------------+

You can also write a slightly more complex query to see all columns not matching, from both tables:

with 
s1 as (
  select table_name, column_name 
  from information_schema.columns 
  where table_name = 'XXX1'), 
s2 as (
  select table_name, column_name 
  from information_schema.columns 
  where table_name = 'XXX2') 
select * from s1 full outer join s2 on s1.column_name = s2.column_name;
------------+-------------+------------+-------------+
 TABLE_NAME | COLUMN_NAME | TABLE_NAME | COLUMN_NAME |
------------+-------------+------------+-------------+
 XXX1       | I           | XXX2       | I           |
 XXX1       | J           | [NULL]     | [NULL]      |
 [NULL]     | [NULL]      | XXX2       | K           |
------------+-------------+------------+-------------+

You can add WHERE s1.column_name IS NULL or s2.column_name IS NULL to find only missing columns of course.

You can also easily extend it to detect column type differences.

Upvotes: 7

Related Questions