Snowflake - Deleting a CTE because table has duplicate rows

Question

I have tableA that has duplicate rows, I noticed that some columns are different but essentially I'm told that it doesn't matter and we just want to make sure there is a unique ID in each row.

I tried this in Snowflake but I read we can't use CTEs to delete, insert.. etc. So then how?

WITH cte AS (
    SELECT 
        _LOAD_DATETIME, 
        _LOAD_FILENAME, 
        _LOAD_FILE_ROW_NUMBER, 
        ID, 
  CONTENT_CATEGORY,
  CREATED_TIME,
  DESCRIPTION,
  FROM_INFO,
  LENGTH,
  PERMALINK_URL,
  POST_VIEWS,
  PUBLISHED,
  TITLE,
  VIEWS,
  LOADED_DATE,
        ROW_NUMBER() OVER (
            PARTITION BY 
                id, 
                FROM_INFO, 
                title
        ) row_num
     FROM 
        tableA)
DELETE FROM cte
WHERE row_num > 1;

Lukasz Szozda · Accepted Answer

Alternative appraoch could be table recreation:

CREATE OR REPLACE TABLE tableA COPY GRANTS AS
SELECT *
FROM tableA
QUALIFY ROW_NUMBER() OVER (PARTITION BY  id, FROM_INFO,  title ORDER BY 1) = 1;

we just want to make sure there is a unique ID in each row.

If id has to be unique then it should be partitioned only by id column. Before running the query CTaS it is advisable to check the part without CREATE TABLE part.

Snowflake - Deleting a CTE because table has duplicate rows

Answers (1)

Related Questions