Reputation: 560
I am using the following query to find duplicate zip
values in my dataset.
This does work to display the Country, City and Street of any duplicate zip values, but I really want it to only include duplicates if they also have the same country, city and street and not just the zip value?
SELECT
Country,
City,
Street,
zip
FROM
project.dataset.tablename
WHERE
zip > 1
AND CAST(zip AS string) IN (
SELECT
CAST(zip AS string)
FROM
project.dataset.tablename
GROUP BY
CAST(zip AS string)
HAVING
COUNT(CAST(zip AS string)) > 1 )
ORDER BY
zip DESC
Upvotes: 0
Views: 34
Reputation: 1269773
I think you want:
SELECT t.*
FROM (SELECT t.*,
COUNT(*) OVER (PARTITION BY zip, country, city, street) as cnt
FROM project.dataset.tablename t
) t
WHERE cnt > 1
ORDER BY zip;
In any case, for this type of question, window functions usually provide the best solution.
Upvotes: 3