John Doe
John Doe

Reputation: 10193

SQL query to find duplicates

I am trying to write a query in SQL server to find out if there are any multiple rows for each hash value.
I need all filenames where the hash value has duplicates.

The result should be (based on my example below)

003B4C68BC143B0290E04432A3A96092    File0003.jpg
003B4C68BC143B0290E04432A3A96092    File0004.jpg
003B4C68BC143B0290E04432A3A96092    File0005.jpg

Please let me know.

Here is the table structure

File table
-----------------------------------------
hash          FileName
---------------------------------------
000341A486F5492877D588BED0806650    File0001.jpg
00363EF2ECEEA32F10176EB64A50283F    File0002.jpg
003B4C68BC143B0290E04432A3A96092    File0003.jpg
003B4C68BC143B0290E04432A3A96092    File0004.jpg
003B4C68BC143B0290E04432A3A96092    File0005.jpg

Upvotes: 8

Views: 47119

Answers (2)

Raphaël Althaus
Raphaël Althaus

Reputation: 60493

select * 
from File 
where hash in (select 
               hash 
               from File
               group by hash
               having count(*) > 1)

Upvotes: 15

John Woo
John Woo

Reputation: 263703

You can use EXISTS to check for duplicates,

SELECT  a.*
FROM    TableName a
WHERE   EXISTS
        (
            SELECT  1
            FROM    Tablename b
            WHERE   a.hash = b.hash
            GROUP   BY hash
            HAVING  COUNT(*) > 1
        )

or INNER JOIN

SELECT  a.*
FROM    [File] a
        INNER JOIN
        (
            SELECT  hash
            FROM    [File] b
            GROUP   BY hash
            HAVING  COUNT(*) > 1
        ) b ON  a.hash = b.hash

Upvotes: 4

Related Questions