duffbeer703
duffbeer703

Reputation: 308

How to select unique rows from a query that returns multiple duplicate records?

I have a scenario where a table contains a hash and a file path where there are files with identical content but the different metadata in different rows. I'd like to generate a list consisting of one file + path per hash, using sqlite SQL only.

Example:

hash    filename    path
abc123  foo.txt     /tmp/
abc123  bar.txt     /tmp/
xyz890  image.png   /home/user4
xyz890  image2.png  /home/user2

Ideal output would be:

abc123 /tmp/foo.txt
xyz890 /tmp/image2.png

What's the best way to do that?

Upvotes: 0

Views: 78

Answers (1)

ekochergin
ekochergin

Reputation: 4129

The output shows you need first record for the first hash and the second line for the second hash. Based on what you've wrote in the beginning I suppose the output would be similar to

abc123 /tmp/foo.txt
xyz890 /tmp/image.png

You can get the result you need using the correlated subquery

select hash, path 
  from your_table t1
 where (t1.path, t1.filename) = (select path, filename 
                                   from your_table t2 
                                  where t2.hash = t1.hash limit 1)

At least it did work against data you've provided us with

Upvotes: 1

Related Questions