Zack
Zack

Reputation: 179

Is there a way in SQL to return only the items that WOULD have grouped together using a group by multiple columns clause

I have a single SQL table in Redshift:

+-----+--------------------------------------+-------------+------------+
| row |                 ID                   | countrycode |    date    |
+-----+--------------------------------------+-------------+------------+
| 10  | e498bb7e-3084-3e4e-a5e7-9a10cd7a19b8 |     BRA     | 2019-06-05 |
| 11  | ce79f8d8-a473-3414-a867-f453ee7fe91e |     CHL     | 2019-06-12 |
| 12  | aaa425bb-46d6-39f0-be14-b5f962cb1e61 |     CZE     | 2019-06-07 |
| 13  | da16bf46-8f0b-3184-bd65-c5f5b5455392 |     CZE     | 2019-06-07 |
+-----+--------------------------------------+-------------+------------+

And Id like to return only the ID, countrycode and date (though Id settle for returning just the ID) of the items that have the same countrycode AND date as each other i.e if I did a group by without the ID column they would have combined. Referring to the example table above, I'd like to just return the ID's from rows 12 and 13 since they have the same country code AND date as each other.

The goal of the query is to find unique ID's that were in particular country on the same date as each other ie. two unique ID's that were in the same country on the same date, but not a singular ID that spent two days in one country.

Upvotes: 0

Views: 95

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1271003

Is this what you want?

select t.*
from (select t.*, count(*) over (partition by countrycode, date) as cnt
      from t
     ) t
where cnt >= 2;

If your ids can be duplicated -- which is a very strange thing for a column called id -- then you can still use window functions:

select t.*
from (select t.*,
              min(id) over (partition by countrycode, date) as min_id,
              max(id) over (partition by countrycode, date) as max_id
      from t
     ) t
where min_id <> max_id;

If you wanted the ids as a long string, you could also use aggregation:

select countrycode, date, listagg(id, ',') within group (order by id) as ids
from t
group by countrycode, date
having count(*) >= 2;

Upvotes: 1

forpas
forpas

Reputation: 164194

With EXISTS:

select t.*
from tablename t
where exists (
  select 1 from tablename
  where countrycode = t.countrycode and date = t.date and id <> t.id
)

Upvotes: 3

Related Questions