Reputation: 305
I have the following type of dataset:
user_id country1 city1 country2 city2
1 usa new york france paris
2 usa dallas japan tokyo
3 india mumbai italy rome
4 france paris usa new york
5 brazil sao paulo russia moscow
I want to group the combinations of country1
, city1
, country2
and city2
where the order (is something country1
or country2
) should not matter. Usually, I would try:
SELECT country1
, city1
, country2
, city2
, COUNT(*)
FROM dataset
GROUP BY country1
, city1
, country2
, city2
However, this code snippet considers the rows with user_id=1
and user_id=4
as two separate cases where I would like them to be considered as equivalent.
Anyone who knows how to coop with this problem?
Thanks in advance!
Upvotes: 1
Views: 75
Reputation: 1269923
Normally, you approach this type of problem using least()
and greatest()
, but you have two columns, instead of one. So, let's do it by comparing cities. I am guessing that city
is more unique than country
:
select (case when city1 < city2 then country1 else country2 end) as country1,
(case when city1 < city2 then city1 else city2 end) as city1,
(case when city1 < city2 then country2 else country1 end) as country2,
(case when city1 < city2 then city2 else city1 end) as city2,
count(*)
from dataset
group by (case when city1 < city2 then country1 else country2 end),
(case when city1 < city2 then city1 else city2 end),
(case when city1 < city2 then country2 else country1 end),
(case when city1 < city2 then city2 else city1 end)
Upvotes: 1