Reputation: 9037
I have this table named votes
, this table might consist of duplicate records with these columns business_id
and user_id
. I created a view that will display the count of votes per business but unfortunately the duplicates also has been counted which should not be. Below is my view code
(
SELECT
`v`.`business_id` AS `business_id`,
COUNT(`v`.`vote_id`) AS `num_votes`
FROM
`connectn_top100`.`votes` `v`
WHERE
(`v`.`year` = 2019)
GROUP BY
`v`.`business_id`
)
The actual data would be
------------------------------------
id | business_id | user_id | vote_id
------------------------------------
1 | 12 | 12 | 1
------------------------------------
2 | 12 | 12 | 1
------------------------------------
3 | 12 | 12 | 1
------------------------------------
4 | 13 | 15 | 1
------------------------------------
5 | 13 | 15 | 1
------------------------------------
6 | 12 | 16 | 1
------------------------------------
7 | 23 | 16 | 1
------------------------------------
so the result should be
------------------------------------
id | business_id | user_id | vote_id
------------------------------------
1 | 12 | 12 | 1
------------------------------------
4 | 13 | 15 | 1
------------------------------------
6 | 12 | 16 | 1
------------------------------------
7 | 23 | 16 | 1
------------------------------------
so it should be user_id must have unique business_id and business_id must have unique user_id
tried adding another GROUP BY but seems an error.
Upvotes: 0
Views: 56
Reputation: 13509
From your sample data, It seems you merely need below -
SELECT MIN(ID)
,business_id
,user_id
,vote_id
FROM `connectn_top100`.`votes` `v`
GROUP BY business_id
,user_id
,vote_id
This query will generate the output you mentioned.
Upvotes: 0
Reputation: 222402
If you want to count each user_id
just once per business_id
, you can use count(distinct ...)
:
SELECT
`v`.`business_id` AS `business_id`,
COUNT(DISTINCT `v`.`user_id`) AS `num_votes`
FROM `connectn_top100`.`votes` `v`
WHERE `v`.`year` = 2019
GROUP BY `v`.`business_id`
Edit: from your sample data, it looks like you want:
SELECT
`v`.`business_id` AS `business_id`,
`v`.`user_id`,
COUNT(DISTINCT `v`.`vote_id`) AS `num_votes`
FROM `connectn_top100`.`votes` `v`
WHERE `v`.`year` = 2019
GROUP BY `v`.`business_id`, `v`.`user_id`
Upvotes: 1
Reputation: 212
hi you can try this CTE to get your unique values
with cte as ( SELECT
`v`.`business_id` AS `business_id`,
COUNT(`v`.`vote_id`) AS `num_votes`
ROW_NUMBER() OVER (
PARTITION BY
`v`.`business_id`,
v`.`vote_id`
ORDER BY
`v`.`business_id`,
v`.`vote_id`
) row_num
FROM
`connectn_top100`.`votes` `v`
WHERE
(`v`.`year` = 2019)
GROUP BY
`v`.`business_id`
)
select * from cte
where row_num > 1;
Upvotes: 0