Reputation: 15734
I have two tables being joined. They have a one-to-many relationship. I would like to make a query that pulls the most frequent of the "many" table for each of the parent id's.
I have a query like this that is in the beginning phases of what I am trying to do:
SELECT p.profile, up.value, COUNT(*) AS theCount FROM `profiles` AS p
JOIN user_profile AS up ON p.id = up.profile_id
GROUP BY `profile`, `value`
ORDER BY p.profile ASC, theCount DESC;
Now the problem with this query is that it will show all values from the many table, and just group them with most to least. For example:
If table profile
has values of val 1
,val 2
and table user_profiles
has values of Apple
, Apple
, Orange
with a parent_id of val 1
, and Pear
for val 2
in the first table, the query above will group them like this:
val 1
, Apple
, 2
val 1
, Orange
, 1
val 2
, Pear
, 1
Now what I WANT is this:
val 1
, Apple
, 2
val 2
, Pear
, 1
I only want to show the highest value for the parent. Or if the case there is only one value (e.g. val 2) then show that.
Now, I can easily do this in a subquery (Do a Limit 1 for each parent value; order by theCount
DESC). However, is there (1) a way to do this without a subquery? or (2) a MySQL function or some other way to do this? I am trying hard to keep these queries very quick and high performance.
If you can show me that a subquery will not hinder performance here, I will select that as an answer as well.
Upvotes: 1
Views: 65
Reputation: 1270713
With this type of problem, I think the group_concat()
/substring_index()
trick is the easiest method in MySQL:
SELECT profile, substring_index(group_concat(up.value order by theCount desc), ',', 1) as MostCommonValue,
theCount
FROM (SELECT p.profile, up.value, COUNT(*) AS theCount
FROM `profiles` p JOIN
user_profile up
ON p.id = up.profile_id
GROUP BY `profile`, `value`
) pv
GROUP BY profile
ORDER BY p.profile ASC, theCount DESC;
This method has short-comings. For instance, if value
can contain a comma, you need a different separator. More importantly, there is a maximum length to the group_concat()
intermediate value (but you set the parameter to have a larger value). But, it works quite well in many cases.
Upvotes: 2