Xarvalus
Xarvalus

Reputation: 3021

Select distinct values from query excluding sorting-purpose column

Description:

I want to select my site content's categories. Most of them will be created by users so I will ve to deal with problem of many categories in table. I want to respect some kind content's trends on my site. My solution is:

Thanks to it I ve all most popular categories from small amount of time + most popular categories in global scope.

Query:

(SELECT category, COUNT(*) AS number FROM data WHERE date BETWEEN ADDDATE(NOW(), INTERVAL -2 DAY) AND NOW() GROUP BY category)
UNION  
(SELECT category, COUNT(*) AS number FROM data WHERE date < ADDDATE(NOW(), INTERVAL -2 DAY) GROUP BY category)
ORDER BY number DESC LIMIT 50

Output:

+----------+--------+
| category | number |
+----------+--------+
|        2 |      3 |
|        4 |      3 |
|        6 |      3 |
|        5 |      2 |
|        1 |      2 |
|        2 |      1 |
+----------+--------+
6 rows in set (0.00 sec)

Note there is duplicated content in category (id 2), UNION DISTINCT (default) is not excluding this because it compares rows from both columns, so:

+----------+--------+
| category | number |
+----------+--------+
|        2 |      3 | //is not equal to
|        2 |      1 | //below values
+----------+--------+
//wont be excluded

Problem to slove:

I need to select distinct values from only category column.

(number is only for sorting purposes and used only in this query)

Upvotes: 1

Views: 803

Answers (1)

fthiella
fthiella

Reputation: 49049

If I understand your question correctly, this should be the query that you need:

SELECT category
FROM (
  SELECT category, COUNT(*) AS number
  FROM data WHERE date BETWEEN ADDDATE(NOW(), INTERVAL -2 DAY) AND NOW()
  GROUP BY category
  UNION ALL
  SELECT category, COUNT(*) AS number
  FROM data WHERE date < ADDDATE(NOW(), INTERVAL -2 DAY)
  GROUP BY category
  ORDER BY number DESC
) s
GROUP BY category
ORDER BY MAX(number) DESC
LIMIT 50

I removed brackets () around your two queries that make your union query because the ORDER BY of your UNION query will be applied to both. I also used UNION ALL instead of UNION because categories are grouped again in the outer query, i would try both UNION/UNION ALL to see which one is faster.

Then I'm grouping again, by category, and ordering by the MAX(number) of your category, and keeping only the first 50 rows.

Upvotes: 1

Related Questions