Simon
Simon

Reputation: 398

MySQL "ORDER BY" the amount of rows with the same value for a certain column?

I have a table called trends_points, this table has the following columns:

Now, I'm trying to run a query on this table which will get the rows in a specific time frame ordered by how many times the column term appears in the table during the specific timeframe...So for example if the table has the following rows:

id | userId | term        | time
------------------------------------
1    28       new year      1262231638
2    37       new year      1262231658
3    1        christmas     1262231666
4    34       new year      1262231665
5    12       christmas     1262231667
6    52       twitter       1262231669

I'd like the rows to come out ordered like this:

new year
christmas
twitter

This is because "new year" exists three times in the timeframe, "christmas" exists twice and "twitter" is only in one row.

So far I've asummed it's a simple WHERE for the specific timeframe part of the query and a GROUP BY to stop the same term from coming up twice in the list.

This makes the following query:

SELECT * 
  FROM `trends_points` 
 WHERE ( time >= <time-period_start> 
  AND time <= <time-period_end> ) 
GROUP BY `term`

Does anyone know how I'd do the final part of the query? (Ordering the query's results by how many rows contain the same "term" column value..).

Upvotes: 4

Views: 9354

Answers (4)

OMG Ponies
OMG Ponies

Reputation: 332701

Use:

  SELECT tp.term,
         COUNT(*) 'term_count'
    FROM TREND_POINTS tp
   WHERE tp.time BETWEEN <time-period_start> AND <time-period_end> 
GROUP BY tp.term
ORDER BY term_count DESC, tp.term

See this question about why to use BETWEEN vs using the >=/<= operators.

Keep in mind there can be ties - the order by defaults to alphabetically shorting by term value when this happens, but there could be other criteria.

Also, if you want to additionally limit the number of rows/terms coming back you can add the LIMIT clause to the end of the query. For example, this query will return the top five terms:

  SELECT tp.term,
         COUNT(*) 'term_count'
    FROM TREND_POINTS tp
   WHERE tp.time BETWEEN <time-period_start> AND <time-period_end> 
GROUP BY tp.term
ORDER BY term_count DESC, tp.term
   LIMIT 5

Upvotes: 11

danielrsmith
danielrsmith

Reputation: 4060

SELECT t.term
FROM trend_points t
WHERE t.time >= <time-period_start> AND t.time <= <time-period_end>
ORDER BY COUNT(t.term) DESC
GROUP BY t.term

Upvotes: 2

gahooa
gahooa

Reputation: 137492

Quick answer:

SELECT 
   term, count(*) as thecount
FROM 
   mytable
WHERE
   (...)
GROUP BY 
   term
ORDER BY 
   thecount DESC

Upvotes: 4

Jordan Running
Jordan Running

Reputation: 106077

COUNT() will give you the number of rows in the group, so just order by that.

SELECT * FROM `trends_points`
WHERE ( `time` >= <time-period_start> AND `time` <= <time-period_end> )
ORDER BY COUNT(`term`) DESC
GROUP BY `term`

Upvotes: 1

Related Questions