Has anybody discovered a (simple) way to limit group_concat rows?

Question

I have a pretty big query to pull out the information about reports from each country, and right now, the only way I can limit this is by putting Limit 10 or some number at the end, which will limit the countries. However, what I want to do is limit the group_concat to 10 results per country, which in my case would be somehow limiting 10 to each instance of the word group_concat.

My current query is:

SELECT country,
GROUP_CONCAT(docID),
GROUP_CONCAT(analyst),
GROUP_CONCAT(region),
GROUP_CONCAT(report),
GROUP_CONCAT(topic),
MAX((date)) AS date,
MAX((docID)) AS docID,
GROUP_CONCAT(date) AS dates,
GROUP_CONCAT(event) AS events,
GROUP_CONCAT(province) AS provinces
FROM reports GROUP BY country 
ORDER BY date DESC, docID DESC

I have seen this question asked, and I haven't seen any really good answers. I know the function is not built into MySQL, as you can only limit based on the characters. Has anybody solved this problem before?

Johan · Accepted Answer

Workaround
One options is to pad your values with ~~spaces~~#. So every item in group_concat is the same length.
Lets suppose there are no items longer than 20 chars.

Then your query would be:

SET group_concat_max_len = 10*20+9; /*execute this first.*/
/*10 items, 20 bytes each + 9 bytes for the separator*/

SELECT country,
REPLACE(GROUP_CONCAT(RIGHT(CONCAT(repeat('#',20),docID),20)),'#','') AS docIDs, 
REPLACE(GROUP_CONCAT(RIGHT(CONCAT(repeat('#',20),analyst),20)),'#','') AS analysts, 
REPLACE(GROUP_CONCAT(RIGHT(CONCAT(repeat('#',20),region,20)),'#','') AS regions, 
REPLACE(GROUP_CONCAT(RIGHT(CONCAT(repeat('#',20),report,20)),'#','') AS reports, 
REPLACE(GROUP_CONCAT(RIGHT(CONCAT(repeat('#',20),topic,20)),'#','') AS topics, 
MAX((date)) AS `date`,  /* LATEST DATE*/
MAX((docID)) AS docID,  /* LATEST DOC*/
REPLACE(GROUP_CONCAT(RIGHT(CONCAT(repeat('#',20),date,20)),'#','') AS dates, 
REPLACE(GROUP_CONCAT(RIGHT(CONCAT(repeat('#',20),event,20)),'#','') AS events, 
REPLACE(GROUP_CONCAT(RIGHT(CONCAT(repeat('#',20),province,20)),'#','') AS provinces 
FROM reports 
GROUP BY country ORDER BY `date` DESC, docID DESC

Just to recap:

x= CONCAT(repeat('#',20),docID)  adds 20 x #################### in front of docID
y= RIGHT(X,20)                   Takes the rightmost 20 chars of that
z= GROUP_CONCAT(y)               strings these together up to max_len
result = REPLACE(z,'#','') removes the `#` so the result looks normal.

Or write your own version of group_concat
You can use your own group_concat UDF.
There are a couple of examples floating around on the net.

E.g.: http://www.codeproject.com/KB/database/mygroupconcat.aspx?display=Mobile

Has anybody discovered a (simple) way to limit group_concat rows?

Answers (2)

Related Questions