Top N rows by group in ClickHouse

Question

What is the proper way to query top N rows by group in ClickHouse?
Lets take an example of tbl having id2, id4, v3 columns and N=2. I tried the following

SELECT                                                                          
    id2,                                                                        
    id4,                 
    v3 AS v3        
FROM tbl
GROUP BY                 
    id2,                 
    id4                  
ORDER BY v3 DESC                                                                
LIMIT 2 BY                       
    id2,                 
    id4

but getting error

Received exception from server (version 19.3.4):
Code: 215. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception
: Column v3 is not under aggregate function and not in GROUP BY..

I could put v3 into GROUP BY and it does seems to work, but it is not efficient to group by a metric.

There is any aggregate function, but we actually want all values (limited to 2 by LIMIT BY clause) not any value, so it doesn't sound like to be proper solution here.

SELECT                                                                          
    id2,                                                                        
    id4,                 
    any(v3) AS v3        
FROM tbl
GROUP BY                 
    id2,                 
    id4                  
ORDER BY v3 DESC                                                                
LIMIT 2 BY                       
    id2,                 
    id4

vladimir · Accepted Answer

It can be used aggregate functions like this:

SELECT
    id2,
    id4,
    arrayJoin(arraySlice(arrayReverseSort(groupArray(v3)), 1, 2)) v3
FROM tbl
GROUP BY
    id2,
    id4

Top N rows by group in ClickHouse

Answers (2)

Related Questions