Josh
Josh

Reputation: 2338

Group by Ranges and Pivot

I'm using the latest version of Clickhouse and trying to do a group/pivot by city and age range. For example, anyone who lives in Chicago with one group being 18-34, then 35-45, etc.

WITH pivot as  
(   
        SELECT city_name , 
            case 
            when age<=34 then '18-34'
            when agebetween 35 and 44 then '35-44'
            when agebetween 45 AND 54 then '45-54'
            else 'Other' end as agg_age  
        FROM people where city_name in ['Chicago','Libertyville']  
    )  
SELECT city_name , groupArray( 
     agg_age  
    ), count(*) FROM pivot 
group by city_name

I'm receiving There is no supertype for types String, UInt8 because some of them are String/FixedString and some of them are not. now since I switched from the Age to a String.It was returning a lot of ages before I grouped it which was expected.

What I want to see is one row with the column headings being the age groups. I can group some in the code so it's not a requirement the column names are the age groups only that the code will have enough to pull it from the results.

Upvotes: 0

Views: 242

Answers (1)

Denny Crane
Denny Crane

Reputation: 13310

create table people (city_name String, age Int16) engine=Memory 
as select ['Chicago','Libertyville'][number%2+1], rand()%101 
from numbers(1e4);

SELECT city_name, groupArray((agg_age, cnt))
from (
  SELECT city_name , 
       case 
       when age<=34 then '18-34'
       when age between 35 and 44 then '35-44'
       when age between 45 AND 54 then '45-54'
       else 'Other' end as agg_age,
       count() cnt
  FROM people where city_name in ['Chicago','Libertyville'] 
  group by city_name, agg_age
  order by city_name, agg_age
) group by city_name
┌─city_name────┬─groupArray(tuple(agg_age, cnt))─────────────────────────────┐
│ Chicago      │ [('18-34',1779),('35-44',487),('45-54',527),('Other',2207)] │
│ Libertyville │ [('18-34',1720),('35-44',527),('45-54',471),('Other',2282)] │
└──────────────┴─────────────────────────────────────────────────────────────┘

SELECT city_name, 
       (groupArray((agg_age, cnt)) as x)[1].2 `18-34`,
       x[2].2 `35-44`,
       x[3].2 `45-54`,
       x[4].2 `Other`
from (
  SELECT city_name , 
       case 
       when age<=34 then '18-34'
       when age between 35 and 44 then '35-44'
       when age between 45 AND 54 then '45-54'
       else 'Other' end as agg_age,
       count() cnt
  FROM people where city_name in ['Chicago','Libertyville'] 
  group by city_name, agg_age
  order by city_name, agg_age
) group by city_name
┌─city_name────┬─18-34─┬─35-44─┬─45-54─┬─Other─┐
│ Chicago      │  1779 │   487 │   527 │  2207 │
│ Libertyville │  1720 │   527 │   471 │  2282 │
└──────────────┴───────┴───────┴───────┴───────┘

Upvotes: 2

Related Questions