Mark Romano
Mark Romano

Reputation: 711

R: Create groups within column

I'm trying to group an age column into an age group column and summarize by that grouping.

ie I need the dataset below -

AGE
1
2
5
68
27
4
2
33
45

To become

AGE_GRP COUNT
1-10     5
11-20    0
21-30    1
31-40    1
40+      2

I'm using R

Thanks.

Upvotes: 0

Views: 886

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269523

You have zero values so you need a left join:

select agegrp, count(t.agegrp)
from (select '1-10' as agegrp, 1 as lowb, 10 as hib union all
      select '11-20' as agegrp, 11, 20 union all
      select '21-30' as agegrp, 21, 30 upperbound union all
      select '31-40' as agegrp, 31, 40 as upperbound union all
      select '40+' as agegrp, 41, NULL as upperbound
     ) ag left join
     t
     on t.age >= ag.lowb and t.age <= ag.hib
group by ag.agegrp
order by ag.lowb;

Note: this assumes the column is an integer, so a value like 30.5 isn't allowed. It is easy to adjust the query to handle non-integer ages, if that is the requirement.

Upvotes: 1

Pரதீப்
Pரதீப்

Reputation: 93694

You need CASE statement to split the AGE into different groups

SELECT CASE
         WHEN AGE BETWEEN 1 AND 10 THEN '1-10'
         WHEN AGE BETWEEN 11 AND 20 THEN '11-20'
         WHEN AGE BETWEEN 21 AND 30 THEN '21-30'
         WHEN AGE BETWEEN 31 AND 40 THEN '31-40'
         ELSE '40+'
       END AS AGE_GRP,
       Count(1) as Cnt
FROM   yourtable
GROUP  BY CASE
            WHEN AGE BETWEEN 1 AND 10 THEN '1-10'
            WHEN AGE BETWEEN 11 AND 20 THEN '11-20'
            WHEN AGE BETWEEN 21 AND 30 THEN '21-30'
            WHEN AGE BETWEEN 31 AND 40 THEN '31-40'
            ELSE '40+'
          END 

If you don't want to repeat the CASE statement in GROUP BY then use this

SELECT AGE_GRP,
       Count(1) AS cnt
FROM   (SELECT CASE
                 WHEN AGE BETWEEN 1 AND 10 THEN '1-10'
                 WHEN AGE BETWEEN 1 AND 10 THEN '11-20'
                 WHEN AGE BETWEEN 1 AND 10 THEN '21-30 '
                 WHEN AGE BETWEEN 1 AND 10 THEN '31-40'
                 ELSE '40+'
               END AS AGE_GRP
        FROM   yourtable) A
GROUP  BY AGE_GRP 

Upvotes: 1

Related Questions