doing a simple pivot on year

Question

I have a table:

+----+-------+------+
| id | times | year |
+----+-------+------+
|  5 |     2 | 2008 |
|  6 |    76 | 2008 |
|  2 |    43 | 2009 |
|  4 |     5 | 2009 |
|  1 |     3 | 2010 |
|  9 |     6 | 2010 |
|  7 |   444 | 2011 |
|  8 |     3 | 2011 |
|  3 |    65 | 2012 |
+----+-------+------+

I would like to create a pivot out of this table which buckets times per year :

+--------+------+------+------+------+------+
|        | 2008 | 2009 | 2010 | 2011 | 2012 |
+--------+------+------+------+------+------+
| 0      |      |      |      |      |      |
| 1-30   |    1 |    1 |    2 |    1 |      |
| 31-60  |      |    1 |      |      |      |
| 61-90  |    1 |      |      |      |    1 |
| 91-120 |      |      |      |      |      |
| 121+   |      |      |      |    1 |      |
+--------+------+------+------+------+------+

how do i start to tackle this challenge with sql? thank you so much for your guidance.

Taryn · Accepted Answer

You can use the sql server PIVOT function for this. If you know the all of the values for the years as well as the buckets then you can hard-code the query:

select *
from
(
  select 
    case 
      when times = 0 then '0' 
      when times >= 1 and times <=30 then '1-30'
      when times >= 31 and times <=60 then '31-60'  
      when times >= 61 and times <=90 then '61-90' 
      when times >= 91 and times <=120 then '91-120' 
      else '121+' end bucket,
    year
  from yourtable
) src
pivot
(
  count(year)
  for year in ([2008], [2009], [2010], [2011], [2012])
) piv;

See SQL Fiddle with Demo

If you don't have access to the PIVOT function then you can use an aggregate function with a CASE:

select bucket,
  sum(case when year = 2008 then 1 else 0 end) [2008],
  sum(case when year = 2009 then 1 else 0 end) [2009],
  sum(case when year = 2010 then 1 else 0 end) [2010],
  sum(case when year = 2011 then 1 else 0 end) [2011],
  sum(case when year = 2012 then 1 else 0 end) [2012]
from
(
  select 
    case 
      when times = 0 then '0' 
      when times >= 1 and times <=30 then '1-30' 
      when times >= 31 and times <=60 then '31-60'  
      when times >= 61 and times <=90 then '61-90' 
      when times >= 91 and times <=120 then '91-120' 
      else '121+' end bucket,
    year
  from yourtable
) src
group by bucket

See SQL Fiddle with Demo

If you need all of the buckets to be listed, then you will want to have the bucket ranges stored in either a table or using a CTE query, then you can use the following:

;with buckets(startbucket, endbucket, rnk) as
(
  select 0, 0, 1 
  union all
  select 1, 30, 2
  union all
  select 31, 60, 3
  union all
  select 61, 90, 4
  union all
  select 91, 120, 5
  union all
  select 121, null, 6
)
select 
  case when startbucket = 0 then '0'
    when endbucket is null then cast(startbucket as varchar(50)) + '+'
    else cast(startbucket as varchar(50)) + '-'+cast(endbucket as varchar(50)) end buckets,
  [2008], [2009], [2010], [2011], [2012]
from
(
  select rnk,
    year, 
    startbucket, 
    endbucket
  from buckets b
  left join yourtable t
    on t.times >= b.startbucket and t.times <= coalesce(b.endbucket, 100000)
) src
pivot
(
  count(year)
  for year in ([2008], [2009], [2010], [2011], [2012])
) piv;

See SQL Fiddle with Demo

Result:

| BUCKETS | 2008 | 2009 | 2010 | 2011 | 2012 |
----------------------------------------------
|       0 |    0 |    0 |    0 |    0 |    0 |
|    1-30 |    1 |    1 |    2 |    1 |    0 |
|   31-60 |    0 |    1 |    0 |    0 |    0 |
|   61-90 |    1 |    0 |    0 |    0 |    1 |
|  91-120 |    0 |    0 |    0 |    0 |    0 |
|    121+ |    0 |    0 |    0 |    1 |    0 |

The above will work great if you have a known number of values (years) that you need to transpose. If you have an unknown number then you will want to implement dynamic sql, similar to this:

DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX)

select @cols = STUFF((SELECT distinct ',' + QUOTENAME(year) 
                    from yourtable
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

set @query = 'with buckets(startbucket, endbucket, rnk) as
              (
                select 0, 0, 1 
                union all
                select 1, 30, 2
                union all
                select 31, 60, 3
                union all
                select 61, 90, 4
                union all
                select 91, 120, 5
                union all
                select 121, null, 6
              )
              select 
                case when startbucket = 0 then ''0''
                  when endbucket is null then cast(startbucket as varchar(50)) + ''+''
                  else cast(startbucket as varchar(50)) + ''-''+cast(endbucket as varchar(50)) end buckets,
                '+@cols+'
              from
              (
                select rnk,
                  year, 
                  startbucket, endbucket
                from buckets b
                left join yourtable t
                  on t.times >= b.startbucket and t.times <= coalesce(b.endbucket, 100000)
              ) src
              pivot
              (
                count(year)
                for year in ('+@cols+')
              ) piv;'

execute(@query)

See SQL Fiddle with Demo

The result will be the same for both the static (hard-coded) version and the dynamic version.

doing a simple pivot on year

Answers (2)

Related Questions