Landon Statis
Landon Statis

Reputation: 849

Pivot multiple rows / columns into 1 row

We need to take multiple rows and multiple columns and transpose them into 1 row per key. I have a pivot query, but it is not working. I get some error about "Column ambiguously defined'

Our data looks like this:

SECTOR   TICKER  COMPANY
-----------------------------------------------------
5         ADNT    Adient PLC
5         AUTO    Autobytel Inc.
5         THRM    Gentherm Inc
5         ALSN    Allison Transmission Holdings, Inc.
5         ALV     Autoliv, Inc.
12        HES     Hess Corporation
12        AM      Antero Midstrm
12        PHX     Panhandle Royalty Company
12        NBR     Nabors Industries Ltd.
12        AMRC     Ameresco, Inc.

What we need is 1 row per ID, with each TICKER / COMPANY in a different column. So, output would look like:

5 ADNT   Adient PLC   AUTO  Autobytel Inc.   THRM  Gentherm Inc........

You get the idea. 1 row per ID, and each other value in its own column. The query I tried is:

SELECT sector, ticker, company_name
FROM (SELECT d.sector, d.ticker, v.company_name, ROW_NUMBER() OVER(PARTITION BY d.sector ORDER BY d.sector) rn
           FROM template13_ticker_data d, template13_vw v
           WHERE d.m_ticker = v.m_ticker) 
PIVOT (MAX(sector) AS sector, MAX(ticker) AS ticker, MAX(company_name) AS company_name
 FOR (rn) IN (1 AS sector, 2 AS ticker, 3 AS company_name))
ORDER BY sector;

Upvotes: 3

Views: 12124

Answers (1)

Chris Schaller
Chris Schaller

Reputation: 16767

First thing to understand about pivots, you pick a single column in a result set to act as the as the PIVOT anchor, the hinge that the data will be pivoted around, this is specified in the FOR clause.

You can only PIVOT FOR a single column, but you can construct this column in a subquery or from joins or views as your target data query, OP has used ROW_NUMBER() but you can use any SQL mechanism you wish, even CASE statement to build a custom column to pivot around if there is no natural column within the dataset to use.

PIVOT will make a column for each value in the FOR column and will give that column the value of the aggregation function that you specify

It helps to visualise the constructed record set, before you apply the pivot, the following SQL can recreate the data scenario that OP has presented. I have used table variables here in place of OPs tables and views.

-- template13_ticker_data (with sector_char added)
DECLARE @tickerData Table
(
    sector INT,
    ticker CHAR(4),
    m_ticker CHAR(4),
    sector_char char(10)
)
-- template13_vw
DECLARE @Company Table
(
    m_ticker CHAR(4),
    ticker CHAR(4),
    company_name VARCHAR(100)
)

INSERT INTO @tickerData (sector, ticker)
VALUES (5 ,'ADNT')
,(5 ,'AUTO')
,(5 ,'THRM')
,(5 ,'ALSN')
,(5 ,'ALV')
,(12,'HES')
,(12,'AM')
,(12,'PHX')
,(12,'NBR')
,(12,'AMRC')


INSERT INTO @Company (ticker, company_name)
VALUES ('ADNT','Adient PLC')
,('AUTO','Autobytel Inc.')
,('THRM','Gentherm Inc')
,('ALSN','Allison Transmission Holdings, Inc.')
,('ALV ','Autoliv, Inc.')
,('HES ','Hess Corporation')
,('AM  ','Antero Midstrm')
,('PHX ','Panhandle Royalty Company')
,('NBR ','Nabors Industries Ltd.')
,('AMRC','Ameresco, Inc.')

-- Just re-creating a record set that matches the given data and query structure
UPDATE @tickerData SET m_ticker = ticker
UPDATE @Company SET m_ticker = ticker
-- populate 'sector_char' to show multiple aggregates
UPDATE @tickerData SET sector_char = '|' + cast(sector as varchar) + '|'

-- Unpivoted data Proof
SELECT d.sector, d.sector_char, d.ticker, v.company_name, ROW_NUMBER() OVER(PARTITION BY d.sector ORDER BY d.sector) rn
FROM @tickerData d, @Company v
WHERE d.m_ticker = v.m_ticker

The data before the pivot looks like this:

sector  sector_char ticker  company_name                           rn
------------------------------------------------------------------------
5       |5|         ADNT     Adient PLC                            1
5       |5|         AUTO     Autobytel Inc.                        2
5       |5|         THRM     Gentherm Inc                          3
5       |5|         ALSN     Allison Transmission Holdings, Inc.   4
5       |5|         ALV      Autoliv, Inc.                         5
12      |12|        HES      Hess Corporation                      1
12      |12|        AM       Antero Midstrm                        2
12      |12|        PHX      Panhandle Royalty Company             3
12      |12|        NBR      Nabors Industries Ltd.                4
12      |12|        AMRC     Ameresco, Inc.                        5

Now visualise a subset of the results that you are expecting, to show the limitations around multiple column operations I have created sector_char to include in the final output

sector  sector_char ticker_1    company_1          ticker_2   company_2
-----------------------------------------------------------------------------
5       |5|         ADNT        Adient PLC         AUTO       Autobytel Inc.
12      |12|        HES         Hess Corporation   AM         Antero Midstrm

Because we want more than 1 column output from the original row output, (ticker and company from each row) we have to use one of the following techniques:

  1. Concatenate the values from multiple columns into a single column
    • Only useful if you can easily split those columns before you need to use the individual values, or if you don't need to process the columns, it is purely for visualisations.
  2. execute multiple PIVOT queries and join the results
    • Necessary when the aggregation logic is different for each column, or you are not simply transposing a row value into a column value (aggregating multiple rows into a single cell response.)
    • In scenarios like this one, when we are just transposing the value (eg, the result of the aggregate will match the original cell value) I regard this as a bit of a hack but can also be less syntax than the alternative.

      I say hack because the core PIVOT logic is duplicated, which makes it harder to maintain as the query evolves.

  3. execute a single PIVOT on the unique column, join on other tables to build out the additional columns
    • This easily allows an unlimited number of additional rows in the output. The PIVOT resolves the ID of the table that holds the multiple values that we want to display in the final results.

Lets look at 3 first, as this demonstrates a single PIVOT and how to include multiple columns for each of the PIVOT results:

In this example I have allowed for up to 8 results for each sector, it is important to note that you MUST specify all the output columns from the PIVOT, it is not dynamic.

You could use dynamic queries to test for the max number of columns you need and generate out the following query based on those results.

Also note that in this solution, we do not need to join on the template13_vw table within the PIVOT source query, instead we have joined on the result, that is why the pivot is returning m_ticker (which I assume to be the key) instead of ticker that is displayed in the final result.

-- NOTE: using CTE here, you could use table variables, temporary tables or whatever else you need
;WITH TickersBySector as
(
    -- You must specify the fixed number of columns in the output
    SELECT sector, sector_char, [1] as [m_ticker_1],[2] as [m_ticker_2],[3] as [m_ticker_3],[4] as [m_ticker_4],[5] as [m_ticker_5],[6] as [m_ticker_6],[7] as [m_ticker_7],[8] as [m_ticker_8]
    FROM (
        SELECT d.sector, d.sector_char, d.m_ticker, ROW_NUMBER() OVER(PARTITION BY d.sector ORDER BY d.sector) rn
        FROM template13_ticker_data d /* OPs Syntax */
        -- FROM @tickerData d /* Use this with the proof table variables */
        ) data
    PIVOT (
      MAX(m_ticker)
      FOR rn IN ( [1],[2],[3],[4],[5],[6],[7],[8])
    ) as PivotTable
)
-- To use with the proof table variables, replace 'template13_vw' with '@Company'
SELECT sector, sector_char
    ,c1.[ticker] as [ticker_1], c1.company_name as [company_1] 
    ,c2.[ticker] as [ticker_2], c2.company_name as [company_2] 
    ,c3.[ticker] as [ticker_3], c3.company_name as [company_3] 
    ,c4.[ticker] as [ticker_4], c4.company_name as [company_4] 
    ,c5.[ticker] as [ticker_5], c5.company_name as [company_5] 
    ,c6.[ticker] as [ticker_6], c6.company_name as [company_6] 
    ,c7.[ticker] as [ticker_7], c7.company_name as [company_7] 
    ,c8.[ticker] as [ticker_8], c8.company_name as [company_8]
FROM TickersBySector
LEFT OUTER JOIN template13_vw c1 ON c1.m_ticker = TickersBySector.m_ticker_1
LEFT OUTER JOIN template13_vw c2 ON c2.m_ticker = TickersBySector.m_ticker_2
LEFT OUTER JOIN template13_vw c3 ON c3.m_ticker = TickersBySector.m_ticker_3
LEFT OUTER JOIN template13_vw c4 ON c4.m_ticker = TickersBySector.m_ticker_4
LEFT OUTER JOIN template13_vw c5 ON c5.m_ticker = TickersBySector.m_ticker_5
LEFT OUTER JOIN template13_vw c6 ON c6.m_ticker = TickersBySector.m_ticker_6
LEFT OUTER JOIN template13_vw c7 ON c7.m_ticker = TickersBySector.m_ticker_7
LEFT OUTER JOIN template13_vw c8 ON c8.m_ticker = TickersBySector.m_ticker_8

The following is the same query, using multiple PIVOT queries joins together. Notice that in this scenario it is not important that both PIVOTs bring back the additional common column sector_char, so use this style of syntax when the aggregate or the additional common column might be different for the different result sets.

;WITH TickersBySector as
(
    -- You must specify the fixed number of columns in the output
    SELECT sector, sector_char, [1] as [ticker_1],[2] as [ticker_2],[3] as [ticker_3],[4] as [ticker_4],[5] as [ticker_5],[6] as [ticker_6],[7] as [ticker_7],[8] as [ticker_8]
    FROM (
        SELECT d.sector, d.sector_char, d.m_ticker, ROW_NUMBER() OVER(PARTITION BY d.sector ORDER BY d.sector) rn
        FROM template13_ticker_data d /* OPs Syntax */
        -- FROM @tickerData d /* Use this with the proof table variables */
        ) data
    PIVOT (
      MAX(m_ticker)
      FOR rn IN ( [1],[2],[3],[4],[5],[6],[7],[8])
    ) as PivotTable
)
, CompanyBySector as
(
    -- You must specify the fixed number of columns in the output
    SELECT sector,[1] as [company_1],[2] as [company_2],[3] as [company_3],[4] as [company_4],[5] as [company_5],[6] as [company_6],[7] as [company_7],[8] as [company_8]
    FROM (
        SELECT d.sector, v.company_name, ROW_NUMBER() OVER(PARTITION BY d.sector ORDER BY d.sector) rn
        FROM template13_ticker_data d /* OPs Syntax */
        -- FROM @tickerData d /* Use this with the proof table variables */
        INNER JOIN template13_vw v /* OPs Syntax */
        -- INNER JOIN @Company v /* Use this with the proof table variables */
            ON d.m_ticker = v.m_ticker
        ) data
    PIVOT (
      MAX(company_name)
      FOR rn IN ( [1],[2],[3],[4],[5],[6],[7],[8])
    ) as PivotTable
)
SELECT TickersBySector.sector, sector_char
    ,[ticker_1], [company_1] 
    ,[ticker_2], [company_2] 
    ,[ticker_3], [company_3] 
    ,[ticker_4], [company_4] 
    ,[ticker_5], [company_5] 
    ,[ticker_6], [company_6] 
    ,[ticker_7], [company_7] 
    ,[ticker_8], [company_8]
FROM TickersBySector
INNER JOIN CompanyBySector ON TickersBySector.sector = CompanyBySector.sector

Upvotes: 2

Related Questions