es_analytics
es_analytics

Reputation: 13

PIVOT issue in SQL

I am not able to troubleshoot an issue with my pivot function in SQL. I want to preserve the uid column, but have only one row per uid in my result + columns for sc_2015, sc_2016, sc_2017, etc. (In the original table, there is one uid row per year per month + the score column.)

SELECT * FROM 
(
SELECT DISTINCT
        t1.uid,
        t2.score_year AS yr,
        --averaging scores across 12-months per year per uid:
        CAST(AVG(t2.score) OVER (PARTITION BY t1.uid,t2.score_year) AS DECIMAL(4,3)) AS avg_annual_score
FROM
    table1 t1
    INNER JOIN table2 t2
    ON t1.uid
    =
    t2.uid
WHERE t2.score_year IN (
  '2015',
  '2016',
  '2017',
  '2018',
  '2019')
)
PIVOT (
MAX(avg_annual_score) sc FOR yr IN (
  '2015',
  '2016',
  '2017',
  '2018',
  '2019')
);

Upvotes: 0

Views: 133

Answers (1)

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521579

I'm going to suggest ditching the PIVOT operator and instead just using an old school pivot query here:

SELECT
    t1.uid,
    AVG(CASE WHEN t2.score_year = 2015 THEN t2.score END) AS avg_score_2015,
    AVG(CASE WHEN t2.score_year = 2016 THEN t2.score END) AS avg_score_2016
    AVG(CASE WHEN t2.score_year = 2017 THEN t2.score END) AS avg_score_2017,
    AVG(CASE WHEN t2.score_year = 2018 THEN t2.score END) AS avg_score_2018
FROM table1 t1
LEFT JOIN table2 t2
    ON t1.uid = t2.uid
GROUP BY
    t1.uid;

The reason I suggest not using PIVOT is that in many cases the above version would actually perform better. And IMHO the code is a bit easier to read and maintain.

Edit:

A possibly write PIVOT version would look like this:

SELECT uid, [2015], [2016], [2017], [2018]
FROM
(
    SELECT t1.uid, t2.score, t2.score_year
    FROM table1 t1
    LEFT JOIN table2 t2
        ON t1.uid = t2.uid
    WHERE t2.score_year BETWEEN 2015 AND 2018
) AS source
PIVOT
(
    AVG(score)
    FOR score_year IN ([2015], [2016], [2017], [2018])
) AS pivotal;

Upvotes: 3

Related Questions