Agga
Agga

Reputation: 325

How to calculate 90th Percentile, SD, Mean for data in SQL

Hi I have a table facility. Which holds a score for each day (Multiple scores can be reported each day and both would be valid)

I need to calculate the 90th percentile, SD, and Mean for score by month.

Facility:

Id   Month Date  score
1    Jan     1    5
1    Jan     1    5
1    Jan     2    3
1    Jan     3    4
1    Jan     4    4
1    Jan     5    4
1    Feb     1    5
1    Feb     1    5
1    Feb     2    3
1    Feb     3    4
1    Feb     4    4
1    Feb     5    4

Is there any way?

Thanks for your help.

Upvotes: 27

Views: 127121

Answers (2)

LDMJoe
LDMJoe

Reputation: 1589

Here's the setup...

CREATE TABLE Facility (Id INT NOT NULL, Month nvarchar(3) NOT NULL, Date INT NOT NULL, score INT NOT NULL)

INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 2, 3)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 3, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 4, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 5, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 2, 3)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 3, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 4, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 5, 4)

Now, Standard Deviation and Mean are straight forward enough - there are built in aggregate functions for them...

SELECT
    [Month],
    AVG(CONVERT(real, score)) AS [Mean],
    STDEV(score) AS [Standard Deviation]
FROM
    Facility
GROUP BY
    [Month]

For your 90th percentile, you'll need to invent a function...

CREATE FUNCTION NintythPercentile(@Month nvarchar(3)) RETURNS INT AS
BEGIN
    DECLARE @ReturnValue INT

    SELECT 
        @ReturnValue = MIN(DerivedTopTenPercent.score) --AS [90th Percentile]
    FROM
        (
        SELECT TOP 10 PERCENT
            score
        FROM
            Facility
        WHERE
            [Month] = @Month
        ORDER BY
            score DESC
        ) DerivedTopTenPercent

    RETURN @ReturnValue
END

With that function in place, your final query will look like this...

SELECT
    [Month],
    AVG(CONVERT(real, score)) AS [Mean],
    STDEV(score) AS [Standard Deviation],
    dbo.NintythPercentile([Month]) AS [90th Percentile]
FROM
    Facility
GROUP BY
    [Month]

Upvotes: 3

Code Different
Code Different

Reputation: 93161

You can use the new suite of analytic functions introduced in SQL Server 2012:

SELECT DISTINCT
            [Month],
            Mean   = AVG(Score) OVER (PARTITION BY [Month]),
            StdDev = STDEV(Score) OVER (PARTITION BY [Month]),
            P90    = PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY Score) OVER (PARTITION BY [Month])
FROM        my_table

There are 2 percentile functions: PERCENTILE_CONT for continuous distribution and PERCENTILE_DISC for discrete distribution. Picks one that suits your needs.

Upvotes: 57

Related Questions