Reputation: 1854
I'm using SQL Server 2014
I am trying to remove some noise in a dataset by taking the average of all values in the 90th percentile of a group. Here is the query:
SELECT
DISTINCT EventLocation,
PERCENTILE_CONT(.90)
WITHIN GROUP (ORDER BY (DATEDIFF(MINUTE, StartTime, EndTime)) ASC)
OVER (PARTITION BY EventLocation)
AS 'P90',
AVG(DATEDIFF(MINUTE, StartTime, EndTime))
OVER (PARTITION BY EventLocation)
AS 'Mean'
FROM MyTable
ORDER BY N DESC
Currently there are 2 calculated columns:
PARTITION
population)PARTITION
population)I Want to add another column for:
PARTITION
population) <= the 90th percentile value (of that PARTITION
population)Something like:
AVG(DATEDIFF(MINUTE, StartTime, EndTime))
OVER (PARTITION BY EventLocation)
HAVING (DATEDIFF(MINUTE, StartTime, EndTime) <= [ 90th percentile value ])
AS 'Mean90'
I'm not exactly sure how to approach this since it is referencing the 90th percentile value that was just deifned in P90
...maybe a user-defined function applied group-wise, creating multiple tables and joining them, or something else.
Upvotes: 1
Views: 10937
Reputation: 53
As Gordon said, a CTE is a common way to solve a problem like this. Store the results of your original query in the CTE, then select the content of the CTE and add the work you want done using the column aliases you defined.
;WITH IntermediateResults AS (
SELECT
DISTINCT EventLocation,
PERCENTILE_CONT(.90)
WITHIN GROUP (ORDER BY (DATEDIFF(MINUTE, StartTime, EndTime)) ASC)
OVER (PARTITION BY EventLocation)
AS 'P90',
AVG(DATEDIFF(MINUTE, StartTime, EndTime))
OVER (PARTITION BY EventLocation)
AS 'Mean'
FROM MyTable
ORDER BY N DESC
)
SELECT
*,
AVG(DATEDIFF(MINUTE, StartTime, EndTime))
OVER (PARTITION BY EventLocation)
HAVING (DATEDIFF(MINUTE, StartTime, EndTime) <= P90)
AS 'Mean90'
FROM IntermediateResults
Upvotes: 4