Metaphor
Metaphor

Reputation: 6415

Custom aggregate functions

How does one create a custom aggregate in SQL Server 2012? I'm trying to get a median aggregate.

Something like:

;with cte as 
(
   select top 50 percent val from tbl order by val desc
)
select top 1 from cte

Upvotes: 0

Views: 130

Answers (4)

Steve Kass
Steve Kass

Reputation: 7184

Here’s a kludgy way to turn 宮本 武蔵’s answer into a grouping query. I’ve used Microsoft’s Northwind..Orders table to show the median calculation for the Freight column.

SELECT CustomerID, AVG(1.0 * x.Freight), CASE WHEN COUNT(*)%2=1 THEN COUNT(*) ELSE COUNT(*)/2 END AS Cnt
FROM Orders as O1
CROSS APPLY (
    SELECT Freight FROM Orders as O2
    WHERE O2.CustomerID = O1.CustomerID
     ORDER BY Freight
     OFFSET (SELECT COUNT(*)-1  FROM Orders as O2 WHERE O2.CustomerID = O1.CustomerID) / 2 ROWS
     FETCH NEXT 1 + (1 - (SELECT COUNT(*) FROM Orders as O2 WHERE O2.CustomerID = O1.CustomerID) % 2) ROWS ONLY
) AS x
GROUP BY CustomerID

You may or may not need additional aggregates in your grouping query, and I’ve shown how to get COUNT(*). The CROSS APPLY operation produces one row per CustomerID with an odd number of orders, and it produces two rows per CustomerID with an even number of orders, so to get the COUNT(*) values, you have to divide the COUNT(*) taken after CROSS APPLY by 2 if it is even, and leave it as is if it is odd.

Not sure if this will help, and it bears careful study to be sure it’s correct, but maybe it’s a useful idea.

Upvotes: 0

Chains
Chains

Reputation: 13167

There is an interesting article that examines a number of different approaches to this problem, and also provides a performance analysis. It claims that percentile_cont() is the worst-performing, while this is the best:

DECLARE @c BIGINT = (SELECT COUNT(*) FROM dbo.EvenRows);

SELECT AVG(1.0 * val)
FROM (
    SELECT val FROM dbo.EvenRows
     ORDER BY val
     OFFSET (@c - 1) / 2 ROWS
     FETCH NEXT 1 + (1 - @c % 2) ROWS ONLY
) AS x;

Reference: http://www.sqlperformance.com/2012/08/t-sql-queries/median

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1271171

You should just use the built-in function, percentile_cont() or percentil_disc() (see here), depending on how you define the median for an even number of records.

I think the following does what you want:

select percentile_cont(0.5) within group (order by val) as median
from tbl;

This is much easier than defining your own function.

Upvotes: 2

user353gre3
user353gre3

Reputation: 2755

You may want to read the following article.

http://www.sqlperformance.com/2012/08/t-sql-queries/median

Upvotes: 1

Related Questions