QV1
QV1

Reputation: 549

SQL Scalar UDF returns expected results and then returns a consistent weird value

Im using a SQL Scalar UDF to calculate the Weighted Moving Average for a particular stock. I created the following UDF [dbo].[fn_WeightedMovingAverageClosePriceCalculate]. (See below )

However, I get mixed results when calling the function. This is while executing both queries at the same time, but I'm getting different results. I took the code out of the function in a query, plugged in my test values and it works perfectly (WMA13 = 1540.8346). Would love to hear why I'm getting the value of 15.7313 as WMA13 in the second resultset, when both the queries are exactly the same.

enter image description here

ALTER FUNCTION [dbo].[fn_WeightedMovingAverageClosePriceCalculate]
(
@SecCode varchar(100),
@StartDateId int,
@MovingAverageCount int 
)
RETURNS decimal(18,4)
AS
BEGIN

--Generate the Weighting Factor
Declare @WeightingFactor as decimal(18,8)
Set @WeightingFactor  = (@MovingAverageCount*(@MovingAverageCount+1))/2  -- using the formula n(n+1)/2


-- Declare the return variable here
Declare @MovingAverage as decimal (18,4) 
Set @MovingAverage = 0
if @MovingAverageCount <> 0 
begin 
Select @MovingAverage  = SUM(ClosePrice*RowNum/@WeightingFactor)from 
(   
    Select ROW_NUMBER() OVER(order by BusinessDateId asc) AS RowNum , ClosePrice, BusinessDateId
    from 
    (

        Select TOP (@MovingAverageCount) ClosePrice, BusinessDateId 
                from dbo.BhavCopy
                where BusinessDateId <=@StartDateId
                and SecCode = @SecCode
                and Exchange  = 'NSE'
        order by BusinessDateId desc
    )d

)a
end 

Set @WeightingFactor = 0
Set @MovingAverageCount = 0


-- Return the result of the function
Return @MovingAverage

See Data that i'm working with : Data That Im working with

So there were 2 different execution plans, which surprised me as well. Right one - https://drive.google.com/file/d/1vPHbAS3X8Jmua8E5ReUgumsiUuovtL4p/view?usp=sharing

enter image description here

Wrong one - https://drive.google.com/file/d/180-Z3bMtzvV31En6z-zA-sVM_yPNyaQv/view?usp=sharing

enter image description here

Upvotes: 2

Views: 313

Answers (1)

Martin Smith
Martin Smith

Reputation: 453727

This is a bug with scalar UDF inlining and how it treats scalar aggregates in some cases (report).

The issue is that when inlined the execution plan contains a stream aggregate with

ANY(SUM(ClosePrice*CONVERT_IMPLICIT(decimal(19,0),[Expr1007],0)/[Expr1002]))

The nesting of SUM in ANY here is incorrect.

ANY is an internal aggregate that returns the first NOT NULL value that it finds (or NULL if none were found.

So in your case the stream aggregate receives its first row (very likely to be the one with the lowest BusinessDateId out of the 13 eligible) - calculates the SUM(ClosePrice*RowNum/@WeightingFactor) for that row, passes the partial aggregate result to ANY - which considers its work done and uses that as the final result. Any contribution to the SUM from the remaining 12 rows is lost.

You can add with inline = off to the function definition to disable it until the issue is fixed.

A simpler demo is below (tested on SQL Server 2019 RTM and RTM-CU2)

Setup

DROP TABLE IF EXISTS dbo.Numbers

GO

CREATE TABLE dbo.Numbers(Number INT UNIQUE CLUSTERED);

INSERT INTO dbo.Numbers VALUES (NULL), (23), (27), (50);

Demo 1

CREATE OR ALTER FUNCTION [dbo].[fnDemo1]()
RETURNS INT
AS
BEGIN
DECLARE @Result as int, @Zero as int = 0
SELECT @Result = SUM(Number + @Zero) from dbo.Numbers 
RETURN @Result
END

GO

DECLARE @Zero INT = 0
SELECT SUM(Number + @Zero) AS SanityCheck, 
         dbo.fnDemo1() AS FunctionResult
FROM dbo.Numbers 
OPTION (RECOMPILE) --I found the inlining happened more reliably with this

Demo 1 Results

+-------------+----------------+
| SanityCheck | FunctionResult |
+-------------+----------------+
|         100 |             23 |
+-------------+----------------+

enter image description here

All 4 rows were read from the clustered index in key order. After the first one was read the SUM was NULL. After the second one was read the SUM was 23. ANY can then stop and considers its work done. The remaining two rows were still read but don't contribute to the returned ANY(SUM()).

Demo 2

Without the intermediate @Result variable a spurious error is thrown

CREATE OR ALTER FUNCTION [dbo].[fnDemo2]()
RETURNS INT
AS
BEGIN
DECLARE @Zero as int = 0;
RETURN (SELECT SUM(Number + @Zero) from dbo.Numbers);
END

GO

Select dbo.fnDemo2()
OPTION (RECOMPILE) 

enter image description here

Msg 512, Level 16, State 1, Line xx

Subquery returned more than 1 value. This is not permitted when the subquery > > follows =, !=, <, <= , >, >= or when the subquery is used as an expression.

Only one row actually comes out of the stream aggregate but the stream aggregate also calculates a COUNT(*) along with the ANY(SUM()). This is not wrapped in ANY so totals 4 in this case. It is used in the assert operator to give the bogus error that too many rows will be returned.

Demo 3

CREATE OR ALTER FUNCTION [dbo].[fnDemo3]()
RETURNS INT
AS
BEGIN
DECLARE @Zero as int = 0;
RETURN (SELECT SUM(Number + @Zero) from dbo.Numbers GROUP BY ());
END

GO

Select dbo.fnDemo3()
OPTION (RECOMPILE) 

This generates a stack dump and a different error

Msg 8624, Level 16, State 17, Line xx

Internal Query Processor Error: The query processor could not produce a query plan. For more information, contact Customer Support Services.

Upvotes: 3

Related Questions