Reputation: 549
Im using a SQL Scalar UDF to calculate the Weighted Moving Average for a particular stock. I created the following UDF [dbo].[fn_WeightedMovingAverageClosePriceCalculate]. (See below )
However, I get mixed results when calling the function. This is while executing both queries at the same time, but I'm getting different results. I took the code out of the function in a query, plugged in my test values and it works perfectly (WMA13 = 1540.8346). Would love to hear why I'm getting the value of 15.7313 as WMA13 in the second resultset, when both the queries are exactly the same.
ALTER FUNCTION [dbo].[fn_WeightedMovingAverageClosePriceCalculate]
(
@SecCode varchar(100),
@StartDateId int,
@MovingAverageCount int
)
RETURNS decimal(18,4)
AS
BEGIN
--Generate the Weighting Factor
Declare @WeightingFactor as decimal(18,8)
Set @WeightingFactor = (@MovingAverageCount*(@MovingAverageCount+1))/2 -- using the formula n(n+1)/2
-- Declare the return variable here
Declare @MovingAverage as decimal (18,4)
Set @MovingAverage = 0
if @MovingAverageCount <> 0
begin
Select @MovingAverage = SUM(ClosePrice*RowNum/@WeightingFactor)from
(
Select ROW_NUMBER() OVER(order by BusinessDateId asc) AS RowNum , ClosePrice, BusinessDateId
from
(
Select TOP (@MovingAverageCount) ClosePrice, BusinessDateId
from dbo.BhavCopy
where BusinessDateId <=@StartDateId
and SecCode = @SecCode
and Exchange = 'NSE'
order by BusinessDateId desc
)d
)a
end
Set @WeightingFactor = 0
Set @MovingAverageCount = 0
-- Return the result of the function
Return @MovingAverage
See Data that i'm working with :
So there were 2 different execution plans, which surprised me as well. Right one - https://drive.google.com/file/d/1vPHbAS3X8Jmua8E5ReUgumsiUuovtL4p/view?usp=sharing
Wrong one - https://drive.google.com/file/d/180-Z3bMtzvV31En6z-zA-sVM_yPNyaQv/view?usp=sharing
Upvotes: 2
Views: 313
Reputation: 453727
This is a bug with scalar UDF inlining and how it treats scalar aggregates in some cases (report).
The issue is that when inlined the execution plan contains a stream aggregate with
ANY(SUM(ClosePrice*CONVERT_IMPLICIT(decimal(19,0),[Expr1007],0)/[Expr1002]))
The nesting of SUM
in ANY
here is incorrect.
ANY
is an internal aggregate that returns the first NOT NULL
value that it finds (or NULL
if none were found.
So in your case the stream aggregate receives its first row (very likely to be the one with the lowest BusinessDateId
out of the 13 eligible) - calculates the SUM(ClosePrice*RowNum/@WeightingFactor)
for that row, passes the partial aggregate result to ANY
- which considers its work done and uses that as the final result. Any contribution to the SUM
from the remaining 12 rows is lost.
You can add with inline = off
to the function definition to disable it until the issue is fixed.
A simpler demo is below (tested on SQL Server 2019 RTM and RTM-CU2)
DROP TABLE IF EXISTS dbo.Numbers
GO
CREATE TABLE dbo.Numbers(Number INT UNIQUE CLUSTERED);
INSERT INTO dbo.Numbers VALUES (NULL), (23), (27), (50);
CREATE OR ALTER FUNCTION [dbo].[fnDemo1]()
RETURNS INT
AS
BEGIN
DECLARE @Result as int, @Zero as int = 0
SELECT @Result = SUM(Number + @Zero) from dbo.Numbers
RETURN @Result
END
GO
DECLARE @Zero INT = 0
SELECT SUM(Number + @Zero) AS SanityCheck,
dbo.fnDemo1() AS FunctionResult
FROM dbo.Numbers
OPTION (RECOMPILE) --I found the inlining happened more reliably with this
Demo 1 Results
+-------------+----------------+
| SanityCheck | FunctionResult |
+-------------+----------------+
| 100 | 23 |
+-------------+----------------+
All 4 rows were read from the clustered index in key order. After the first one was read the SUM
was NULL
. After the second one was read the SUM
was 23
. ANY
can then stop and considers its work done. The remaining two rows were still read but don't contribute to the returned ANY(SUM())
.
Without the intermediate @Result
variable a spurious error is thrown
CREATE OR ALTER FUNCTION [dbo].[fnDemo2]()
RETURNS INT
AS
BEGIN
DECLARE @Zero as int = 0;
RETURN (SELECT SUM(Number + @Zero) from dbo.Numbers);
END
GO
Select dbo.fnDemo2()
OPTION (RECOMPILE)
Msg 512, Level 16, State 1, Line xx
Subquery returned more than 1 value. This is not permitted when the subquery > > follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
Only one row actually comes out of the stream aggregate but the stream aggregate also calculates a COUNT(*)
along with the ANY(SUM())
. This is not wrapped in ANY
so totals 4
in this case. It is used in the assert operator to give the bogus error that too many rows will be returned.
CREATE OR ALTER FUNCTION [dbo].[fnDemo3]()
RETURNS INT
AS
BEGIN
DECLARE @Zero as int = 0;
RETURN (SELECT SUM(Number + @Zero) from dbo.Numbers GROUP BY ());
END
GO
Select dbo.fnDemo3()
OPTION (RECOMPILE)
This generates a stack dump and a different error
Msg 8624, Level 16, State 17, Line xx
Internal Query Processor Error: The query processor could not produce a query plan. For more information, contact Customer Support Services.
Upvotes: 3