Reputation: 11
Just wondering anyone knows the formula behind STDEV() (for any version of sql server), or why standard deviation for values such as these decimals are different.
My goal is to explain this difference in calculation to our QA department who can then explain to our customer.
SQL function STDEV() for decimals doesn't seem to EVER match excel, non decimal integers seem to match no problem.
I am trying to calculate a deviation for a sample, not a population.
Example values: 99.99991, 99.99992
Excel STDEV (or any online standard deviation calculator): 7.07107E-06
SQL: 7.13664510111607E-06
I have tried multiple versions of SQL server, and all variations of this function in SQL and Excel 2007 and 2019, I can never get them to match with these values.
SQL is obviously doing something slightly different compared to calculations on this website:
https://www.mathsisfun.com/data/standard-deviation-calculator.html
tSQL to reproduce:
declare @table table (theNumber float)
insert into @table (theNumber) Values (99.99991), (99.99992)
select STDEV(theNumber) from @table
Any help would be much appreciated!
Thanks!
Edit: Here is a function to use if anyone else runs into this issue:
CREATE FUNCTION[dbo].[StandardDevationSample](@Values as TVP_FLOAT READONLY) RETURNS DECIMAL(12,6)
AS
BEGIN
DECLARE @ItemCount INT,
@Sum FLOAT,
@Mean FLOAT,
@SumOfDifferencesSquared FLOAT,
@Variance FLOAT
-------------------------------------------------------------------------------
DECLARE @Differences TABLE
(
[Value] FLOAT
)
DECLARE @DifferencesSquared TABLE
(
[Value] FLOAT
)
SELECT @ItemCount = (SELECT Count(ID) FROM @Values)
IF(@ItemCount <= 1)
RETURN 0
SELECT @Sum = (SELECT Sum(ID) FROM @Values)
SELECT @Mean = (@Sum / @ItemCount)
INSERT INTO @Differences ([value])
SELECT (ID - @Mean)
FROM @Values
INSERT INTO @DifferencesSquared ([value])
SELECT Square([value])
FROM @Differences
SELECT @SumOfDifferencesSquared = Sum([value])
FROM @DifferencesSquared
SELECT @Variance = @SumOfDifferencesSquared / (@ItemCount - 1)
RETURN Convert(Decimal(18,9), Sqrt(@Variance))
RETURN 0
END
GO
Upvotes: 1
Views: 710
Reputation: 239684
If the inputs are only known to 5 decimal places, then using any decimals in the standard deviation result is questionable at best.
Consider:
declare @table table (groupNo int, theNumber float)
insert into @table (groupNo, theNumber) Values
(1, 99.9999051), (1,99.9999249),
(2, 99.99991), (2,99.99992),
(3, 99.9999149), (3,99.9999151)
select groupNo,ROUND(theNumber,5) from @table
select groupNo,STDEV(theNumber) from @table group by groupNo
The first result set is this:
groupNo (No column name)
1 99.99991
1 99.99992
2 99.99991
2 99.99992
3 99.99991
3 99.99992
The second result set is this:
groupNo (No column name)
1 1.40160927359572E-05
2 7.13664510111607E-06
3 1.9073486328125E-06
So I'd suggest the most you should present to your users for this calculation would be 7E-06
, and even that is sketchy, and at which point both SQL and Excel are in agreement.
Upvotes: 0