Reputation: 559
I'm having a very strange behavior this day. I'm experimenting a bit with SQL Server (2017) in-memory tables and column store indexes. But after creating a very small database (2 tables only) with very small data inside, i noticed that one of my queries returns wrong data.
Some background:
First table [Attribute] contains some hierarchical data (Id, ParentId). This table has a trigger which calculates some columns for Nested Set Model.
Based on this I created a view [vw_Attribute_Hierarchy] which returns a list of all attributes with their children nodes. I want to use this to summarize my data for each hierarchical node (so children values are included).
Second table [FactImpacts] contains only some data to summarize on columns [PlanValue] and [ActualValue].
I tried to create a schema with only the needed columns / data. But it seems to be that query results are dependent of the given data set.
This is currently my smallest script (please execute in SQL CMD mode) to reproduce my issue and it should return two results with THREE rows in each, but i get only one for the first query and three for the second query.
Script for creating the database and some content:
SET ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT, CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;
GO
:setvar DatabaseName "NEW-DB-823474692347"
:setvar DefaultFilePrefix "NEW-DB-823474692347"
:setvar DefaultDataPath "C:\Program Files\Microsoft SQL Server\MSSQL14.SQL2017\MSSQL\DATA\"
:setvar DefaultLogPath "C:\Program Files\Microsoft SQL Server\MSSQL14.SQL2017\MSSQL\DATA\"
GO
:on error exit
GO
/*
Detect SQLCMD mode and disable script execution if SQLCMD mode is not supported.
To re-enable the script after enabling SQLCMD mode, execute the following:
SET NOEXEC OFF;
*/
:setvar __IsSqlCmdEnabled "True"
GO
IF N'$(__IsSqlCmdEnabled)' NOT LIKE N'True'
BEGIN
PRINT N'SQLCMD mode must be enabled to successfully execute this script.';
SET NOEXEC ON;
END
GO
USE [master];
GO
IF (DB_ID(N'$(DatabaseName)') IS NOT NULL)
BEGIN
ALTER DATABASE [$(DatabaseName)]
SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DROP DATABASE [$(DatabaseName)];
END
GO
PRINT N'Creating $(DatabaseName)...'
GO
CREATE DATABASE [$(DatabaseName)]
ON
PRIMARY(NAME = [$(DatabaseName)], FILENAME = N'$(DefaultDataPath)$(DefaultFilePrefix)_Primary.mdf')
LOG ON (NAME = [$(DatabaseName)_log], FILENAME = N'$(DefaultLogPath)$(DefaultFilePrefix)_Primary.ldf') COLLATE Latin1_General_CI_AS
GO
IF EXISTS (SELECT 1
FROM [master].[dbo].[sysdatabases]
WHERE [name] = N'$(DatabaseName)')
BEGIN
ALTER DATABASE [$(DatabaseName)]
SET AUTO_CLOSE OFF
WITH ROLLBACK IMMEDIATE;
END
GO
PRINT N'Creating [FG_MemoryOptimizedData]...';
GO
ALTER DATABASE [$(DatabaseName)]
ADD FILEGROUP [FG_MemoryOptimizedData] CONTAINS MEMORY_OPTIMIZED_DATA;
GO
ALTER DATABASE [$(DatabaseName)]
ADD FILE (NAME = [FG_MemoryOptimizedData_76BAE8A0], FILENAME = N'$(DefaultDataPath)$(DefaultFilePrefix)_FG_MemoryOptimizedData_76BAE8A0.mdf') TO FILEGROUP [FG_MemoryOptimizedData];
GO
USE [$(DatabaseName)];
GO
PRINT N'Creating [dbo].[Attribute]...';
GO
CREATE TABLE [dbo].[Attribute] (
[TenantId] UNIQUEIDENTIFIER NOT NULL,
[AttributeId] UNIQUEIDENTIFIER NOT NULL,
[AttributeParentId] UNIQUEIDENTIFIER NULL,
[AttributeName] NVARCHAR (128) NOT NULL,
[LeftExtent] INT NOT NULL,
[RightExtent] INT NOT NULL,
[HierarchyLevel] INT NOT NULL,
[NodeCount] INT NOT NULL,
CONSTRAINT [PK_Attribute] PRIMARY KEY NONCLUSTERED ([AttributeId] ASC)
);
GO
PRINT N'Creating [dbo].[Attribute].[IX_Attribute_Rn]...';
GO
CREATE UNIQUE CLUSTERED INDEX [IX_Attribute_Rn]
ON [dbo].[Attribute]([TenantId] ASC, [AttributeName] ASC);
GO
PRINT N'Creating [dbo].[Attribute].[IX_Attribute_Name]...';
GO
CREATE UNIQUE NONCLUSTERED INDEX [IX_Attribute_Name]
ON [dbo].[Attribute]([AttributeName] ASC);
GO
PRINT N'Creating [dbo].[FactImpact]...';
GO
CREATE TABLE [dbo].[FactImpact] (
[Rn] BIGINT IDENTITY (1, 1) NOT NULL,
[TenantId] UNIQUEIDENTIFIER CONSTRAINT [DF_FactImpact_TenantId] DEFAULT '10000000-0000-0000-0000-000000000000' NOT NULL,
[ImpactId] UNIQUEIDENTIFIER CONSTRAINT [DF_FactImpact_ImpactId] DEFAULT NEWID() NOT NULL,
[AttributeId] UNIQUEIDENTIFIER NOT NULL,
[Date] DATE NOT NULL,
[PlanValue] DECIMAL (38, 10) NULL,
[ActualValue] DECIMAL (38, 10) NULL,
CONSTRAINT [PK_FactImpact_Rn] PRIMARY KEY NONCLUSTERED HASH ([Rn]) WITH (BUCKET_COUNT = 100000),
INDEX [COLIX_FactImpact] CLUSTERED COLUMNSTORE
)
WITH (MEMORY_OPTIMIZED = ON);
GO
PRINT N'Creating [dbo].[DF_Attribute_TenantId]...';
GO
ALTER TABLE [dbo].[Attribute]
ADD CONSTRAINT [DF_Attribute_TenantId] DEFAULT '10000000-0000-0000-0000-000000000000' FOR [TenantId];
GO
PRINT N'Creating [dbo].[DF_Attribute_Id]...';
GO
ALTER TABLE [dbo].[Attribute]
ADD CONSTRAINT [DF_Attribute_Id] DEFAULT NEWSEQUENTIALID() FOR [AttributeId];
GO
PRINT N'Creating [dbo].[DF_Attribute_LeftExtend]...';
GO
ALTER TABLE [dbo].[Attribute]
ADD CONSTRAINT [DF_Attribute_LeftExtend] DEFAULT 0 FOR [LeftExtent];
GO
PRINT N'Creating [dbo].[DF_Attribute_RightExtend]...';
GO
ALTER TABLE [dbo].[Attribute]
ADD CONSTRAINT [DF_Attribute_RightExtend] DEFAULT 0 FOR [RightExtent];
GO
PRINT N'Creating [dbo].[DF_Attribute_HierarchyLevel]...';
GO
ALTER TABLE [dbo].[Attribute]
ADD CONSTRAINT [DF_Attribute_HierarchyLevel] DEFAULT 0 FOR [HierarchyLevel];
GO
PRINT N'Creating [dbo].[DF_Attribute_ChildCount]...';
GO
ALTER TABLE [dbo].[Attribute]
ADD CONSTRAINT [DF_Attribute_ChildCount] DEFAULT 0 FOR [NodeCount];
GO
PRINT N'Creating [dbo].[FK_Attribute_ParentId]...';
GO
ALTER TABLE [dbo].[Attribute]
ADD CONSTRAINT [FK_Attribute_ParentId] FOREIGN KEY ([AttributeParentId]) REFERENCES [dbo].[Attribute] ([AttributeId]);
GO
PRINT N'Creating [dbo].[vw_Attribute_Hierarchy]...';
GO
CREATE VIEW [dbo].[vw_Attribute_Hierarchy]
AS
SELECT
[Root].[TenantId],
[Root].[AttributeId] AS [AttributeRootId],
[Root].[AttributeName] AS [AttributeRootName],
[Node].[AttributeId] AS [AttributeNodeId],
[Node].[AttributeName] AS [AttributeNodeName]
FROM [dbo].[Attribute] [Root]
JOIN [dbo].[Attribute] [Node] ON [Node].[LeftExtent] BETWEEN [Root].[LeftExtent] AND [Root].[RightExtent]
GO
PRINT N'Creating [dbo].[iusp_CalculateNestedSets]...';
GO
CREATE PROCEDURE [dbo].[iusp_CalculateNestedSets]
(
@SchemaName SYSNAME,
@TableName SYSNAME,
@GroupColumnName SYSNAME,
@IdColumnName SYSNAME = 'Id',
@ParentIdColumnName SYSNAME = 'ParentId'
)
AS
-- http://www.sqlservercentral.com/articles/Hierarchy/94040/
SET NOCOUNT ON
CREATE TABLE [#HTally]([N] INT NOT NULL PRIMARY KEY)
INSERT INTO [#HTally]([N])
SELECT TOP 500 --(16 * 500 = VARBINARY(84000) in length)
N = ISNULL(CAST(
(ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1)*16+1
AS INT),0)
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
--===== Build the new table on-the-fly including some place holders
;WITH cteBuildPath AS
(
--=== This is the "anchor" part of the recursive CTE.
-- The only thing it does is load the Root Node.
SELECT
[ANCHOR].[TenantId],
[ANCHOR].[AttributeId],
[ANCHOR].[AttributeParentId],
HLevel = 1,
[SortPath] = CAST(
CAST([ANCHOR].[AttributeId] AS BINARY(16))
AS VARBINARY(8000)) --Up to 1000 levels deep.
FROM [dbo].[Attribute] AS anchor
WHERE [AttributeParentId] IS NULL --Only the Root Node has a NULL [ParentId]
UNION ALL
--==== This is the "recursive" part of the CTE that adds 1 for each level
-- and concatenates each level of [Id]'s to the [SortPath] column.
SELECT
[RECUR].[TenantId],
[RECUR].[AttributeId],
[RECUR].[AttributeParentId],
HLevel = cte.HLevel + 1,
[SortPath] = CAST( --This does the concatenation to build [SortPath]
cte.[SortPath] + CAST([RECUR].[AttributeId] AS BINARY(16))
AS VARBINARY(8000))
FROM [dbo].[Attribute] AS recur WITH (TABLOCK)
INNER JOIN cteBuildPath AS cte ON cte.[AttributeId] = [RECUR].[AttributeParentId]
)
--=== This final INSERT/SELECT creates the Node # in the same order as a
-- push-stack would. It also creates the final table with some
-- "reserved" columns on the fly. We'll leave the [SortPath] column in
-- place because we're still going to need it later.
-- The ISNULLs make NOT NULL columns
SELECT
[AttributeId] = ISNULL([SORTED].[AttributeId], '00000000-0000-0000-0000-000000000000'),
[SORTED].[AttributeParentId],
[HLevel] = ISNULL([SORTED].HLevel,0),
[LeftExtent] = ISNULL(CAST(0 AS INT),0), --Place holder
[RightExtent] = ISNULL(CAST(0 AS INT),0), --Place holder
[NodeNumber] = ROW_NUMBER() OVER (PARTITION BY [SORTED].[TenantId] ORDER BY [SORTED].[SortPath]),
[NodeCount] = ISNULL(CAST(0 AS INT),0), --Place holder
[SortPath] = ISNULL([SORTED].[SortPath],[SORTED].[SortPath])
INTO [#Hierarchy]
FROM cteBuildPath AS sorted
OPTION (MAXRECURSION 500) --Change this IF necessary
;
--===== Declare a working variable to hold the result of the calculation
-- of the LeftExtent so that it may be easily used to create the
-- RightExtent in a single scan of the final table.
DECLARE @LeftExtent INT
--===== Create the Nested Sets from the information available in the table
-- and in the following CTE. This uses the proprietary form of UPDATE
-- available in SQL Serrver for extra performance.
;WITH cteCountDownlines AS
(
--=== Count each occurance of [Id] in the sort path
SELECT
[AttributeId] = CAST(SUBSTRING(h.[SortPath],[T].N,16) AS UNIQUEIDENTIFIER),
[NodeCount] = COUNT(*) --Includes current node
FROM [#Hierarchy] h, [#HTally] t
WHERE [T].N BETWEEN 1 AND DATALENGTH([SortPath])
GROUP BY SUBSTRING(h.[SortPath],[T].N, 16)
) --=== Update the NodeCount and calculate both Bowers
UPDATE [SourceTable]
SET @LeftExtent = [SourceTable].[LeftExtent] = 2 * h.[NodeNumber] - h.[HLevel],
[SourceTable].[RightExtent] = ([downline].[NodeCount] - 1) * 2 + @LeftExtent + 1,
[SourceTable].[NodeCount] = [downline].[NodeCount],
[SourceTable].[HierarchyLevel] = h.[HLevel]
FROM [dbo].[Attribute] AS [SourceTable]
JOIN [#Hierarchy] h ON h.[AttributeId] = [SourceTable].[AttributeId]
JOIN cteCountDownlines downline ON h.[AttributeId] = [downline].[AttributeId]
WHERE NOT (
[SourceTable].[LeftExtent] = 2 * h.[NodeNumber] - h.[HLevel]
AND [SourceTable].[RightExtent] = ([downline].[NodeCount] - 1) * 2 + @LeftExtent + 1
AND [SourceTable].[NodeCount] = [downline].[NodeCount]
AND [SourceTable].[HierarchyLevel] = h.[HLevel]
)
;
RETURN 0
GO
PRINT N'Creating [dbo].[tr_Attribute_IUD]...';
GO
CREATE TRIGGER [dbo].[tr_Attribute_IUD]
ON [dbo].[Attribute]
FOR INSERT, UPDATE, DELETE
AS
BEGIN
SET NOCOUNT ON
IF UPDATE([AttributeId]) OR UPDATE([AttributeParentId]) BEGIN
EXEC [dbo].[iusp_CalculateNestedSets] @SchemaName = 'dbo', @TableName = 'Attribute', @GroupColumnName = 'TenantId', @IdColumnName = 'AttributeId', @ParentIdColumnName = 'AttributeParentId'
END
END
GO
/*
Post-Deployment Script Template
--------------------------------------------------------------------------------------
This file contains SQL statements that will be appended to the build script.
Use SQLCMD syntax to include a file in the post-deployment script.
Example: :r .\myfile.sql
Use SQLCMD syntax to reference a variable in the post-deployment script.
Example: :setvar TableName MyTable
SELECT * FROM [$(TableName)]
--------------------------------------------------------------------------------------
*/
DECLARE @ItemsPerHierarchyLevel INT = 2
DECLARE @HierarchyLevels INT = 3
-- ATTRIBUTE DATA
;WITH [Source]
AS
(
SELECT TOP (@ItemsPerHierarchyLevel)
ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS [RN]
FROM [sys].[syscolumns]
),
[CTE]
AS
(
SELECT
NEWID() AS [AttributeId],
CAST(NULL AS UNIQUEIDENTIFIER) AS [AttributeParentId],
CAST('Attribute-' + CAST([Source].[RN] AS NVARCHAR(128)) AS NVARCHAR(128)) AS [AttributeName],
1 AS [Level]
FROM [Source]
UNION ALL
SELECT
NEWID() AS [AttributeId],
[CTE].[AttributeId] AS [AttributeParentId],
CAST([CTE].[AttributeName] + '.' + CAST([Source].[RN] AS NVARCHAR(128)) AS NVARCHAR(128)) AS [AttributeName],
[CTE].[Level] + 1 AS [Level]
FROM [Source]
JOIN [CTE] ON [CTE].[Level] < @HierarchyLevels
)
INSERT INTO [dbo].[Attribute]([AttributeId], [AttributeParentId], [AttributeName])
SELECT
[CTE].[AttributeId],
[CTE].[AttributeParentId],
[CTE].[AttributeName]
FROM [CTE]
ORDER BY [AttributeName]
;
-- IMPACT DATA
;WITH [RN]
AS
(
SELECT TOP (24)
ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS [RN]
FROM [sys].[syscolumns]
),
[Date]
AS
(
SELECT DATEADD(MONTH, [RN].[RN] - 1, DATEFROMPARTS(2019, 1, 1)) AS [Date]
FROM [RN]
)
INSERT INTO [dbo].[FactImpact]([AttributeId], [Date], [PlanValue], [ActualValue])
SELECT
[Attribute].[AttributeId],
[Date].[Date],
1 AS [PlanValue],
1 AS [ActualValue]
FROM [dbo].[Attribute],
[dbo].[Attribute] [Attribute2], -- ONLY TO DUPLICATE DATA
[dbo].[Attribute] [Attribute3], -- ONLY TO DUPLICATE DATA
[Date]
;
GO
PRINT N'Update complete.';
GO
USE [NEW-DB-823474692347]
GO
Following two queries should return always the same result:
USE [NEW-DB-823474692347]
GO
SELECT
[vw_Attribute_Hierarchy].[AttributeRootId],
[vw_Attribute_Hierarchy].[AttributeRootName],
SUM([FactImpact].[PlanValue]) AS [SumFactImpacts]
FROM
[dbo].[FactImpact] INNER JOIN [vw_Attribute_Hierarchy] [vw_Attribute_Hierarchy] ON [vw_Attribute_Hierarchy].[AttributeNodeId] = [FactImpact].[AttributeId]
WHERE [vw_Attribute_Hierarchy].[AttributeNodeName] = 'Attribute-1.1.1'
GROUP BY [vw_Attribute_Hierarchy].[AttributeRootId]
, [vw_Attribute_Hierarchy].[AttributeRootName]
GO
SELECT
[vw_Attribute_Hierarchy].[AttributeRootId],
[vw_Attribute_Hierarchy].[AttributeRootName],
SUM([FactImpact].[PlanValue]) AS [SumFactImpacts]
FROM
-- SWITCH LEFT AND RIGHT TABLE ONLY
[vw_Attribute_Hierarchy] [vw_Attribute_Hierarchy] INNER JOIN [dbo].[FactImpact] ON [vw_Attribute_Hierarchy].[AttributeNodeId] = [FactImpact].[AttributeId]
WHERE [vw_Attribute_Hierarchy].[AttributeNodeName] = 'Attribute-1.1.1'
GROUP BY [vw_Attribute_Hierarchy].[AttributeRootId]
, [vw_Attribute_Hierarchy].[AttributeRootName]
GO
The strange thing is, if you do one of the following changes, both queries returns always the same correct data:
1) I toggled the tables in the INNER JOIN for the last two SELECT statements. If [FactImpact] is on the second place, the result is correct.
2) Remove the In-Memory option on [FactImpacts]
3) Remove the clustered column store index on [FactImpacts]
4) Reducing or increasing the amount of data in [FactImpacts]
DELETE [dbo].[FactImpact] WHERE [AttributeId] IN (SELECT [AttributeId] FROM [Attribute] WHERE [Attribute].[AttributeName] LIKE 'Attribute-2%')
5) Adding one more index on table [Attributes]
CREATE INDEX [IX_Attribute_Nested_Set] ON [dbo].[Attribute] ([LeftExtent], [RightExtent])
I hope you can follow my explanations!
I was able to reproduce this strange behavior locally in SQL Server 2016 / 2017.
Upvotes: 0
Views: 198