Reputation: 2440
The tables range from 1M to 200M records. All tables in question have zero indexes on them, as they are staging/landing areas for our raw data sources. The purpose of the cast(NULL as varchar(max))
below was to handle one situation where the HASHBYTES input was a combination of varchar(255) columns that totaled more than 8,000 bytes, so a developer applied it to our entire loading process across the board.
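For context on why that cast was added at all: if none of CONCAT's arguments is a (max) type, SQL Server truncates the concatenated result to 8,000 bytes, so the tail of the hash input would be silently dropped; adding any varchar(max) argument, even a NULL, promotes the result to varchar(max). A quick illustration (the variable names are throwaway, just for demonstration):

DECLARE @a varchar(8000) = REPLICATE('x', 8000);
DECLARE @b varchar(255)  = 'tail';

SELECT
    DATALENGTH(CONCAT(@a, '|', @b))                             AS WithoutMaxArg, -- 8000: the '|tail' portion is truncated away
    DATALENGTH(CONCAT(@a, '|', @b, CAST(NULL AS varchar(MAX)))) AS WithMaxArg;    -- 8005: full input preserved

On SQL Server 2016 and later, HASHBYTES accepts a varchar(max) input, so the longer value can still be hashed.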
This HASHBYTES value is used to identify when a record has changed for the incremental load.
I attempted to reproduce the problem with small temp tables, but could not; a simplified sketch of what I tried follows the expression below.
CAST(
    HASHBYTES(
        'SHA2_256',
        CONCAT(
            [CTE].[TransactionDetailTypeCode],   --varchar(50)
            '|',
            [CTE].[TransactionDetailAmount],     --decimal(19,4)
            '|',
            [CTE].[OriginalTransactionPostDate], --date
            '|',
            [CTE].[SourceSystemCode],            --varchar(50)
            '|',
            [CTE].[EtlSourceTransaction],        --varchar(500)
            '|',
            [CTE].[EtlSourceTransactionDetail],  --varchar(500)
            '|',
            CAST(NULL AS VARCHAR(MAX))           --COMMENT THIS LINE OUT
        )
    ) AS varbinary(32)
) AS [EtlRowHashValue]
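For reference, my repro attempt was shaped roughly like the following (the temp tables and column names here are simplified placeholders, not our real schema), comparing the plan for the same statement with and without the CAST(NULL AS varchar(MAX)) argument; at this small scale I could not get the plan to degrade:

CREATE TABLE #stg (Id int NOT NULL, Code varchar(50), Amt decimal(19,4));
CREATE TABLE #tgt (Id int NOT NULL, EtlRowHashValue varbinary(32) NOT NULL);

INSERT INTO #stg (Id, Code, Amt) VALUES (1, 'A', 10.00), (2, 'B', 20.00);
INSERT INTO #tgt (Id, EtlRowHashValue) VALUES (1, 0x00), (2, 0x00);

-- Run once as-is and once with the CAST(NULL AS varchar(MAX)) argument removed,
-- then compare the estimated plans.
SELECT s.Id
FROM   #stg AS s
JOIN   #tgt AS t
       ON t.Id = s.Id
WHERE  t.EtlRowHashValue <> CAST(
           HASHBYTES(
               'SHA2_256',
               CONCAT(s.Code, '|', s.Amt, '|', CAST(NULL AS varchar(MAX)))
           ) AS varbinary(32));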
My findings are that the subtree cost with that line included is over 22 times higher than with it commented out. I also noticed it forcing a sort to feed the merge join, which I suspect is the cause of the abysmal performance. I'm assuming there's no easy answer here, but out of curiosity: can someone explain whether it's the use of HASHBYTES, the CONCAT, or some other reason that makes including cast(NULL as varchar(max)) change the query plan this drastically?
Is this caused by some under-the-hood behavior that I'm simply not aware of?
Upvotes: 2
Views: 79