Reputation: 1019
We have a SQL Server database which is used in SQL Server 2008 R2 and SQL Server 2019 instances.
It has around 60 tables with nullable ntext
columns. Out of those 60, 5 have content in that column. They have around 2k rows.
My questions are related to resources used/saved by switching to nvarchar
(deprecation of ntext
is not an argument in this case):
Will and why would converting ntext
column with all null values to nvarchar(max)
improve performance or reduce disk space used even theoretically?
Will and why would converting ntext
column with non null values to nvarchar(max)
improve performance or reduce disk space used even theoretically? Regarding performance this column is not used in any string manipulation or parsing (LEFT, LEN...)
Would answers to first 2 questions be any different if we were converting to nvarchar(250)
instead of nvarchar(max)
.
Among other answers and articles the closest to answer is: SQL Server, Converting NTEXT to NVARCHAR(MAX) which confirms that there are significant gains in the case of hundreds of fields and millions of rows which is far from my situation and more importantly does not answer what makes these gains possible?
Upvotes: 0
Views: 760
Reputation: 32609
There are numerous advantages to using (n)varchar(max) over (n)text.
As already pointed out for using nvarchar over ntext, with nvarchar, the data is stored in-row if it will fit and only stored off-row (with a pointer and requiring an additional lookup) when necessary; with ntext, data is always stored off-row regardless of actual data stored.
In addition to this, an often overlooked aspect of performance is the statistics and metadata usage by the query optimizer to generate an execution plan.
So specifically for your 3rd point, absolutely yes you will see an advantage.
When building an execution plan, SQL Server has to make some educated guesses and this requires knowing how much data is likely to be retrieved by a query. Part of that is driven by what it knows from the cardinality and statistics, and part is driven by what you have told SQL Server with the data types of the columns being queried.
For varchar data, SQL Server makes an estimate that roughly 50% of the data space will be used, so with nvarchar(250) it would guess each row returned would be 125 characters (plus a little overhead) and with nvarchar(max) (I believe) it assumes 4k.
This matters because SQL Server will use this to determine how much memory is allocated to execute the query (it's pre-determined and part of the execution plan), so by using (max) unecessarily you'll likely have unecessarily large memory grants which can be particularly harmful to concurrency.
For example, with several concurrent executions of a query that has overly large memory grants, other concurrent queries might be granted less than ideal memory which can cause operations such as hashes or sorts to spill to tempdb and increase IO costs.
The upshot is that the closer you can be to the actual real-world required maximum length, the better your server resources will be allocated.
SQL Server 2019 introduces Adaptive memory grants which can improve its memory usage estimates over repeated execution of a cached plan, but introduces other issues also.
Another point is the use of nvarchar over varchar - if you need the ability to store unicode data then of course you have no choice, but if you don't, varchar requires 50% less storage - both on disk and in ram - and subsequently requires a comparatively smaller memory grant too.
Upvotes: 1
Reputation: 415810
If most of the rows (including the contents of the ntext
fields) are smaller than 8kb, you improve performance by allowing the data to store directly with the table rather than in it's own page. That saves you one lookup almost every time you load one of these values.
This will also give you the ability to use the string functions on this field that would otherwise not work.
If most of the rows are > 8kb and you don't care about the string functions, you should still do the conversion... but I understanding wanting to save limited development resources for more valuable work.
Upvotes: 2