Reputation: 1354
If a SQL Server 2012 table has 2 varchar(max)
columns that are rarely used and causing a table to grow rapidly, does it make sense to split into a separate table?
Data in the two varchar(max)
columns is used to store raw JSON responses from an API.
If stored in a separate table, rows would be truncated to only include previous 6 months, reducing table size of first table.
Upvotes: 2
Views: 1126
Reputation: 294487
If stored in a separate table, rows would be truncated to only include previous 6 months, reducing table size of first table.
The rows would have to be deleted, not truncated, and then the BLOB space would have to be reclaimed by running ALTER INDEX ... REORGANIZE WITH (LOB_COMPACTION = ON)
If you'd store instead the blobs in the original table, you would have to update the rows to SET blob = NULL
and then reclaim the space with ALTER INDEX ... REORGANIZE WITH (LOB_COMPACTION = ON)
So when it boils down to the details, you aren't achieving much using a split table, imho. So I stick to my earlier advice from SQL Server varbinary(max) and varchar(max) data in a separate table: I see no benefits in split, but I see trouble from having to maintain the rows consistent 1:1 between the splits.
You may have a case if you split and partition the 'blobs' table. Then you could, indeed, deallocate very efficiently the old space by switching 'out' the old partition and replacing them with an empty one, and then dropping the switched out data. That is something to consider. Of course, you code would have to be smart enough when it joins the two 'splits' to consider that the blobs may be gone (ee. use OUTER JOIN).
Upvotes: 4
Reputation: 3935
Just to give you a hint. You can use SPARSE
on a rarely used column.
Example:
CREATE TABLE myTable (
id int identity(1,1),
name nvarchar(100),
blob nvarchar(max) SPARSE,
blob2 nvarchar(max) SPARSE
)
Sparse will just left a small marker inside the page. But it's mostly bad practice to have nvarchar(max)
. Is it really needed?
You can read more about it here.
Upvotes: 0
Reputation: 1271171
This question is an almost-duplicate, but not quite.
To a large extent, the varchar(max)
columns are stored off-page. If you do not refer to them in a query, then they are not loaded. So, they provide little additional overhead.
The situation is different if the values are stored on the same page. Then removing them can shrink the size of your table and fewer pages should be faster for select
queries. Here is an interesting blog on this subject.
You have an additional constraint, in terms of truncating the table. That would seem like a win for many queries, so I think that tips the argument toward putting the values in a separate table.
Note, however, that truncation is not the only possible solution. You could also partition the data (say by month or week). If the partitioning key is part of all queries, then you can use partitioning instead of truncation and also reduce the effective size of the table.
Upvotes: 0