Wim ten Brink

Reputation: 26682

Optimizing a table with a huge text-field

I have a project that generates snapshots of a database, converts them to XML and then stores the XML inside a separate database. Unfortunately, these snapshots are becoming huge files, now about 10 megabytes each. Fortunately, I only have to keep them for about a month before they can be discarded, but still, a month's worth of snapshots turns out to be really bad for the database's performance...

I think there should be a way to improve performance considerably. No, not by storing the XML in a separate folder somewhere, because I don't have write access to any location on that server. The XML must stay within the database. But perhaps the [Content] field itself can be optimized somehow so things speed up...
I won't need any full-text search options on this field, and I will never search on it. So perhaps it can be excluded from indexing or searching somehow?

The table has no references to other tables, but its structure is fixed. I cannot rename things or change the field types, so I wonder whether any optimization is still possible.
Well, is it?


The structure, as generated by SQL Server:

CREATE TABLE [dbo].[Snapshots](
    [Identity] [int] IDENTITY(1,1) NOT NULL,
    [Header] [varchar](64) NOT NULL,
    [Machine] [varchar](64) NOT NULL,
    [User] [varchar](64) NOT NULL,
    [Timestamp] [datetime] NOT NULL,
    [Comment] [text] NOT NULL,
    [Content] [text] NOT NULL,
 CONSTRAINT [PK_SnapshotLog] 
    PRIMARY KEY CLUSTERED ([Identity] ASC) 
    WITH (PAD_INDEX  = OFF, 
    STATISTICS_NORECOMPUTE  = OFF, 
    IGNORE_DUP_KEY = OFF, 
    ALLOW_ROW_LOCKS  = ON, 
    ALLOW_PAGE_LOCKS  = ON, 
    FILLFACTOR = 90) ON [PRIMARY],
 CONSTRAINT [IX_SnapshotLog_Header] 
    UNIQUE NONCLUSTERED ([Header] ASC) 
    WITH (PAD_INDEX  = OFF, 
    STATISTICS_NORECOMPUTE  = OFF, 
    IGNORE_DUP_KEY = OFF, 
    ALLOW_ROW_LOCKS  = ON, 
    ALLOW_PAGE_LOCKS  = ON, 
    FILLFACTOR = 90) 
    ON [PRIMARY],
 CONSTRAINT [IX_SnapshotLog_Timestamp] 
    UNIQUE NONCLUSTERED ([Timestamp] ASC)
    WITH (PAD_INDEX = OFF, 
    STATISTICS_NORECOMPUTE = OFF, 
    IGNORE_DUP_KEY = OFF, 
    ALLOW_ROW_LOCKS = ON, 
    ALLOW_PAGE_LOCKS = ON, 
    FILLFACTOR = 90) 
    ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]


Performance isn't just slow when selecting data from this table, but also when selecting or inserting data in any of the other tables in this database! When I delete all records from this table, the whole system is fast. When I start adding snapshots, performance starts to decrease. After about 30 snapshots, performance becomes bad and the risk of connection timeouts increases.
Maybe the problem isn't in the database itself, although it's still slow when used through the management tool. (It's fast when Snapshots is empty.) I mainly use ASP.NET 3.5 and the Entity Framework to connect to this database and read its multiple tables. Maybe some performance can be gained there, although that wouldn't explain why the database is also slow from the management tools and when used by other applications with a direct connection...

Upvotes: 2

Views: 305

Answers (3)

Wim ten Brink

Reputation: 26682

The whole system became a lot faster when I replaced the TEXT datatype with the NVARCHAR(MAX) datatype. HLGEM pointed out to me that the TEXT datatype is outdated and thus troublesome. It's still a question whether the datatype of these columns can be replaced that easily with the more modern one, though. (Translated: I need to test whether the code will still work with the altered datatype...)
So, if I were to alter the datatype from TEXT to NVARCHAR(MAX), would anything break because of this? Are there problems I can expect?
Right now this seems to solve the problem, but I need to do some lobbying before I'm allowed to make this change. So I need to be really sure it won't cause any (unexpected) problems.
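For reference, this is roughly the change I have in mind (a sketch only, not yet tested against the real table):

ALTER TABLE [dbo].[Snapshots] ALTER COLUMN [Comment] NVARCHAR(MAX) NOT NULL;
ALTER TABLE [dbo].[Snapshots] ALTER COLUMN [Content] NVARCHAR(MAX) NOT NULL;

-- Optional: after the conversion the existing values still live on the old LOB pages;
-- an in-place update rewrites them so values small enough to fit move back in-row.
UPDATE [dbo].[Snapshots] SET [Comment] = [Comment], [Content] = [Content];

One thing to keep in mind: NVARCHAR(MAX) stores Unicode at two bytes per character, so VARCHAR(MAX) would be the like-for-like replacement for TEXT if the XML never needs Unicode.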

Upvotes: 0

amit_g

Reputation: 31250

The table is in the PRIMARY filegroup. Could you move this table to a different filegroup, or is even that constrained? If you can, you should move it to a different filegroup with its own physical file. That should help a lot. Check out how to create a new filegroup and move an object to it.
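Something along these lines, as a sketch (the database name, filegroup name and file path are placeholders; also note that the LOB pages of the two [text] columns stay in the filegroup named by TEXTIMAGE_ON, so relocating those would mean recreating the table):

ALTER DATABASE [YourDatabase] ADD FILEGROUP [Snapshots_FG];
ALTER DATABASE [YourDatabase]
    ADD FILE (NAME = N'Snapshots_FG_1',
              FILENAME = N'D:\Data\Snapshots_FG_1.ndf',
              SIZE = 512MB, FILEGROWTH = 256MB)
    TO FILEGROUP [Snapshots_FG];

-- Rebuilding the clustered index on the new filegroup moves the table's in-row data.
CREATE UNIQUE CLUSTERED INDEX [PK_SnapshotLog]
    ON [dbo].[Snapshots] ([Identity] ASC)
    WITH (DROP_EXISTING = ON, FILLFACTOR = 90)
    ON [Snapshots_FG];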

Upvotes: 3

David Waters

Reputation: 12028

Given your constraints, you could try zipping the XML before inserting it into the DB as binary. This should significantly reduce the storage cost of this data.

You mention this is bad for performance, but how often are you reading from this snapshot table? If it's just stored, it should only affect performance when writing. If you are reading it often, are you sure the performance issue is with the data storage and not the parsing of 10 MB of XML?
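If the server version allows it, a quick way to see the savings is a sketch like this, assuming SQL Server 2016 or later where COMPRESS/DECOMPRESS exist; on older versions the zipping would have to be done in the application (e.g. with GZipStream) before the INSERT, and the compressed bytes would need a binary column rather than the current [text] one:

DECLARE @Xml NVARCHAR(MAX) = N'<snapshot>...</snapshot>';  -- stand-in for a real 10 MB snapshot
DECLARE @Zipped VARBINARY(MAX) = COMPRESS(@Xml);            -- GZip-compressed bytes

SELECT DATALENGTH(@Xml)    AS OriginalBytes,
       DATALENGTH(@Zipped) AS CompressedBytes,
       CAST(DECOMPRESS(@Zipped) AS NVARCHAR(MAX)) AS RoundTripped;

XML with lots of repeated element names usually compresses very well, so the on-disk footprint should drop considerably.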

Upvotes: 2
