Reputation: 103707
I have recently noticed that we have a number of tables stored in heaps (no clustered index). Would you create clustered indexes on them selectively, across the board, or not at all? Any other wisdom or advice?
There are some "codes" tables with 25 or so rows. However, there are several with well over a million rows.
EDIT Of the "big tables", all of them already have indexes, just not clustered ones. A few are log tables that are mostly written to, with little reading. A few are quite important: they are mostly inserted into once and then read many times by the application.
EDIT There are primary keys on all tables. The few I'm interested in are mainly inserted once but read many times to display screens.
On some of these tables, a block of related rows is inserted at one time and read many times with no updates, or the group is completely deleted and then reinserted as a block. The rows are usually read back as the same block, to display or make calculations from.
On another "type" of these tables, rows are repeatedly inserted in groups of related rows, with different groups being inserted all the time. For screen display, the complete group needs to be returned. For example, over time these groups of rows are inserted (where a group could be 5-50 rows):
1:00pm A1, B1, C1,
1:30pm A2, B2, C2,
2:00pm A3, B3, C3, D1
2:30pm A4, C4, D2
3:00pm C5, D3, E1
3:30pm D4, E2
The screen would need to display the complete set of A: A1+A2+A3+A4.
EDIT Based on @gbn's answer mentioning fragmentation, I used this query from marc_s and found the following fragmentation info for the heap tables with a million+ rows that are read many times and used by screens:
TableName index_type alloc_unit_type index_depth index_level avg_fragmentation_in_percent fragment_count avg_fragment_size_in_pages page_count avg_page_space_used_in_percent record_count ghost_record_count Version_ghost_record_count min_record_size_in_bytes max_record_size_in_bytes avg_record_size_in_bytes forwarded_record_count
--------- ---------- --------------- ----------- ----------- ---------------------------- -------------- -------------------------- ---------- ------------------------------ ------------ ------------------ -------------------------- ------------------------ ------------------------ ------------------------ ----------------------
TABLE_A HEAP IN_ROW_DATA 1 0 95.8294717330862 2069 8.18511358144031 16935 98.2659995058068 1125786 3 0 80 164 117.671 0
TABLE_A HEAP IN_ROW_DATA 1 0 95.8294717330862 2069 8.18511358144031 16935 98.2659995058068 1125786 3 0 80 164 117.671 0
TABLE_A HEAP IN_ROW_DATA 1 0 95.8314034275127 2070 8.18212560386473 16937 98.2559303187546 1125793 11 0 80 164 117.672 0
TABLE_B HEAP IN_ROW_DATA 1 0 99.2541594951233 1734 6.44982698961938 11184 94.5866567828021 1222729 0 0 68 82 68.037 0
TABLE_B HEAP IN_ROW_DATA 1 0 99.2541594951233 1734 6.44982698961938 11184 94.5866567828021 1222729 0 0 68 82 68.037 0
TABLE_B HEAP IN_ROW_DATA 1 0 99.197247706422 1735 6.44726224783862 11186 94.5725228564369 1222745 23 0 68 82 68.038 0
TABLE_C HEAP IN_ROW_DATA 1 0 71.5785224061365 1777 10.9527293190771 19463 97.4122807017544 2237831 0 0 9 84 66.588 2485
TABLE_C HEAP IN_ROW_DATA 1 0 71.5785224061365 1777 10.9527293190771 19463 97.4122807017544 2237831 0 0 9 84 66.588 2485
TABLE_C HEAP IN_ROW_DATA 1 0 71.589991928975 1778 10.9476940382452 19465 97.4023844823326 2237832 0 0 9 84 66.588 2485
TABLE_D HEAP IN_ROW_DATA 1 0 40.0769404842725 1773 19.7535250987028 35023 98.0193106004448 2778169 0 0 98 112 98.041 0
TABLE_D HEAP IN_ROW_DATA 1 0 40.0904977375566 1774 19.7480270574972 35033 98.0175315048184 2778821 0 0 98 112 98.044 0
TABLE_D HEAP IN_ROW_DATA 1 0 40.1040488577245 1775 19.7385915492958 35036 98.0142451198419 2778948 0 0 98 112 98.045 0
TABLE_E HEAP IN_ROW_DATA 1 0 97.1619365609349 2911 8.11473720371007 23622 99.390066716086 3333693 0 0 55 69 55.017 0
TABLE_E HEAP IN_ROW_DATA 1 0 97.1628838451268 2912 8.11332417582418 23626 99.3852359772671 3334016 0 0 55 69 55.018 0
TABLE_E HEAP IN_ROW_DATA 1 0 97.1638304971638 2913 8.11122554067971 23628 99.3799357548802 3334100 0 0 55 69 55.018 0
TABLE_F HEAP IN_ROW_DATA 1 0 21.9911471599199 8903 36.3093339323823 323262 94.6116753150482 4734053 44 0 521 535 521.046 0
TABLE_F HEAP IN_ROW_DATA 1 0 21.9911471599199 8903 36.3093339323823 323262 94.6116876698789 4734053 50 0 521 535 521.046 0
TABLE_F HEAP IN_ROW_DATA 1 0 21.9930761622156 8904 36.3057053009883 323266 94.6112428959723 4734079 78 0 521 535 521.047 0
TABLE_G HEAP IN_ROW_DATA 1 0 66.1932151660993 5649 11.9943352805806 67756 96.7873733629849 6632610 0 0 78 92 78.047 0
TABLE_G HEAP IN_ROW_DATA 1 0 66.1932151660993 5649 11.9943352805806 67756 96.7873733629849 6632610 0 0 78 92 78.047 0
TABLE_G HEAP IN_ROW_DATA 1 0 66.1971830985916 5650 11.9925663716814 67758 96.7855572028663 6632648 11 0 78 92 78.048 0
TABLE_H HEAP IN_ROW_DATA 1 0 11.5377268385864 5585 67.4340196956132 376619 92.3860637509266 6897347 0 0 9 427 406.418 3
TABLE_H HEAP IN_ROW_DATA 1 0 11.5449915110357 5576 67.5530846484935 376676 92.3849023968372 6898289 0 0 9 427 406.419 3
TABLE_H HEAP IN_ROW_DATA 1 0 11.5487458087518 5578 67.5313732520617 376690 92.3848035581913 6898534 0 0 9 427 406.42 3
TABLE_I HEAP IN_ROW_DATA 1 0 96.7330677290837 9715 8.23201235203294 79974 96.3321225599209 3152049 0 0 76 534 195.879 0
TABLE_I HEAP IN_ROW_DATA 1 0 96.7333930883378 9716 8.23157678056814 79978 96.3298122065728 3152142 0 0 76 534 195.879 0
TABLE_I HEAP IN_ROW_DATA 1 0 96.7337183827923 9717 8.23114129875476 79982 96.3323696565357 3152420 0 0 76 534 195.876 0
TABLE_J HEAP LOB_DATA 1 0 0 NULL NULL 87553 95.5205090190264 7790594 0 0 84 98 84.91 NULL
TABLE_J HEAP IN_ROW_DATA 1 0 31.2985438510012 23539 25.4966651089681 600166 96.4532863849765 7807684 0 0 435 1213 598.261 0
TABLE_J HEAP IN_ROW_DATA 1 0 31.2994591137993 23540 25.4959218351742 600174 96.4530145787003 7807780 0 0 435 1213 598.26 0
TABLE_J HEAP IN_ROW_DATA 1 0 31.3022047558782 23543 25.4936074417024 600196 96.4526068692859 7808096 0 0 435 1213 598.255 0
I'm not sure why there are multiple rows for each table, but the avg_fragmentation_in_percent values look fairly high for almost all of these tables. Would that fragmentation be a performance issue when reading? Would a clustered index be advised to defragment them?
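For reference, the column names above are those of sys.dm_db_index_physical_stats, so the query was presumably something along these lines (a sketch, not marc_s's exact query; run in the current database, with 'DETAILED' mode so the record counts are populated):

```sql
-- Sketch of a fragmentation query over sys.dm_db_index_physical_stats.
-- NULL object/index/partition arguments mean "all tables in this database".
SELECT  OBJECT_NAME(ips.object_id)       AS TableName,
        ips.index_type_desc,
        ips.alloc_unit_type_desc,
        ips.avg_fragmentation_in_percent,
        ips.page_count,
        ips.record_count,
        ips.forwarded_record_count
FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ips
ORDER BY TableName;
```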
Upvotes: 2
Views: 4749
Reputation: 5344
One problem with a clustered index on a large table is that the buffer memory (RAM) required to hold the index is equal to the size of the table; there is no separate index structure. A nonclustered index stores only the indexed columns plus the table's primary key or a row reference, so a normal index is much more likely to fit in RAM. If you are doing searches using the clustered index and the table is large, you could easily be slowing things down. If your clustered index is on a date column and your searches are all for recent dates, then maybe the clustered index won't hurt search performance, since you never access most of it.
I disagree with posters claiming the clustered index will reduce data fragmentation; it increases it. On a normal table, only deleting causes fragmentation. As you add rows to a clustered table, or change a field of the clustering key, SQL has to physically reorder the table. This means splitting and adding data pages, which increases fragmentation. That's why everyone recommends being careful to pick a clustering field that (a) doesn't change often, if ever, and (b) always increments.
I find that a clustered index is useful on large tables when your queries often need to return multiple "related" rows. You can use the cluster so that related rows are stored consecutively and are easier for SQL to retrieve. I wouldn't necessarily cluster a large table just to gain another index for searching.
The one advantage that clustering does have, like a covering index, is that the index contains the data the query is trying to return; there is no extra lookup from the index to the table to get the data.
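A sketch of that "related rows" idea, with hypothetical table and column names that match the grouping pattern in the question (cluster on the group key so a whole group sits on adjacent pages):

```sql
-- Hypothetical table: rows arrive in groups and are read back by group.
CREATE TABLE dbo.Readings (
    GroupId  char(2)      NOT NULL,  -- e.g. 'A', 'B', 'C' from the question
    Seq      int          NOT NULL,  -- position within the group
    Payload  varchar(100) NULL,
    -- Clustering on (GroupId, Seq) stores each group's rows consecutively.
    CONSTRAINT PK_Readings PRIMARY KEY CLUSTERED (GroupId, Seq)
);

-- Returning the complete set for group 'A' is then a single range scan.
SELECT GroupId, Seq, Payload
FROM   dbo.Readings
WHERE  GroupId = 'A';
```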
In the end, you have to get the profiler out, and run some tests.
Am I getting this right or missing something?
Upvotes: -1
Reputation: 432672
Always add a clustered index. Without a clustered index, you cannot quickly compact or defrag the table.
Simplistic, but I bet some of the performance issues could be traced to badly organised data.
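Once a clustered index exists, defragmenting becomes a one-liner (hypothetical index and table names; pick REBUILD or REORGANIZE depending on how fragmented the table is):

```sql
-- Heavy fragmentation: rebuild the clustered index from scratch.
ALTER INDEX CX_TableA ON dbo.TableA REBUILD;

-- Light fragmentation: reorganize the leaf pages in place.
ALTER INDEX CX_TableA ON dbo.TableA REORGANIZE;
```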
Upvotes: 3
Reputation: 15685
For large tables a clustered index is always a good idea, even for insert-only tables. Of course, your clustering key should be an ever-increasing value.
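A sketch of that pattern, with hypothetical names: an IDENTITY column as the ever-increasing clustering key, so new rows always append at the end of the table instead of splitting pages in the middle.

```sql
-- Hypothetical insert-heavy table clustered on an ever-increasing key.
CREATE TABLE dbo.EventLog (
    EventId  int IDENTITY(1,1) NOT NULL,       -- always increments
    LoggedAt datetime     NOT NULL DEFAULT (GETDATE()),
    Message  varchar(400) NULL,
    CONSTRAINT PK_EventLog PRIMARY KEY CLUSTERED (EventId)
);
```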
Upvotes: 2
Reputation: 56984
I would consider putting a clustered index on the large tables. The clustered index defines the physical order in which the records are stored. As a consequence, the rows in the table can be stored more efficiently, which reduces fragmentation. I'm sure there will be at least one column in each table that is a candidate for a clustered index. (And if not, you could create a new column containing the date and time the record was created, and put the clustered index on that column. I think this is still better than no clustered index at all.)
Edit: if the large tables are indeed log tables that aren't read frequently, they can be left as heaps.
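The fallback described above might look like this (hypothetical table and index names): add a creation timestamp and cluster on it when no natural candidate column exists. Note the date column isn't unique, which is fine — SQL Server adds a hidden uniquifier to duplicate clustering keys.

```sql
-- Add a creation timestamp to an existing table (hypothetical names).
ALTER TABLE dbo.BigTable
    ADD CreatedAt datetime NOT NULL DEFAULT (GETDATE());

-- Cluster on it: new rows land at the end, keeping fragmentation low.
CREATE CLUSTERED INDEX CX_BigTable_CreatedAt
    ON dbo.BigTable (CreatedAt);
```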
Upvotes: 0
Reputation: 416149
It depends on how the tables are used. Normally I want a clustered index on a table with millions of records, but you also need to consider how the table is used. Adding an index will slow down inserts, because SQL Server has to look up the proper page for each new record (and possibly split a page) rather than just append it. If these tables are primarily "dumps" for data and are rarely checked (like emergency logging, for example), then leaving them alone might be better.
As always, you should profile to find out what works best for your application or system.
Upvotes: 0
Reputation: 238296
Sounds like the database was created by someone who knew what he was doing. Log tables and small code tables are exactly where heaps make sense.
If there are no current problems with the database, I'd leave it as it is!
Upvotes: 5