Reputation: 7407
I just learned about the wonders of columnstore indexes and how you can "Use the columnstore index to achieve up to 10x query performance gains over traditional row-oriented storage, and up to 7x data compression over the uncompressed data size."
With such sizable performance gains, is there really any reason to NOT use them?
Upvotes: 9
Views: 3504
Reputation: 29
Hello. A very detailed explanation of columnstore indexes can be found here.
ColumnStore Index
A columnstore index is a technology for storing, retrieving and managing data by using a columnar data format, called a columnstore.
This feature was introduced with SQL Server 2012 and is intended to significantly speed up the processing time of common data warehousing queries. The main objective of columnstore indexes is to suit typical data warehousing data sets and to improve query performance whenever data is pulled from huge datasets.
They are column-based indexes that can transform the data warehousing experience for users by enabling faster performance for common data warehousing queries such as filtering, aggregating, grouping and star-join queries. They store the data column-wise instead of row-wise, as traditional indexes do.
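To make the column-wise vs row-wise point concrete, here is a minimal Python sketch. It is a toy model only: the table, the column names and the `to_columns` helper are made up for illustration, and a real columnstore index keeps compressed segments rather than Python lists.

```python
# Toy illustration only: hypothetical data, not how SQL Server stores anything.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120},
    {"order_id": 2, "region": "US", "amount": 80},
    {"order_id": 3, "region": "EU", "amount": 50},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per column."""
    columns = {name: [] for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return columns


columns = to_columns(rows)

# Row-wise aggregate: every full row is visited, including columns we do not need.
total_row_wise = sum(row["amount"] for row in rows)

# Column-wise aggregate: only the "amount" column is read.
total_column_wise = sum(columns["amount"])
```

Both totals are the same; the difference is how much data has to be read to get there, which is why aggregating queries over a few columns benefit most.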
Upvotes: 2
Reputation: 5190
Columnstore indexes are especially beneficial for data warehousing (DW), where you only perform updates or deletes at certain times.
This is due to their special design, with delta loading and other features. This video gives a nice basic overview, in great detail, of what exactly the difference is: Columnstore Index.
If, however, your application has high I/O (input and output), a columnstore index is not ideal, since a traditional row index finds the specific target rows and lets you manipulate them directly. An example of this would be an ATM application, which frequently changes values in the rows of a given person's accounts.
A columnstore index organizes the data by COLUMN, which is not ideal in this case since the values of a single row are spread across the column segments.
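A rough way to picture the ATM case, as a toy Python sketch. The account data and both helper functions are hypothetical, and the "touched" counter is only a stand-in for real I/O cost:

```python
# Hypothetical toy model: count how many storage units one row lookup touches.
row_store = [
    {"account_id": 1, "owner": "Ann", "balance": 100},
    {"account_id": 2, "owner": "Bob", "balance": 250},
]
column_store = {
    "account_id": [1, 2],
    "owner": ["Ann", "Bob"],
    "balance": [100, 250],
}


def read_row_from_row_store(store, position):
    """One lookup returns the whole row."""
    touched = 1
    return dict(store[position]), touched


def read_row_from_column_store(store, position):
    """The row has to be reassembled from every column."""
    row = {name: values[position] for name, values in store.items()}
    touched = len(store)
    return row, touched


row_a, touched_a = read_row_from_row_store(row_store, 1)
row_b, touched_b = read_row_from_column_store(column_store, 1)
```

Here `touched_a` is 1 while `touched_b` equals the number of columns, which is the basic reason frequent single-row work favours a row index.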
I highly recommend the video!
I also want to elaborate on non-clustered vs clustered columnstore:
A non-clustered columnstore index (introduced in 2012) stores the WHOLE data again, meaning twice the data (2x).
Whereas a clustered columnstore index (introduced in 2014) only takes up 5MB for about 16GB of data. This is due to RLE (run-length encoding), which collapses duplicate data in each column, so the index takes up less extra storage.
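The saving on duplicate data within a column is typically explained via run-length encoding. Below is a minimal Python sketch of that idea only; `rle_encode`, `rle_decode` and the sample column are made up for illustration, and the actual SQL Server compression combines several techniques and is not reproduced here.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal adjacent values into a (value, count) pair."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical column with long runs of repeated values.
region = ["EU"] * 5 + ["US"] * 3 + ["EU"] * 2
encoded = rle_encode(region)
decoded = rle_decode(encoded)
```

Ten stored values shrink to three pairs here. The longer the runs of identical values in a column, the better this works, which is why storing data per column helps compression.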
Upvotes: 8
Reputation: 171178
The main disadvantage is that you'll have a hard time reading only a part of the index if the query contains a selective predicate. There are ways to do it (partitioning, segment elimination) but those are neither particularly easy to reliably implement nor do they scale to complex requirements.
For scan-only workloads columnstore indexes are pretty much ideal.
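To illustrate why segment elimination is hard to rely on, here is a toy Python sketch. It assumes each segment keeps min/max metadata for a column; `build_segments`, `scan_equal`, the segment size and the sample ids are all hypothetical.

```python
def build_segments(values, segment_size):
    """Split a column into fixed-size segments, keeping min/max per segment."""
    segments = []
    for start in range(0, len(values), segment_size):
        chunk = values[start:start + segment_size]
        segments.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return segments


def scan_equal(segments, target):
    """Return matching values and how many segments actually had to be read."""
    matches, scanned = [], 0
    for segment in segments:
        if target < segment["min"] or target > segment["max"]:
            continue  # eliminated: the min/max range cannot contain the target
        scanned += 1
        matches.extend(v for v in segment["values"] if v == target)
    return matches, scanned


sorted_ids = list(range(1, 13))
shuffled_ids = [7, 1, 12, 4, 9, 2, 11, 5, 3, 10, 6, 8]

sorted_result = scan_equal(build_segments(sorted_ids, 4), 6)
shuffled_result = scan_equal(build_segments(shuffled_ids, 4), 6)
```

With the sorted data only one of the three segments is read; with the shuffled data every segment's range covers the target, so nothing is eliminated. Elimination only pays off if the data happens to be loaded in an order that matches the predicate.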
Upvotes: 8