Tiborg
Tiborg

Reputation: 2305

Choosing SQL Server data types for maximum speed

I'm designing a database that will need to be optimized for maximum speed.

All the database data is generated once from something I call an input database (which holds the data I'm editing, mainly some polylines, markers, etc for google maps).

So the database is not subject to editing, but it needs to hold as many data as it can for quickly displaying results to the user (routes across town, custom polylines, etc).

The question is: choosing smaller data types for example like smallint over int will improve performance or it will affect it? Space is not quite a problem, after some quick calculations, the database will not exceed 200mb, and there will not be tables with more than 100.000 rows (average will be around 5.000).

I'm asking this because I read some articles around the internet and some say that smaller data types improve performance others say that it affects it because additional processing must be done. I'm aware that for smaller databases probably results are not noticeable, but I'm interested in every bit because I'm expecting many requests which will trigger a lot more queries.

The hosting environment is gonna be Windows Server 2008 R2 with SQL Server 2008 R2.

EDIT 1: Just to give you an example because I don't have a proper table structure yet: I'm going to have a table which will hold public transportation lines (somewhere around 200), identified by a unique number in real life, and which is going to be referenced in all sorts of tables and on which all sorts of operations are going to be made. These referencing tables will hold the largest amount of data.

Because lines have unique numbers, I have thought of 3 examples of designs:

  1. The PK is the line number of datatype: smallint

  2. The PK is the line number of datatype: int

  3. The PK is something different (identity for example) and the line number is stored in a different field.

  4. Just for the sake of argument, because I used this on the 'input database' which is not subject to optimization, the PK is a GUID (16 bytes); if you like, you can make a comparison of how bad is this compared to others, if it really is

So keep in mind that the PK is going to be referenced in at least 15 tables, some of which will have over 50.000 rows (the rest averaging 5.000 as I said above) which are going to be subject to constant querying and manipulation, and I'm interested in every bit of speed that I can get.

I can detail this even more if you need. Thanks

EDIT 2: And another question related to this came to my mind, think it fits into this discussion:

Will I see any performance improvements in this specific scenario if I use native SQL queries from inside my .NET application rather than using LINQ to SQL? I know LINQ is strongly optimized and generates very good queries performance-wise, but still, sure worth asking. Thanks again.

Upvotes: 4

Views: 4412

Answers (3)

Ashley
Ashley

Reputation: 1

One suggestion that I have is to incorporate a decimal datatype instead of using a combination of fields. For example, instead of having a table with Date (YYYYMMDD), Store (SSSS), and Item (IIII), I would recommend...YYYYMMDD.SSSSIIII. Especially when querying multiple tables with this same key combination, it dramatically improves processing time.

Upvotes: -2

Aaron Bertrand
Aaron Bertrand

Reputation: 280570

Can you point to some articles that say that smaller data types = more processing? Keeping in mind that even with SSDs most workloads today are I/O-bound (or memory-bound) and not CPU-bound.

Particularly in cases where the PK is going to be referenced in many tables, it will be beneficial to use the smallest data type possible. In this case if that's a SMALLINT then that's what I would use (though you say there are about 200 values, so theoretically you could use TINYINT which is half the size and supports 0-255). Where you need to exercise caution is if you aren't 100% sure that there will always be ~200 values. Once you need 256 you're going to have to change the data type in all of the affected tables, and this is going to be a pain. So sometimes a trade-off is made between accommodating future growth and squeezing the absolute most performance today. If you don't know for certain that you will never exceed 255 or 32,000 values then I would probably just an INT. Unless you also don't know that you won't ever exceed 2 billion values, in which case you would use BIGINT.

The difference between INT/SMALLINT/TINYINT is going to be more noticeable in disk space than in performance. (And if you're on Enterprise, the differences in both disk space and performance can be offset quite a bit using data compression - particularly while your INT values all fit within SMALLINT/TINYINT, though in the latter case it really will be negligible because the values are unique.) On the other hand, the difference between any of these and GUID is going to be much more noticeable in both performance and disk space. Marc gave some great links from Kimberly; I wrote this article in 2003 and while it's a little dated it does contain most of the salient points that are still relevant today.

Another trade-off that sometimes needs to be considered (though not in your specific case, it seems) is whether values need to be unique across multiple systems. This is where you might need to sacrifice some performance in order to meet business requirements. In a lot of cases folks take the easy way and resign themselves to GUID. But there are other solutions too, such as identity ranges, a central custom sequence generator, and the new SEQUENCE object in SQL Server 2012. I wrote about SEQUENCE back in 2010 when the first public beta of SQL Server 2012 was released.

Upvotes: 4

Sergey Rybalkin
Sergey Rybalkin

Reputation: 3026

I think you will need to provide some more details about the tables structure and sample queries that will be running against them. Based on the information that you have provided I believe that impact of choosing smaller data types will be just a couple of percents and I would suggest to give higher attention to indexes that you will have. SQL Server does a good job on suggesting what indexes to create by providing you with execution plans for your queries and tuning advisor tool

Upvotes: 0

Related Questions