ahmd0

Reputation: 17293

How should I structure a SQL Server 2008 database table that will hold a large amount of data?

I am planning a web application (programmed in ASP.NET) that manages a database of logged events. The database will live in SQL Server 2008. Each event comes from a set of, let's call them, "units." A user will be able to add and remove these "units" via the ASP.NET interface.

Each of the "units" can potentially log up to a million entries, or maybe even more. The cutoff will be administered by date; for instance:

DELETE FROM [tbl] WHERE [date] < '2011-01-01'

My question is: what is the best way to structure such a database?

  1. By placing all entries for all "units" in a single table like this:

    CREATE TABLE tblLogCommon (id INT PRIMARY KEY,
                               idUnit INT,
                               dtIn DATETIME2, dtOut DATETIME2, etc INT)
    
  2. Or, by separating tables for each "unit":

    CREATE TABLE tblLogUnit_1 (id INT PRIMARY KEY, dtIn DATETIME2, dtOut DATETIME2, etc INT)
    CREATE TABLE tblLogUnit_2 (id INT PRIMARY KEY, dtIn DATETIME2, dtOut DATETIME2, etc INT)
    CREATE TABLE tblLogUnit_3 (id INT PRIMARY KEY, dtIn DATETIME2, dtOut DATETIME2, etc INT)
    --and so on
    CREATE TABLE tblLogUnit_N (id INT PRIMARY KEY, dtIn DATETIME2, dtOut DATETIME2, etc INT)
    

Approach #1 seems simpler from the standpoint of referencing entries, because with approach #2 I'll have to deal with a variable number N of tables (as I said, users will be allowed to add and remove "units").

But approach #1 may make access to those log entries very inefficient later on. I will have to generate reports from those logs via the ASP.NET interface.

So I'd like to hear your take on this before I begin coding.

EDIT: I didn't realize that the number of columns in a table makes a difference. My bad! The actual number of columns in a table is 16.

Upvotes: 1

Views: 221

Answers (4)

ahmd0

Reputation: 17293

I did some tests on the actual data with SQL Server 2008 Express, using a local connection (no network latency). Test machine: desktop, Windows 7 Ultimate 64-bit; CPU: i7 @ 2.8 GHz, 4 cores; RAM: 8 GB; HDD (OS): 1 TB, 260 GB free.

First, all records were placed in a single table (approach #1) and filled with random data. A complex SELECT statement processing each particular unitID was run twice (one run immediately after the other), with CPU load at 12% to 16% and RAM load at 53% to 62%. Here's the outcome:

UnitID   NumRecords   Complex_SELECT_Timing
1        486,810      1m:26s / 1m:13s
3        1,538,800    1m:13s / 0m:51s
4        497,860      0m:30s / 0m:24s
5        497,860      1m:20s / 0m:50s

Then the same records were separated into four tables with identical structure (approach #2). I ran the same SELECT statement twice, as before, on the same PC and under the same CPU and RAM loads. Here are the results:

Table   NumRecords   Complex_SELECT_Timing
t1       486,810      0m:19s / 0m:12s
t3       1,538,800    0m:42s / 0m:38s
t4       497,860      0m:03s / 0m:01s
t5       497,860      0m:15s / 0m:12s

I thought I'd share this with whoever is interested. It pretty much gives you the answer...

Thanks everyone who contributed!

Upvotes: 0

paparazzo

Reputation: 45096

Splitting into separate tables is going to yield better insert and search speed.

With one table, the difference-maker is an index on idUnit. With that index, search speed will be nearly as fast as with separate tables (and you can search across idUnits in a single query). Where the single table takes a hit is on inserts, but it's a small hit.
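A minimal sketch of what such an index could look like, assuming the single-table layout (tblLogCommon with idUnit, dtIn, dtOut, etc) from the question; the index name and the choice of included columns are illustrative:

```sql
-- Nonclustered index so per-unit queries can seek directly to one
-- unit's rows (optionally narrowed by date) instead of scanning the
-- whole table. INCLUDE covers the remaining columns so typical
-- reporting queries don't need key lookups.
CREATE NONCLUSTERED INDEX IX_tblLogCommon_idUnit_dtIn
    ON tblLogCommon (idUnit, dtIn)
    INCLUDE (dtOut, etc);
```

The trade-off paparazzo mentions follows from this: every insert must also maintain the index, which is the small insert-speed hit.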

Upvotes: 1

podiluska

Reputation: 51494

A lot depends on how you intend to use this data. If you split the data into multiple tables, will you be querying over multiple tables, or will all your queries stay within the defined date range? How often will data be inserted and updated?

In other words, there's no correct answer!

Also, can you afford a license for SQL Server Enterprise in order to use partitioned tables?

Upvotes: 0

Adriaan Stander

Reputation: 166336

I would go with approach 1, as the table does not seem very large (width-wise) and you could apply indexes to improve searching/selecting.

Further to this, you could also look at partitioned tables and indexes.

Creating Partitioned Tables and Indexes
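A rough sketch of what partitioning the single log table by month might look like, assuming the tblLogCommon columns from the question; the function/scheme names and boundary dates are illustrative, and note that table partitioning in SQL Server 2008 requires Enterprise (or Developer) edition:

```sql
-- Partition function: each month's rows land in their own partition.
CREATE PARTITION FUNCTION pfLogByMonth (DATETIME2)
    AS RANGE RIGHT FOR VALUES ('2011-01-01', '2011-02-01', '2011-03-01');

-- Partition scheme: map all partitions to the PRIMARY filegroup
-- (separate filegroups could be used instead).
CREATE PARTITION SCHEME psLogByMonth
    AS PARTITION pfLogByMonth ALL TO ([PRIMARY]);

-- Create the log table on the scheme, partitioned by dtIn, so the
-- date-based cutoff from the question can drop whole partitions
-- instead of running a large DELETE.
CREATE TABLE tblLogCommon (id INT IDENTITY,
                           idUnit INT,
                           dtIn DATETIME2, dtOut DATETIME2, etc INT)
    ON psLogByMonth (dtIn);
```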

Upvotes: 3
