smok

Reputation: 355

Large db table with many rows or many columns

I have tried to keep the table design normalized. The problem (maybe) is that we are generating a lot of data, and therefore a lot of rows. Currently the database is growing by 0.25 GB per day.

The main tables are Samples and Boxes. There's a one-to-many relation from Samples to Boxes. Samples table:

ID | Timestamp | CamId 

Boxes table:

ID | SampleID | Volume | ...

We analyse 19 samples every 5 seconds, and each sample has 7 boxes on average. That's 19*7*12 = 1596 boxes per minute and 1596*60*24 = 2,298,240 rows added to the Boxes table per day on average.
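The growth figures above can be checked with a quick back-of-the-envelope calculation:

```python
# Back-of-the-envelope check of the row-growth figures in the question.
samples_per_batch = 19        # samples analysed every 5 seconds
boxes_per_sample = 7          # average boxes per sample
batches_per_minute = 60 // 5  # one batch every 5 seconds -> 12 per minute

boxes_per_minute = samples_per_batch * boxes_per_sample * batches_per_minute
boxes_per_day = boxes_per_minute * 60 * 24

print(boxes_per_minute)  # 1596
print(boxes_per_day)     # 2298240
```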

This setup might run for months. At this time the Boxes table has about 25 million rows.

Question is: should I be worried about database size, table size and table design with so much data?

Or should I have a table like

ID | SampleID | CamId | Volume1 | Volume2 | ... | Volume9 | ...
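For comparison, the normalized two-table design from the question can be sketched as below (SQLite is used here only for illustration, and the column types are assumptions). One thing the wide Volume1..Volume9 layout cannot do is hold a sample with more boxes than it has columns:

```python
import sqlite3

# Sketch of the normalized Samples/Boxes design from the question.
# SQLite for illustration; column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Samples (
    ID        INTEGER PRIMARY KEY,
    Timestamp TEXT NOT NULL,
    CamId     INTEGER NOT NULL
);
CREATE TABLE Boxes (
    ID       INTEGER PRIMARY KEY,
    SampleID INTEGER NOT NULL REFERENCES Samples(ID),
    Volume   REAL
);
-- An index on the foreign key keeps the join cheap as Boxes grows.
CREATE INDEX IX_Boxes_SampleID ON Boxes (SampleID);
""")

# One sample with a variable number of boxes; the wide-table layout
# would need a schema change to store a tenth box.
conn.execute("INSERT INTO Samples VALUES (1, '2017-01-01 00:00:00', 3)")
conn.executemany("INSERT INTO Boxes (SampleID, Volume) VALUES (?, ?)",
                 [(1, v) for v in (0.5, 1.2, 0.9)])

rows = conn.execute("""
    SELECT s.ID, s.CamId, b.Volume
    FROM Samples s JOIN Boxes b ON b.SampleID = s.ID
""").fetchall()
print(len(rows))  # 3
```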

Upvotes: 0

Views: 1125

Answers (3)

Lostblue

Reputation: 419

Depending on how long your data stays relevant, you could implement a purge. What I mean is: do you really need data from days, months, or years ago? If your data has a limited useful lifetime, purge the old rows and the table should stop growing (or nearly so) once you reach that retention window.

That way you wouldn't need to worry as much about the architecture for the sake of size.

Otherwise the answer is yes, you should care. Splitting concepts across many tables can buy you some performance, but it may not be enough to keep access times acceptable in the long run. Consider looking at NoSQL solutions or similar for storing heavy row volumes.
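A purge along these lines could run as a scheduled job. A minimal sketch, assuming the Samples/Boxes layout from the question (SQLite for illustration; the cutoff date is an assumption, pick whatever retention window fits your use case):

```python
import sqlite3

# Time-based purge sketch, assuming the Samples/Boxes layout from
# the question. SQLite for illustration; the cutoff is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Samples (ID INTEGER PRIMARY KEY, Timestamp TEXT, CamId INTEGER);
CREATE TABLE Boxes   (ID INTEGER PRIMARY KEY, SampleID INTEGER, Volume REAL);
INSERT INTO Samples VALUES (1, '2016-01-01 00:00:00', 1),
                           (2, '2017-06-01 00:00:00', 1);
INSERT INTO Boxes VALUES (1, 1, 0.5), (2, 2, 0.7);
""")

cutoff = "2017-01-01 00:00:00"
# Delete child rows first so no Boxes row is left pointing at a purged sample.
conn.execute("DELETE FROM Boxes WHERE SampleID IN "
             "(SELECT ID FROM Samples WHERE Timestamp < ?)", (cutoff,))
conn.execute("DELETE FROM Samples WHERE Timestamp < ?", (cutoff,))

print(conn.execute("SELECT COUNT(*) FROM Boxes").fetchone()[0])    # 1
print(conn.execute("SELECT COUNT(*) FROM Samples").fetchone()[0])  # 1
```

On SQL Server this would typically be a SQL Server Agent job, and with tables this size the deletes are best done in batches to keep the transaction log manageable.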

Upvotes: 1

Gottfried Lesigang

Reputation: 67291

There is one simple rule: Whenever you think you have to put a number to a column's name you probably need a related table.

The amount of data will be roughly the same, no wins here.

I'd try to partition the table. AFAIK this feature used to be limited to the Enterprise Edition, but - according to this document - with SQL Server 2016 SP1 table and index partitioning is available even in Express!

The main question is: What are you going to do with this data?

- If you have to run analytical scripts over everything, there won't be much better advice than to buy better hardware.
- If your needs only cover, say, the last 3 weeks of data, partitioning will serve you well.

If you cannot use this feature yet (due to your server's version), you can create an archive table and move older data into it with regular jobs. A UNION ALL view would still let you query the whole lot. With SCHEMA BINDING you might even get the advantages of indexed views.
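The archive-plus-view idea can be sketched as follows (SQLite for illustration; the `Boxes_Archive` name and the ID watermark are assumptions, and SCHEMA BINDING / indexed views are SQL Server features not shown here):

```python
import sqlite3

# Archive-table-plus-view sketch, assuming the Boxes layout from the
# question. SQLite for illustration; Boxes_Archive and the watermark
# are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Boxes         (ID INTEGER PRIMARY KEY, SampleID INTEGER, Volume REAL);
CREATE TABLE Boxes_Archive (ID INTEGER PRIMARY KEY, SampleID INTEGER, Volume REAL);
INSERT INTO Boxes VALUES (1, 1, 0.5), (2, 2, 0.7), (3, 3, 0.9);
""")

# Regular job: move rows below some watermark into the archive table.
watermark = 3
conn.execute("INSERT INTO Boxes_Archive SELECT * FROM Boxes WHERE ID < ?",
             (watermark,))
conn.execute("DELETE FROM Boxes WHERE ID < ?", (watermark,))

# A UNION ALL view still exposes the whole data set to readers.
conn.execute("""CREATE VIEW AllBoxes AS
                SELECT * FROM Boxes UNION ALL SELECT * FROM Boxes_Archive""")
print(conn.execute("SELECT COUNT(*) FROM AllBoxes").fetchone()[0])  # 3
```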

In this case it is wise to keep your working data on your fastest drive and put the archive table in a separate file on large storage somewhere else.

Upvotes: 1

Vijunav Vastivch

Reputation: 4191

Question is, should I be worried about database size, table size and table design with so much data?

My answer is YES:

1. A huge amount of data arriving daily will affect your storage hardware.
2. Normalized tables are a must, especially if you are storing binary data or images.

Upvotes: 0
