Reputation: 1095
I need to store 96 data per day (every 15 minutes 1 data). There are two obvious solutions I see but I do not know which one to choose. I can store each data by row (=96 records+IDs) or by column (=1record with 96 cols + ID). This is an important question as it is a root design question of my database.
I wonder which one will be faster when I make queries and joins (considering that indexes are created properly)? To store all data in a single record or in multiple records?
Handling 96 cols and doing operation with them (multipling one record with another or summarizing a full day) is quite a pain. However it makes the reading of the data by human (=developers) bit easier.
Does anyone have experience on the subject?
Upvotes: 1
Views: 106
Reputation: 96552
I don't know what you want to do with this data, but suppose you will want to sum all the data for a particular day. If you use the relational table the query is:
select sum(field1)
from table1 t
where t.date = '20141213'
but if you wanted to do the same for the non-relational design, you would have to write
select field1+field2+field3+...field96 from table1
if you needed other aggregates the code gets worse:
select Count(field1)
from
(select field1 from table1 where date = '20141213' and field1 is not null
union all
select field2 from table1 where date = '20141213' and field2 is not null
union all
select field3 from table1 where date = '20141213' and field3 is not null
... ( put in a separate stement for each field)
union all
select field96 from table1 where date = '20141213' and field96 is not null
) a
And if you later need to insert twice as often, you would need to add 96 more columns and fix all the code written against it. And depending on how big the data in each column is, you could run into the record limits for an individual record.
Looking at this, I don't think the 96 columns is a good idea.
Upvotes: 2
Reputation: 166
We don't have enough information to make decisions about the right design.
Before you make any decision, spend more time with the data that's going to live in the database and identify any relationships that exist between the different values. Do you know how the data will be used once stored? Talk to the developers, but don't design in with the only goal being to make it easy for the devs to work with.
There is nothing wrong with only having a single table, but the problems you describe make me think that it isn't the best solution.
And one last note, if "By Column" you mean the entity-attribute-value model - that design should be reserved for environments where the number and types of values that you need to store are going to change. It's inefficient for querying. I'm comfortable saying that you should stay away from that design even with the limited info here.
Upvotes: 1
Reputation: 2607
I would recommend the first solution: 96 records (+IDs) / day since it is a data capture process that is recurrent and probably the data is similar ( in other words the attribute that is valued in the 15 min base does not change)
Pros:
- If the data capture interval increases or decreases then you don't need to alter the table structure.
- If you later need to add other attributes (such as timestamp of the capture, who captured data etc) then it is easier to work with.
- 96 rows per day is not that big amount of data (about 35k per year) so if you have a clustered index on the ID column then queries cost shouldn't be too high, even in the long term.
Upvotes: 1
Reputation: 2860
Can you form a relationship based on 1 column with 96 records?
There is a huge design flaw here. Remember SQL Server is a relational database. If a relation is not what you need here, then you're only making things more difficult for humans in the long run. Your solution should be scale-able.
You cannot normalize a column based data storage structure which takes away from the performance functionality the database inherently utilizes. Also, how can you use indexes?
I think a firm understanding of the relationships should first be understood.
Also, you can get the data you want out of those columns, but sometimes pivot/unpivot functions are costly to the system and you should definitely ask your developers how they would like this data represented. As a developer myself, if a standard query doesn't work and this needs to be queried regularly, this could pose a lot of issues that may not be easy to fix in the long run.
Upvotes: 0