Reputation: 145
As soon as entities lose their attribute invariance through time (possibly independently, e.g. only some attributes change, then later only one attribute), is 6NF the only way to normalize correctly?
3NF is a special form of 6NF.
Upvotes: -2
Views: 199
Reputation: 13
The point of database normalization is to eliminate redundancy, which reduces storage consumption and ensures the consistency of the data.
Codd also defined the objectives beyond normalizing to 1NF as follows:
- To free the collection of relations from undesirable insertion, update and deletion dependencies.
- To reduce the need for restructuring the collection of relations, as new types of data are introduced, and thus increase the life span of application programs.
- To make the relational model more informative to users.
- To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time goes by.
— E.F. Codd, "Further Normalization of the Data Base Relational Model", p. 34
By that definition, 6NF is the only formally correct normalization for time-variant data. It can also make comparisons of data easier, which becomes clear when comparing historization in 5NF and 6NF tables.
Regardless of 5NF or 6NF, historization is always done at the row level; only in 6NF do rows behave the same as columns. In 5NF this means the entire row, with all its columns, is historized, which already violates objective 1: the redundantly historized data can be considered an undesirable insertion dependency. Depending on how much data changes and how wide the table is, this can waste a lot of space through redundant entries, and it makes differential temporal calculations harder, because you have to compare entire rows in a 5NF table rather than a single (attribute) row in a 6NF table. If you want the actual times of the state changes of a single attribute in a 5NF temporal table, you have to manually prune all of the temporal restatements (A -> B -> [B -> B ->] C).
With a 6NF table, you only have restatements if you allowed or introduced them yourself, and they can be avoided via triggers that record only changed data, which helps you better fulfill objectives 2 and 3. If you design for it, you can also get rid of NULL values entirely by modeling the unknown as an explicit value.
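The difference in row counts can be sketched with a minimal sqlite3 example. The `employee` tables, attribute names, and the insert-only-if-changed check are all illustrative assumptions; the change check stands in for the triggers mentioned above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 5NF-style temporal table: the whole row is historized together.
cur.execute("""
    CREATE TABLE employee_5nf (
        emp_id     INTEGER,
        name       TEXT,
        dept       TEXT,
        salary     INTEGER,
        valid_from TEXT,
        PRIMARY KEY (emp_id, valid_from)
    )
""")

# 6NF-style: one history table per time-varying attribute.
for attr, sqltype in [("dept", "TEXT"), ("salary", "INTEGER")]:
    cur.execute(f"""
        CREATE TABLE employee_{attr} (
            emp_id     INTEGER,
            value      {sqltype},
            valid_from TEXT,
            PRIMARY KEY (emp_id, valid_from)
        )
    """)

# Initial state, then two changes in which only the salary differs.
history = [
    ("2024-01-01", "Sales", 50000),
    ("2024-06-01", "Sales", 52000),
    ("2025-01-01", "Sales", 55000),
]
prev = (None, None)
for ts, dept, salary in history:
    # 5NF: every change repeats all attributes, changed or not.
    cur.execute("INSERT INTO employee_5nf VALUES (1, 'Ada', ?, ?, ?)",
                (dept, salary, ts))
    # 6NF: insert only when the attribute actually changed
    # (in practice a trigger would do this check).
    if dept != prev[0]:
        cur.execute("INSERT INTO employee_dept VALUES (1, ?, ?)", (dept, ts))
    if salary != prev[1]:
        cur.execute("INSERT INTO employee_salary VALUES (1, ?, ?)", (salary, ts))
    prev = (dept, salary)

rows_5nf = cur.execute("SELECT COUNT(*) FROM employee_5nf").fetchone()[0]
rows_dept = cur.execute("SELECT COUNT(*) FROM employee_dept").fetchone()[0]
rows_salary = cur.execute("SELECT COUNT(*) FROM employee_salary").fetchone()[0]
print(rows_5nf, rows_dept, rows_salary)  # → 3 1 3
```

The 5NF table records `Sales` three times even though the department never changed, while the 6NF `dept` table holds a single row: reading the state-change times of one attribute requires no pruning.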
6NF does not come without its challenges, though, and at large scale should only be done within a framework, as it leads to an explosion of tables.
If you don't want to go the 6NF route and only certain attributes have high update frequencies, it can make sense to move just those out of the larger 5NF table into their own 6NF tables and join them back in via foreign keys, avoiding an overly redundant historization of the data. But as the demands on the data and its historization change over time, further such restructurings can become necessary, and objective 4 can be failed over and over again until all time-variant data has eventually been moved to separate 6NF tables.
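This hybrid layout can also be sketched briefly. The `product`/`product_price` names are hypothetical: stable attributes stay in the main table, and only the frequently changing price gets its own 6NF history table, joined back via foreign key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
    -- Stable attributes stay in the main table ...
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT
    );
    -- ... while the high-frequency price is historized in its own
    -- 6NF table, linked back via foreign key.
    CREATE TABLE product_price (
        product_id INTEGER REFERENCES product(product_id),
        price      REAL,
        valid_from TEXT,
        PRIMARY KEY (product_id, valid_from)
    );
""")

cur.execute("INSERT INTO product VALUES (1, 'Widget')")
for ts, price in [("2024-01-01", 9.99),
                  ("2024-03-01", 10.99),
                  ("2024-07-01", 8.49)]:
    cur.execute("INSERT INTO product_price VALUES (1, ?, ?)", (price, ts))

# Join the current price back in: the row with the latest valid_from.
current = cur.execute("""
    SELECT p.name, pp.price
    FROM product p
    JOIN product_price pp ON pp.product_id = p.product_id
    WHERE pp.valid_from = (SELECT MAX(valid_from)
                           FROM product_price
                           WHERE product_id = p.product_id)
""").fetchone()
print(current)  # → ('Widget', 8.49)
```

Price changes now add one narrow row instead of rewriting the whole product row, while queries still see a joined view of the entity.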
Upvotes: -1
Reputation: 25524
As soon as entities lose their attribute-invariance through time (maybe independently, e.g. only some attribute then again only one attribute), the only way to correctly normalize is with 6NF.
6NF isn't the only way. I suspect that 5NF is much more commonly used than 6NF as a method of recording history-preserving data - by making a timestamp or time range part of the keys. 6NF has the advantage of not requiring you to repeat every attribute each time one attribute changes. That's not a kind of redundancy that 5NF is concerned with, so a 5NF model of time-varying data is still "correct" according to 5NF - it's just that it may preserve more dependencies than 6NF does.
Upvotes: 3