Reputation: 101
Database beginner here.
I've been researching column-store model for a while, I came up to this paper. It basically proposes special column-store model to sparse and huge clinical data for fast searching.
I generally understand what the paper proposes but some thoughts on relational model and EAV model confused me. The paper postulates that relational model cannot be even thought to use for clinical data and compares performances of EAV and column-store model.
EAV can represent high dimensional data, which cannot be modeled by relational model because existing RDBMS only support a limited number of columns.
As I understand from this question and its first answer, clinical data is high-dimensional (as expected) but
Why classical normalization is not used and EAV is used for that?
Relational model would be:
Person: Id, Name, Surname, DateOfBirth, ...
Measurement: Id, Name, Desc, ...
PersonMeasurement: Id, PersonId, MeasurementId, Result, Date
Upvotes: 0
Views: 2651
Reputation: 27424
...because existing RDBMS only support a limited number of columns.
This depends on the meaning given to the term "limited": a typical modern relational database has a number of columns that exceeds 1000. Here is a table with the maximum number of columns of some of the most common RDBMS:
PostgresQL: 250 - 1600 depending on column types
MySQL: upto 4096, depending on row size
Oracle: 1000
IBM DB2: 1012
Mycrosoft SQL Server: 1024 - 30000
Sybase: 45000
In your second link there is the following statement:
Data on health status of patients can be high-dimensional (100+ measured/recorded parameters from blood analysis, immune system status, genetic background, nutrition, alcohol- tobacco- drug-consuption, operations, treatments, diagnosed diseases, ...)
Edited
So, if the number of columns is in the order of the hundreds, I think that the single table approach is adequate to represent clinical data, and, as you can see from many questions on StackOverflow, more efficient and simple to query than the EAV model.
Upvotes: 1