Katai
Katai

Reputation: 2937

Dynamic content types: One table with many columns or one table for each?

You have a form editor and can add form fields (inputs). Each form field translates to a different type of data:

An easy way to store the structure of each:

enter image description here

Two solutions to store the data, if someone would fill such a dynamic form:

This problem applies to many other things: a template editor, a settings editor, etc.

How do we store the data for such dynamic forms?

I wouldn't consider using an additional non-relational database. I want to keep the data in one place.

Upvotes: 2

Views: 641

Answers (2)

PerformanceDBA
PerformanceDBA

Reputation: 33708

I'm asking about a "general" approach with this problem, not specifically for form editors

This thing keeps popping up again and again and everyone I know seems to have a different take on it - but none of them has a really "proper" solution

After the Relational requirement, which is of course the highest order directive, I will take generic approach and proper solution as over-arching directives.

1 Elevation to Relational

But first, a correction. In order to elevate the data model to Relational, without which the rather straight-forward solution would be exceedingly complex (which is precisely why [a] there are so many incorrect methods, and [b] none of them are proper solutions).

The great bulk of "literature" and textbooks that is promoted and marketed as "relational" is in fact anti-relational. Pre-1970 Record Filing Systems with relational labels, which is evidently all that the many authors and "theoreticians" can understand. The main difference between such pre-relational filing systems and the Relational Model is, the references are logical, not physical.

  • Whereas in pre-relational systems, physical records were related by physical pointers, such as ID (INTEGER; GUID; UUID; etc), in the Relational paradigm, logical rows (not records) are related by Relational Key.
  • A Relational Key is required to be made up from the data, which is logical, not physical (physical pointers are not data).

The Record IDs serve no purpose in a Relational database. Their purpose is to propagate what the "theoreticians" understand:

  • to confuse you
  • to maintain a pre-1970, pre-relational, physical mindset
  • and thus prevent a Relational mindset and database

At the least, each such Record ID will be one additional column and one additional index.

  • Additional, not instead of, because the indices on the logical data (Keys) are required for data Integrity, otherwise you will have duplicates and no integrity.

Therefore, your requirement elevated to Relational, with one less column and one less index in each table:

Katai 1

Note

2 Relational Key that is Required

The second thing that needs to be understood is the Intergrity Features of Relational Keys, which you will not have, due to the "theoreticians" being blissfully unaware of it. This is an explanation relevant to the Question, rather than a full tutorial. Assuming that you do want Data Integrity and Relational Integrity (the logical feature provided by the Relational Model (as distinct from Referential Integrity, which is a physical facility provided by SQL). Which we will need, in order to control (provide integrity to) the data structures that are required 'lower' in the data hierarchy, for "Storage".

  • While the above [1] is correct for that purpose, we have an additional requirement: we need the FieldType in the FormField.PK such that it can be migrated to its child tables.
  • But of course, we must not subtract from the logical FormField.PK above [1]. If we do, it would lose its integrity and meaning.
  • So we:
    • add the Relational Key that is required [2]
    • make it Primary Key because that is what is migrated
    • and we maintain the original Primary Key [1] by making it an Alternate Key
    • which changes the relation FieldType::FormField from Non-Identifying to Identifying.

Katai 2

3 Relational Solution • Subtype

Now the solution is straightforward.

I see two solutions to this [A] or [B]

Definitely [B], with a couple of improvements to bring it up to Standard.

[A] is an unnormalised mess that has no integrity, and it will be a maintenance nightmare. Nulls are very problematic, Nulls in a Foreign Key is suicide.

but [B] will require a lot more complexity when storing and fetching data, since the program has to know the type, use the right table, etc etc.

Storage

Not at all.

  • Bear with me, the solution is below
  • You clearly understand that the data ("storage") tables have to be typed (or cast)
  • It requires an ordinary Exclusive Subtype structure. (We have had Subtypes in DBMSs since the 1960's.)
  • I have provided that typing (casting) control in the Basetype
    • first, by inserting FieldType in the parent FormField Primary Key, so that it will be migrated
    • second, by using a Non-Identifying relation, so that it is not in the Key of the child UserField
    • but it remains non-key column, and it is available as a Discriminator for the Basetype
  • Now, it is possible to understand why [1][ and [2] had to be implemented first
  • See if you can appreciate the Relational Integrity (as distinct from Referential Integrity) that Relational Keys provide:
    • UserField is CONSTRAINED to FieldNames that exist in the applicable FormName
    • that level of logical data integrity, which is standard fare in Relational databases, cannot be had in the pre-relational physical (Record ID based) filing systems that the "theoreticians" fraudulently market as "relational"
  • Subtype Exposition for beginners

Further, you should be using ACID Transactions (a requirement for SQL Compliance since the first release), not INSERT/UPDATE/DELETE directly from the program. That would simplify both the app and the program code, as well as eliminate the otherwise thousands of errors. If implemented to OLTP Standard, it minimises locking and contention, and reduces, if not eliminates, deadlocks.

Retrieval

Not really. It may not be simple, it is definitely not complex, but it is straight-forward.

  • Use VIEW (this is a classic use of VIEWS)
    • for retrieving a known Subtype:
      one View for each Subtype
    • for retrieving an unknown Subtype:
      one View (with an UNION) for any Subtype (which will return just one for each Basetype, whichever Subtype it is)
      (This is not a "supertype", the anti-relational concept beloved of the "theoreticians", heaven forbid.)
  • That will make the SQL coding also straight-forward.

Katai 3

One More

Upvotes: 2

AntC
AntC

Reputation: 2806

can add form fields (inputs).

Let's remember the Relational Model is for representing 'structured information'.

If you can freely add (something) it sounds like you have unstructured information.

Where are these form fields coming from? Do they represent business entities, or attributes of entities? Then don't you already have those in a data model for the business? (Using 'business' in a very general sense.)

But the real question is: How do we properly store the data, ...

No, the real question is: how do we properly retrieve/query the data? What does the data represent? What questions are the business users going to ask about it?

If you haven't already modelled the business entities/attributes, then you won't know how or even whether you can answer the users' questions.

So if users are just going to throw random facts/fields at the database and expect to ask random questions about it, you don't have structured data, and a Relational Database is not the right tool for the job. That sounds more like NoSQL or some sort of document/tagged/XML/JSON database. (Although I'd dispute whether "database" is the right term for those.)

Upvotes: 2

Related Questions