Reputation: 2404
I have a number of different message types (let's say 20) that I need to SELECT based on their date, and their type, in something along the following manner
i.e. ... WHERE date BETWEEN [fromDate] AND [toDate] AND type = [0 - 20 different types]
The different types of messages have very few columns in common (date being the most important), but I will always need to fetch all kinds of comments "in one go", ordered by date. Messages have a self reference to allow threaded conversations. A message will always be of one type, and only one type.
Because I have 5.000.000 messages in the archive, and rarely more than 50 messages in a conversation, I need to be able to efficiently select by date or a conversation identifier.
Therefore I have a single "mother of all messages"-table and multiple extra tables which have a 1 .. 0-1
relationship to the messages table:
messages: [id, date, parent_id (nullable), ... ]
msgs_type1: [col1, col2, col3, col4, ...]
msgs_type2: [col1, col2, col3, col4, ...]
And then here's my question: How do you normally specify relationships between these types of tables? What are the (dis)advantages of joining the tables in e.g. the following ways:
messages: [id, date, parent_id (null), **msg_type_1 (null), msg_type_2 (null)**, ...]
msgs_type1: [col1, col2, col3, col4]
msgs_type2: [col1, col2, col3, col4]
...
(Optional relationships specified in messages)
messages: [id, date, **type**, parent_id (null)]
msgs_type1: [**message_id**, col1, col2, col3, col4]
msgs_type2: [**message_id**, col1, col2, col3, col4]
...
(Mandatory relationships specified in msgs_type tables, lookup table specified in message)
On one hans, it feels dirty to have 20 optional colums, of which (only) one column must have a value in order to specify the type of message.
On the other hand, having a "type" enum column instead, and using it to manually infer what table to look for additional info in also feels wrong - and will probably cause a great deal of pain to work with in most ORMs.
So what does the book say about these types of structures? What about the day I have 200 different types of messages?
Upvotes: 1
Views: 147
Reputation: 1087
IMHO: Any time you are in a position where you will have to alter your database because you've added a new "type" of "something" you are, as they say, doing it wrong. The only time I would be comfortable with this type of column-oriented table is if it were done just before, say, a report was being generated. Or, perhaps, done at the end of a process to ease work for non-technical users who might want to generate their own queries.
A properly-indexed and normalized table structure with 5 to 10 million rows should still perform just fine.
Upvotes: 2